Consider splitting up Simple CQL into multiple conformance classes #579
Comments
@philvarner we have already been discussing removing the spatial and temporal operators for simple CQL altogether and simply having two classes (spatial operators, temporal operators) where all these operators reside. This was motivated by the fact that features already provides bbox and date-time for simple spatial and temporal filtering so there is no use in duplicating those capabilities in "simple" CQL. |
I agree with the logic of having the most 'basic' level enabling simple comparisons. That's one that seems like it could almost be in 'core', or at least is the clear 'next step', and shouldn't force people to do a whole lot more. I think the key is to get the 'groupings' right. I believe as Phil laid them out it'll be 10 total classes, the five core plus functions, arithmetic expressions, arrays, enhanced spatial and enhanced temporal. I could see whittling that down some, if we thought ten was a lot. Put logical in basic comparisons, and just have one 'spatial' and one 'temporal'. I think I do like 'enhanced comparison' as its own - there's a lot of extras in there that take more work. Spatial I think I do like having 'intersects' and 'enhanced spatial operators' separated out. Intersects get you actual geometries vs just the core, and many spatial backends don't implement the full suite of spatial operators. In STAC's query language we just had 'intersects' (above what we got from features API). Temporal I honestly still don't understand, but perhaps should try to dig in. But it seems like the basic comparisons should handle time? I don't really understand what anyinteracts does, but if basic comparisons handle time then there doesn't seem to be much value in having just one temporal operation as its own class. I guess I don't see much that'd be wrong about having 9 or 10 conformance classes. I think temporal is the one I'd consider just doing one class of temporal operators, but I also don't fully understand it. I do like this makes for an easier 'on ramp'. I would love us to consider actually breaking the spec into smaller pieces, so we don't present people with a 99 page document when they first hear about CQL. I likely have some funded hours to be able to help with document work if people are up for that (or I can focus my time on the 'guide'/'intro' texts to accompany the core spec text). |
I would prefer to also split the spatial and temporal operators into basic (intersects, anyinteracts) and Enhanced. For spatial operators, one practical reason is that Elasticsearch only supports intersects, so if they were all in a single spatial class, someone using Elastic could not implement it. For temporal, I think most use cases only need anyinteracts, and the other ones just add complexity that isn't necessary for many implementations. I would also argue against there being "no use" for intersects in simple CQL -- I think this will be used commonly in scientific applications. bbox is obviously restricted to a rectangle, and that doesn't work well for many of the scientific use cases I've encountered -- for example, several different small areas or points spread over a large geographic area, or a complex shape whose bounding box has significantly more area than the shape itself.
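To make the bounding-box point concrete, here is a minimal sketch comparing the area of a shape with the area of its bounding box. The L-shaped polygon and the function names are made up for illustration, and planar coordinates are assumed (no geodesic calculation).

```python
def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula (planar coordinates)."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bbox_area(ring):
    """Area of the axis-aligned bounding box of the ring."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical L-shaped area of interest: a 10x1 bar plus a 1x9 bar.
l_shape = [(0, 0), (10, 0), (10, 1), (1, 1), (1, 10), (0, 10)]

shape = polygon_area(l_shape)
box = bbox_area(l_shape)
print(f"shape area = {shape}, bbox area = {box}")
```

For this shape the bounding box covers roughly five times the area of the shape itself, so a bbox-only query would return many features outside the actual area of interest.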
sounds good to me! |
anyinteracts is effectively "do these two datetimes or intervals have any intersection with each other", and is effectively the time equivalent to spatial intersects. It's what the existing |
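As a rough sketch of the "any intersection" idea described above, the check below treats an instant as an interval whose start and end are equal. Closed-interval semantics and the function name `any_interacts` are assumptions made for illustration, not taken from the spec text.

```python
from datetime import datetime


def any_interacts(a_start, a_end, b_start, b_end):
    """True if closed intervals [a_start, a_end] and [b_start, b_end] share any instant.

    An instant can be passed as an interval with identical start and end.
    """
    return a_start <= b_end and b_start <= a_end


june = (datetime(2021, 6, 1), datetime(2021, 6, 30))
mid_june_to_july = (datetime(2021, 6, 15), datetime(2021, 7, 15))
august = (datetime(2021, 8, 1), datetime(2021, 8, 31))
instant = (datetime(2021, 6, 21), datetime(2021, 6, 21))

print(any_interacts(*june, *mid_june_to_july))  # overlapping intervals
print(any_interacts(*june, *august))            # disjoint intervals
print(any_interacts(*june, *instant))           # instant inside an interval
```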
Then that would seem to me to indicate that a basic class with just 'anyinteracts' isn't so valuable since you can accomplish that with |
@cholmes correct ... which is why in the SWG we discussed just removing both intersect and anyinteracts from SimpleCQL and putting those operators into the Extended Spatial Operators and Extended Temporal Operators classes (and then dropping "Extended"). In the old OGC Filter specification we got around this by simply letting a server list which operators the server supported .... see here. It's starting to sound like this idea might be usefully translated into the OGC APIs since no matter how we decide to partition the operators, someone is not going to be happy! ;) |
The operation is the same, but you can't compose them with logical operators in core |
^^ that to me seems like the important part of CQL -- since it's the only place where this kind of expressiveness is available (at least in STAC API spec) i don't think it's bad if specific combinations of parameters happen to duplicate functionality elsewhere. Multiple conformance classes also makes sense to me -- as already talked about with ElasticSearch there are going to be queries that are easier / more difficult to support in different settings, and letting people build their conformance a la carte will be nicer than the alternatives I think. I don't have much of a horse in the race of how finely things are sliced -- Franklin will probably implement all of them, but I still think it's valuable not to have to. |
I think I prefer us to actually think through the best grouping, as I think we can get to groupings that make most people happy, and I see value in chunks of functionality. This can also help implementors figure out what to prioritize. |
Ah, gotcha. Cool, makes sense to me then. |
I prefer groups/classes over listing each operator separately. Like @cholmes and @philvarner I also think there's value in intersects being separate from the rest of the spatial operators, as it's the most commonly used and is also supported by different backends, plus it works with geometries, not just envelopes. |
Meeting 2021-06-21: We do not want the conformance classes too fine grained (this did not work well in Filter Encoding with the filter capabilities). We have already moved Question during the discussion: Can |
Proposal for the CQL2 conformance classes based on the discussion:
|
Could you explain what CQL2 is? I think that this proposal doesn't account for datastores that don't support all the spatial and temporal predicates, and that there's a need for classes that only support INTERSECTS and ANYINTERACTS but not any of the other predicates. |
Also, to clarify "IS NULL" will be part of "BasicSQL"? |
I would prefer to have "ILIKE" in Advanced Comparison Operators rather than UPPER()/LOWER() functions, so that an implementation can entirely ignore parsing functions in expressions. |
Meeting 2021-07-05: general agreement that the breakdown makes sense. @philvarner - regarding your comments/questions: @pvretano will respond to the first comment. Regarding IS NULL, yes, the idea is to have it in BasicCQL; it seems to us to be an important basic test. Regarding the approach to LIKE (function or ILIKE etc.), this is discussed in #541 and related issues and we should continue that discussion there. |
@philvarner Part 3 was derived from the CQL that was defined in old Catalog standard and for a long time Part 3 was conformant to CQL from the old Catalog standard so it was safe to use the term "CQL" to refer to either. However, with all the recent changes, the two CQLs are diverging quite a bit and so the SWG felt that we should refer to the CQL from Part 3 as CQL2 to avoid any confusion. @philvarner with regard to supporting INTERSECTS and ANYINTERACTS, the feeling of the SWG was that Part 1 already defines bbox and datetime that take care of basic spatial and temporal filtering and there is no use in duplicating that capability in the BasicCQL conformance class. There might be a discussion point for further sub-dividing the spatial and temporal predicate conformance classes into basic spatial/temporal and then the rest. That is something we will need to discuss at the next SWG meeting. |
Thanks for the details!
I am in favor of dividing them. I'm going to break these up differently in the STAC API Filter Extension conformance classes for now, but hopefully we'll re-align at some point in the future. The two main issues I see are (1) that bboxes are too coarse-grained for many applications, for example, where scientists have a precise shapefile of the area of interest, and (2) that the bbox or datetimes cannot be composed with "OR" when there are several related areas of interest or when trying to query for period-over-period data (e.g., the same month in multiple years). Thanks! |
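The period-over-period case can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, half-open intervals are used for convenience, and the single "envelope" interval stands in for what a lone datetime parameter could express.

```python
from datetime import datetime


def in_interval(instant, start, end):
    """True if instant falls in the half-open interval [start, end)."""
    return start <= instant < end


def matches_any(instant, intervals):
    """OR-composition: the instant matches if it falls in at least one interval."""
    return any(in_interval(instant, s, e) for s, e in intervals)


def matches_envelope(instant, intervals):
    """Single-interval fallback: one interval spanning all of the periods."""
    start = min(s for s, _ in intervals)
    end = max(e for _, e in intervals)
    return in_interval(instant, start, end)


# July of three consecutive years.
julys = [(datetime(y, 7, 1), datetime(y, 8, 1)) for y in (2019, 2020, 2021)]

december = datetime(2019, 12, 1)
print(matches_any(december, julys))       # not in any July
print(matches_envelope(december, julys))  # but inside the spanning interval
```

The spanning interval also matches everything between the Julys, which is why being able to combine temporal predicates with "OR" matters for this kind of query.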
reorganise conformance classes Merge as agreed in the meeting today. The issue #579 will remain open while we resolve the open discussion topics.
Sorry I didn't get back to you about this before the meeting. I have the same concern with having T_INTERSECTS in what is now the Enhanced Temporal Operators as I had for S_INTERSECTS. Some databases like Elasticsearch don't support all of those operators, so then that implementation can't support any conformance class for temporal operators. |
Thanks @philvarner. But does Elasticsearch support intervals and if the answer is "no", does it really need the temporal operators or are the comparison operators like |
After talking with @pvretano earlier at the STAC API, it seems that one thing I missed was that the basic comparison operators support datetime comparison? So if I only implement Comparison operators conformance class, I can write a CQL statement (assuming If this is the case, it's not clear in the spec, and the grammar seems to indicate this is not allowed. |
@philvarner just checked the BNF and the text encoding allows using standard comparison operators to evaluate simple temporal predicates involving time instants. There seems to be a bug in the JSON encoding, though, which I will fix. I'll create a PR to fix the JSON encoding problem and add more text to clarify this use case. Please stand by. |
@philvarner OK, I patched the bug in the CQL2 JSON schema to allow standard comparison operators to be used to evaluate simple temporal predicates involving time instants. The Basic CQL2 clause already includes a permission for this. I added a reciprocal note to the "Temporal Operators" clause pointing back to the relevant sections in the Basic CQL2 clause. I hope that is sufficient clarity. Let me know if it is not ... |
Reopened, because the approach to case-insensitive comparisons is still open (#579 (comment)):
|
@cportele oops! Didn't mean to close this. Thanks for reopening. |
I can't remember if I commented on lower/upper or not, but I would prefer this be a separate conformance class because including it in the IN/BETWEEN/LIKE class would require implementers to implement function parsing (even if just for these two named function) just to advertise IN/BETWEEN/LIKE support. This makes it a much bigger piece of work and I worry could harm adoption. |
Ah, I see this wasn't present in the draft for CQL Text, but was added last month (7c3e18c). I should probably go through the latest version again and start using that as a reference. Thanks! |
I just created a PR to realign STAC API - Filter Extension with this: https://github.com/radiantearth/stac-api-spec/pull/202/files The classes are now aligned, with the exception that we do not require upper/lower as part of Advanced Comparison Ops. Though, I am reconsidering that. I would prefer that an implementor didn't need to implement function parsing to support this class, but I think, ultimately, it's not worth having this minor difference between STAC API and OAFeat CQL2, so I'll defer to whatever y'all finalize. |
Meeting 2021-08-30: We will separate the case-insensitive comparison support into a separate requirements class. This satisfies the preference from the STAC community and also helps to separate this capability where we had lots of discussions into a separate module. What is still open is the details on how we provide the capability (still a @pvretano action). @cportele will introduce the new requirements class in the document. |
Thanks! |
closes #579, the open issue how to implement case-insensitive comparisons will be transferred to a new issue.
In the context of STAC Items, the biggest piece missing from the Part 1 "items" endpoint is a formalized mechanism for filtering based on fields in the Feature/Item's Properties. Implementations can support arbitrary query parameters to support this, but there's no "queryables" mechanism to discover what fields can be filtered on. For example, a common filter criterion is cloud cover, and there's no well-defined way to express a simple cloud_cover <= 10. An implementer may desire to only support this one predicate, and is satisfied with using the existing datetime and bbox for spatial and temporal filtering.
However, with the current Simple CQL conformance class, an implementer would be forced to support more than just simple comparison operations (e.g., gt, lt, etc.) and would also need to support all the logical operators and the more complex comparison operators, IN, LIKE, BETWEEN and IS NULL. (They would not actually need to implement INTERSECTS or ANYINTERACTS if they don't advertise any queryables of spatial or temporal types.) This makes conforming to the CQL spec much more difficult than the needs of the implementation require, and will likely lead to implementers doing something proprietary or, even worse, only partially implementing CQL but advertising that it is conformant!
Additionally, I don't like the use of "Simple" for the class, as it is still a pretty complex language. I think a word like "basic" would be more appropriate here. Also, this class has "CQL" in the name, whereas none of the other "operator" classes do.
I'd like to propose to divide Simple CQL into these separate conformance classes:
=, <, <=, >, >=
IN, LIKE, BETWEEN, IS NULL
and, or, not (these could also go into a "basic logical and comparison operators" class, but it feels a little cleaner to split them out)
intersects
anyinteracts
While this does introduce some overhead, in that now an implementation has to advertise 5 classes instead of 1, I think there is a large benefit in allowing implementations with limited needs to properly implement a subset of these conformance classes, instead of simply not conforming to CQL at all. I think that this will result in more robust implementations and provide an "on-ramp" for an implementation to support more and more of the CQL conformance classes over time.
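A minimal sketch of how such a split could be used in practice: a server advertises a subset of classes, and a filter is checked against the operators those classes cover. The operator groups follow the five lists above; the class names and function names are placeholders invented for this example, not names from the spec.

```python
# Operator groups follow the proposal's five lists; class names are placeholders.
CLASSES = {
    "basic-comparison": {"=", "<", "<=", ">", ">="},
    "advanced-comparison": {"IN", "LIKE", "BETWEEN", "IS NULL"},
    "logical": {"AND", "OR", "NOT"},
    "basic-spatial": {"INTERSECTS"},
    "basic-temporal": {"ANYINTERACTS"},
}


def supported_operators(advertised):
    """Union of the operators covered by the advertised classes."""
    ops = set()
    for name in advertised:
        ops |= CLASSES[name]
    return ops


def unsupported(filter_ops, advertised):
    """Operators used by a filter that the advertised classes do not cover."""
    return set(filter_ops) - supported_operators(advertised)


# A server that only wants to support predicates like cloud_cover <= 10.
advertised = ["basic-comparison"]

print(unsupported(["<="], advertised))
print(unsupported(["<=", "AND", "INTERSECTS"], advertised))
```

With a single monolithic class, the second filter's server would have to either implement everything or advertise nothing; with the split, it can advertise exactly what it supports and reject the rest.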
Thanks for your consideration of this!