Protection from large Coverage API requests #54
How would servers be able to evaluate whether a request was too big? |
For coverages that are regular grids, the response volume can often be computed from the bounding box size and the time period requested. For servers in the cloud, a response may also be too big for cost reasons, so the next step would be to estimate egress and processing charges. For multi-point coverages, this may be more difficult. |
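For the regular-grid case, the estimate described above could look roughly like the sketch below. The function name, the argument layout, and the defaults (one band, 4 bytes per cell) are assumptions made for illustration; none of them come from the specification or from any particular server.

```python
import math


def estimate_response_bytes(bbox, resolution, time_steps, bands=1, bytes_per_cell=4):
    """Estimate the uncompressed size of a regular-grid coverage subset.

    bbox: (min_x, min_y, max_x, max_y) in CRS units (hypothetical layout)
    resolution: (res_x, res_y) cell size in the same units
    time_steps: number of time slices covered by the requested period
    """
    min_x, min_y, max_x, max_y = bbox
    res_x, res_y = resolution
    # Number of grid columns and rows touched by the bounding box.
    cols = math.ceil((max_x - min_x) / res_x)
    rows = math.ceil((max_y - min_y) / res_y)
    cells = cols * rows * time_steps
    return cells * bands * bytes_per_cell


# 10 x 10 degrees at 0.25 degree resolution, 365 daily steps, one 4-byte band.
size = estimate_response_bytes((0, 40, 10, 50), (0.25, 0.25), 365)
print(size)  # 40 * 40 * 365 * 4 = 2336000 bytes
```

A cloud deployment could feed the same number into its own egress or processing cost model. Multi-point coverages would need a different estimate.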
API-Common Part 2 will allow a server to place a limit on the size of a response. If a response will exceed that limit, then the response is "paged", returning a partial response and the metadata necessary to retrieve the rest. With some adjustments, this existing capability can be applied to coverages. |
@cmheazel @clynnes I suggest that this is split into two issues: response too big, and asynchronous response. See async text in the EDR API |
@chris-little So far this discussion has not touched on async. However, I have captured the async issue under API-Common issue 231. |
SWG 2021-05-26: Agreed that service-metadata is the best place to describe API-wide limits, expressed as a maximum number of cells and/or uncompressed megabytes of data that the server is willing to return. Furthermore, we need a permission allowing the server to reject a client request (some standardized 400 code?) if the client is asking for too much data. |
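As a rough sketch of how a server might apply such limits: the `ServiceLimits` class and its `max_cells` / `max_megabytes` fields are hypothetical names chosen for this example, not properties defined by the specification. The 400 status is only the candidate being discussed in this thread.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ServiceLimits:
    """API-wide limits a server might advertise (hypothetical field names)."""
    max_cells: Optional[int] = None
    max_megabytes: Optional[float] = None


def check_request(cells: int, bytes_per_cell: int, limits: ServiceLimits) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for a request of the given size."""
    if limits.max_cells is not None and cells > limits.max_cells:
        return 400, f"Request covers {cells} cells; limit is {limits.max_cells}"
    megabytes = cells * bytes_per_cell / 1_000_000  # uncompressed size
    if limits.max_megabytes is not None and megabytes > limits.max_megabytes:
        return 400, f"Request is about {megabytes:.1f} MB; limit is {limits.max_megabytes} MB"
    return 200, "OK"


limits = ServiceLimits(max_cells=1_000_000, max_megabytes=100)
print(check_request(2_000_000, 4, limits))  # rejected: too many cells
print(check_request(500_000, 4, limits))    # accepted: 2 MB, under both limits
```

A client that reads the same limits from the service metadata could run this check itself before sending the request.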
413 Payload Too Large ~ https://datatracker.ietf.org/doc/html/rfc7231#section-6.5.11 |
@nmtoken As we discussed in the call, I believe 413 as defined might apply to the payload provided by the client, as opposed to the payload being returned with the response, though that should be clarified. |
@jerstlouis Sorry, wasn't on the call and missed the discussion. Now you say it, I agree, it could mean that too or either may apply. My next pick would be 400 Bad Request ~ https://datatracker.ietf.org/doc/html/rfc7231#section-6.5.1
|
As a related aside to this, I would like to suggest that we consider that for servers supporting the scaling conformance classes, that a default e.g. https://maps.ecere.com/ogcapi/collections/SRTM_ViewFinderPanorama/coverage?scaleFactor=1 returns an error, but |
not a good idea to change the behavior of Core in a way that is not intuitive and not consistent. And makes it more complicated to understand. Why not focus energy on a good set of examples, which would be extremely beneficial... |
@pebau while I agree on the examples, that is a separate topic. The reason I make this suggestion here is that the link to Principle 23 of the OGC API Web Guidelines mentions:
So I really like the idea of being able to return a downsampled coverage by default if the full resolution is more than the server can process or the client expects to receive. I also believe it is part of the solution to the problem raised in this issue by @clynnes.
The specification now has conformance classes for
We should probably add this as a new requirement?
We were just discussing adding this to service metadata at the API level.
We were just discussing 400 is likely the proper error response for this.
Discussed in #66 -- multiple OGC API specifications are going to use the Prefer header to indicate that an async response is desired, and this could be defined in Common. |
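For reference, RFC 7240 defines the `respond-async` preference for the Prefer header. A minimal sketch of how a server could detect it follows. The helper names are made up for this example, and the `headers` argument is assumed to be a plain dict keyed by `"Prefer"`. How OGC API - Common will actually specify this is not settled in this thread.

```python
def parse_prefer(header_value):
    """Parse a Prefer header value into a {preference: value} dict.

    Handles the common 'token' and 'token=value' forms; any ';' parameters
    attached to a preference are ignored in this simplified sketch.
    """
    prefs = {}
    for item in header_value.split(","):
        token = item.split(";", 1)[0].strip()
        if not token:
            continue
        name, _, value = token.partition("=")
        prefs[name.strip().lower()] = value.strip().strip('"')
    return prefs


def wants_async(headers):
    """True if the client sent 'Prefer: respond-async' (RFC 7240)."""
    return "respond-async" in parse_prefer(headers.get("Prefer", ""))


print(wants_async({"Prefer": "respond-async, wait=10"}))  # True
print(wants_async({}))                                    # False
```

A server honouring the preference would typically answer with 202 Accepted and a link the client can use to check on the result.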
@jerstlouis you commonly "like the idea" - let me note, for the record, that others do not necessarily like it (in this case: at least me). In my world, more technically based arguments tend to be used. |
@pebau I included the reason why I like the idea: to avoid a default behavior returning an error or returning more data than the client (or webcrawler) intended to request. I understand your opinion and concerns about the service consistency, I think it's a question of balancing perceived consistent service logic vs. the implications of having a default behavior which is normally not what is intended by users/clients. The webcrawler case is probably the best reason to go for this -- this was much less of a concern with WCS as the GetCoverage request link was not exposed in the same web-friendly way as it is with OGC API - Coverages. (and returning an error to indexing crawlers negatively affects SEO and thus Findability). |
@jerstlouis I do not share the idea that users would intend what is sketched (leaving aside that a complete specification is still missing), also with webcrawlers. If this is a use case, how is that solved by other tools and services, in the geo and other domains? Any investigation available? |
Admittedly we haven't actually had any real clients yet for our OGC API - Coverages implementation, but I can tell you that of all the people within our company that I got to test drive what's been implemented so far, they've all complained that our /coverage endpoint is broken because they get a "400 Bad Request" response. So in this respect it seems to me that having endpoints that actually work without having to append parameters to them would be a good thing. If a client wants to explicitly request the full-scale coverage, it always can do so by adding I don't feel strongly about this one way or another, and am happy to let the industry decide, but I just figured I'd share my own personal experience and opinion in this matter. |
SWG 2021-11-25: We agreed to include permissions and requirements along these lines: Permission: A server advertising conformance for the scaling conformance class MAY return a downsampled version of the coverage, if the client does not explicitly request Requirement: Even if the scaling conformance class is not supported, the server SHALL understand requests specifying If a client wants to page and the server supports the Coverage Tiles conformance class, that can be used and the client can negotiate e.g. its preferred TileMatrixSet to page/request in the most convenient way. |
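One possible way for a server to pick a default downsampling level under such a permission is sketched below. It assumes an integer factor in the style of the `scaleFactor` parameter seen in the example URL earlier in the thread, with 1 meaning full resolution. The function name and the `max_cells` limit are hypothetical, and real servers may choose factors quite differently (for example, snapping to existing overviews).

```python
import math


def default_scale_factor(width, height, max_cells):
    """Smallest integer factor f >= 1 with ceil(width/f) * ceil(height/f) <= max_cells."""
    if max_cells < 1:
        raise ValueError("max_cells must be at least 1")
    factor = 1
    # Increase the factor until the downsampled grid fits within the limit.
    while math.ceil(width / factor) * math.ceil(height / factor) > max_cells:
        factor += 1
    return factor


print(default_scale_factor(1000, 1000, 250_000))  # 2: 500 x 500 cells fits
print(default_scale_factor(100, 100, 10_000))     # 1: full resolution fits
```

A client that explicitly asks for full resolution would bypass this default and be subject to whatever limit check the server applies.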
Just for the record, I am strongly against this. |
2022-01-12: Suggestion from @tomkralidis to have a boolean property in the metadata indicating that the default Motion to update the spec to include permissions to return downsampled data, introduce this boolean property, and requirement to accept |
Question regarding your suggestion
There is a permission now in place.
and a Note:
I am wondering now how to implement this suggestion, and having second thoughts about whether it is really necessary / useful. The only place we could include this metadata (e.g., My feeling, as per the note, is that any client requesting Adding more things to the collection description complicates things. In OGC API - Maps we already have this expectation that Thoughts about whether it is necessary or not to include something like |
@jerstlouis so to summarize:
That works for me, we should have some informative text on this behaviour (probably something similar in OAMaps while we're at it). |
For large datasets, such as satellite products covering long periods of time, it is easy to submit a synchronous coverage request that cannot be fulfilled within typical timeout limits. Even if a large request CAN be fulfilled, the user may end up receiving a much larger data volume than expected. There is likely no single solution that fits the diversity of users and servers. An assemblage of solutions might include the following capabilities: