Add Sentinel-1 Subpackage #84

scottyhq · 2021-03-29T10:36:22Z

As described in #82 this adds an initial subpackage for sentinel-1 SAR data. I'm currently only working with a derived public dataset (https://registry.opendata.aws/sentinel-1-rtc-indigo/), so this PR mainly adds a create_item function for that dataset. But I think the framework will help to add GRD, SLC, and other datasets in the future.

I've tried to roughly follow the framework in the sentinel2 subpackage, and match the STAC 'sentinel:' properties from https://registry.opendata.aws/sentinel-2-l2a-cogs/ since both datasets use the MGRS grid, and hopefully this will help with interoperability.

Happy to refine this with input @matthewhanson @lossyrob !

stactools_sentinel1/stactools/sentinel1/rtc_metadata.py

scottyhq · 2021-03-31T00:31:36Z

elaborating on @kylebarron 's comment on footprint accuracy, below is a QGIS screenshot that illustrates the partial extent of an RTC product (bright foreground image), (covering a MGRS grid square (pink), GRD frame footprints (gray), GRD product with nodata (transparent+black). The lateral nodata of the frame footprints will vary with geographic location, in this case ~5km.

(https://sentinel-s1-rtc-indigo.s3.us-west-2.amazonaws.com/tiles/RTC/1/IW/12/S/YJ/2021/S1A_20210105_12SYJ_DSC/Gamma0_VV.tif)

also more information on GRD nodata here (scroll to bottom) https://asf.alaska.edu/how-to/data-recipes/geocode-sentinel-1-with-qgis-3-x/

matthewhanson · 2021-03-31T01:57:31Z

I spent some time looking into this for Sentinel-1 a while back and used rasterio to get the footprints. I also tried using overviews to generate the footprints so it would be faster and wouldn't require reading as much of the file, but it really worked quite a bit better using the full resolution file.

The resulting geometry tends to be very large and way too big, so you have to simplify it. It's a balancing act to simplify it enough to you still have the detail needed while keeping the geometry size small.

You can see in the gifs below the difference between the provided geometry (blue) and the one calculated with the above code (red)

This is a typical image and the provided boundary overestimates the area cross-track, and underestimates it along-track.

This is a more unusual collect that is along the water where some of the water was masked out in the data, and we see the situation is even worse. The generated footprint in red is far better.

There might be edge cases where this would fail, thus before going operational I think it should be used to generate a few hundred boundaries to be visually inspected.

If the image were already downloaded or locally available, it's not that bad to run this processing and certainly worth it for the better footprint.

The code is here:
https://gist.github.com/matthewhanson/6be66c97c828acd1d39d8cbb97a0981e

This approach could also be used for Landsat: #83

vincentsarago · 2021-03-31T16:08:20Z

If the image were already downloaded or locally available, it's not that bad to run this processing and certainly worth it for the better footprint.

just make sure to simplify the geometry or you could have hundreds/thousands points ... which will result in a really big stac items. This is especially true for GRD data where you can see that pixels don't follow a straight line on the border.

Ref: https://github.com/vincentsarago/landsat-footprint/blob/master/landsat_footprint/scripts/cli.py#L113-L120

Edits: 🤦 just seeing that you have a default simplify options set to 0.005 in the script you shared.

Another note, is that you need to assume the nodata/mask value is correct in the data.

matthewhanson · 2021-04-01T08:00:06Z

@scottyhq Looking at your implementation, I see options for scale and precision but not simplify. Did you find that using the right combo of scale and precision precluded the need to simplify the geometry?

scottyhq · 2021-04-01T09:27:05Z

Thanks for the examples everyone! it does seem like there is some potential for a common extract_footprint function in stactools.core ...

I see options for scale and precision but not simplify. Did you find that using the right combo of scale and precision precluded the need to simplify the geometry?

My last commit has a slightly modified footprint version that i've been using (main difference being the use of convex_hull which i like because it guarantees the footprint geometry encompasses the pixels). I suppose you might still want to run simplify()... I'll plan to create a test static catalog with 10-100 items to test. I tend to use lat/lon decimal precisions of 4 or 5 (corresponding to ~10 to 1 m geocoding precision) since RTC has 20m posting. At the end of the day, it's STAC-api clients using these footprints to return data intersecting some query, right, so slight overestimates are OK? I have a hunch if you want to be as precise as possible with the GRD footprints because they span ~250km you might want to 1. warpedVRT to a mercator projection centered on the approx footprint center, 2. apply shapely functions, 3. transform points to lat/lon?

I have some other general questions, so also pinging @m-mohr to get feedback on stac-extensions. I think the RTC dataset would benefit from using a few other extensions that aren't yet implemented in pystac. In particular:

@matthewhanson is the idea with new sentinel2 catalogs to use the MGRS extension?
I'd like to specify NoData=0 for each asset, not sure where that goes... seems like https://github.com/stac-extensions/raster
Another thing that would nice to capture in this metadata is dtype and scaling (for local incident_angle.tif "Angle (degrees) is scaled and converted to UInt16 to save storage space. For example, 4500 is 45 degrees."

scottyhq · 2021-04-02T07:02:12Z

Pushed forward a little more today but haven't had time to look closely... test catalog is here https://github.com/scottyhq/sentinel1-rtc-stac

I'm able to run stac browse (which is so awesome by the way!) and navigate. I expect the COGs to render, but I'm seeing tracebacks in the logs such as :

(abbreviated)

tiles_1    | 172.19.0.1:59252 - "GET /cog/tiles/8/38/89?url=https://sentinel-s1-rtc-indigo.s3.us-west-2.amazonaws.com/tiles/RTC/1/IW/10/U/CU/2017/S1A_20170101_10UCU_ASC/Gamma0_VV.tif HTTP/1.1" 500
tiles_1    | [2021-04-02 06:55:03 +0000] [58] [ERROR] Exception in ASGI application
tiles_1    | Traceback (most recent call last):

tiles_1    |   File "/usr/local/lib/python3.8/site-packages/titiler/endpoints/factory.py", line 334, in tile
tiles_1    |     content = image.render(
tiles_1    |   File "/usr/local/lib/python3.8/site-packages/rio_tiler/models.py", line 262, in render
tiles_1    |     return render(self.data, self.mask, img_format=img_format, **kwargs)
tiles_1    |   File "/usr/local/lib/python3.8/site-packages/rio_tiler/utils.py", line 381, in render
tiles_1    |     dst.write(mask.astype(data.dtype), indexes=count + 1)
tiles_1    |   File "rasterio/_base.pyx", line 394, in rasterio._base.DatasetBase.__exit__
tiles_1    |   File "rasterio/_base.pyx", line 385, in rasterio._base.DatasetBase.close
tiles_1    |   File "rasterio/_io.pyx", line 2111, in rasterio._io.BufferedDatasetWriterBase.stop
tiles_1    |   File "rasterio/_err.pyx", line 215, in rasterio._err.exc_wrap_pointer
tiles_1    | rasterio._err.CPLE_NotSupportedError: PNG driver doesn't support data type Float32. Only eight bit (Byte) and sixteen bit (UInt16) bands supported.  
tiles_1    | PNG driver doesn't support data type Float32. Only eight bit (Byte) and sixteen bit (UInt16) bands supported.

@vincentsarago any suggestions there? I thought titiler automatically scaled Float32s to 0:255 ?

m-mohr · 2021-04-02T16:05:43Z

@scottyhq For your questions 2 and 3, both will soon be in the raster extension (per band). Currently the file extension has some of the fields, but those will be removed soon.
Not sure whether there are other questions I can help with?

vincentsarago · 2021-04-02T16:36:25Z

@vincentsarago any suggestions there? I thought titiler automatically scaled Float32s to 0:255 ?

@scottyhq no titiler doesn't do that, it's up to the client to tell the tiler what to do!

matthewhanson · 2021-04-02T16:40:29Z

@scottyhq I've tried convex hull and for our use case overestimating was not what we wanted because it could lead to extracting AOIs that were completely empty. Convex hull could overestimate a lot, such as in my second gif above.

If this were a general function would be good to have the option of taking the convex hull though, for some uses overestimating might be better.

For the grid info, yes it should use the new MGRS extension.

scottyhq · 2021-04-09T00:07:42Z

@scottyhq I've tried convex hull and for our use case overestimating was not what we wanted because it could lead to extracting AOIs that were completely empty. Convex hull could overestimate a lot, such as in my second gif above.

So far I've found convex_hull to be the best way to guarantee 1) valid data encompassed and 2) less than 100 points in footprint. Coastal areas with strange no-data regions over water lead to over-estimated footprints with convex hull (screenshot below), but over land the footprints end up being tight and simple. The jagged edges due to terrain correction, especially in high-relief mountainous areas create quite complex footprints even when using simplify(). So agreed, if there is a generic footprint-extraction function it would need to have several keyword options.

I'm pretty happy with how this is looking now with the test catalog in https://github.com/scottyhq/sentinel1-rtc-stac, as noted in #86 if you want to use stac browser locally with the amplitude images it's best to add an expression and rescaling: &expression=sqrt(b1)&rescale=0,1:

scottyhq · 2021-04-16T00:02:19Z

@matthewhanson if you have a chance to review, could you look this over? I suspect a follow-up PR adding support for S1 GRD or SLC could refactor things a bit, but for now this seems OK

gadomski

I haven't checked any of the bounds stuff yet, but if you can't get to it @matthewhanson lmk and I'll dig in and make my best guess on what looks good.

@scottyhq one question, do the VRTs in the test data directory need to have a .tif extension, or can they have .vrt? Just a little initially confusing to see the extension/filetype mismatch.

stactools_sentinel1/requirements.txt

stactools_sentinel1/setup.py

stactools_sentinel1/stactools/sentinel1/rtc_metadata.py

stactools_sentinel1/stactools/sentinel1/stac.py

Co-authored-by: Pete Gadomski <[email protected]>

scottyhq · 2021-05-26T10:48:14Z

finally figured out how to deploy a stac browser on github pages (https://scottyhq.github.io/sentinel1-rtc-stac) for the test catalog mentioned above. I think having a CI-deployed browser like this is mighty convenient for review of new packages! Would appreciate any other pointers on getting this merged.

gadomski

I took a glance at the bounds and they seemed reasonable to me. We can always tweak them later if it is necessary, but for now I'm approving to get this in.

scottyhq · 2021-05-26T22:27:23Z

CI tests timed out with ERROR: Cannot install stactools, stactools-sentinel1==0.1.4, stactools==0.1.5 and stactools[all]==0.1.5 because these package versions have conflicting dependencies. So I bumped the version to 0.1.5, but looks like it needs to be re-approved to run tests

scottyhq · 2021-05-26T22:42:17Z

@matthewhanson or @rbavery would love to get a second pair of eyes on the metadata fields from someone who has worked with S1 data, you can browse the test catalog here: https://scottyhq.github.io/sentinel1-rtc-stac.

gadomski · 2021-05-27T13:04:47Z

I think @matthewhanson is PTO today, so I'll wait for @rbavery's review before merging. Thanks for the fixup @scottyhq!

rbavery · 2021-05-27T18:16:32Z

@scottyhq @gadomski I might be missing something, but for this example scene why does the metadata field Polarizations list VV,HH but the Assets tab shows Gamma0 VV backscatter and Gamma0 VH backscatter?

rbavery · 2021-05-27T18:24:47Z

Also, as a newcomer to working with SAR, I'm uninformed when it comes to the differences between different RTC products. For example, the ASF Vertex publishes two RTC products, a version relying on GAMMA+SNAP and a version relying only on SNAP: https://hyp3.asf.alaska.edu/about#list-rtcs1tbx

I assume not all RTC is equal, from an email from @scottyhq it sounds like GAMMA RTC is preferred for time series analysis. So can we capture information about the quality/type of RTC in the metadata, as opposed to only listing RTC for the Product Type field?

These are my only two questions, everything else looks great.

scottyhq · 2021-05-27T23:34:54Z

I might be missing something, but for this example scene why does the metadata field Polarizations list VV,HH but the Assets tab shows Gamma0 VV backscatter and Gamma0 VH backscatter?

Good catch, I'll fix that. This also made me realize the STAC Browser URLs are unfortunately not 'human-friendly' right now (radiantearth/stac-browser#46) so linking to that issue.

I assume not all RTC is equal, from an email from @scottyhq it sounds like GAMMA RTC is preferred for time series analysis. So can we capture information about the quality/type of RTC in the metadata, as opposed to only listing RTC for the Product Type field?

sar:product_type in the STAC extension could be any string I believe, but is generally an abbreviated identifier. But your question identifies a critical need to link to original product documentation is somewhere in the STAC catalog. Right now under 'providers' there is a link to https://registry.opendata.aws/sentinel-1-rtc-indigo, and then there is a link on that page to methodology docs https://sentinel-s1-rtc-indigo-docs.s3-us-west-2.amazonaws.com/methodology.html

There will likely be even more software options for SAR in the coming years. It's not that there is preferred software, but that the processing approach (DEM, parameters, algorithms) may vary. Ideally the entire processing pipeline could be open-source (e.g. hyp3 + SNAP or ISCE), and at the very least the product documentation should capture the full workflow.

happy to wait on comments from @matthewhanson before merging.

matthewhanson · 2021-06-03T04:03:04Z

stactools_sentinel1/stactools/sentinel1/stac.py

+    granule_href: str,
+    asset: str = 'local_incident_angle.tif',
+    additional_providers: Optional[List[pystac.Provider]] = None,
+    include_grd_metadata: Optional[bool] = False,


I think it's important to link back to the original metadata files as assets, so I think the default here should be True

matthewhanson · 2021-06-03T04:25:35Z

Thanks for all this work and patience @scottyhq , it's looking really awesome. Looking forward to giving it a spin and indexing the dataset.

I left a comment, but now i'm not sure about it. The productInfo.json files aren't really for these Items, they are for the SLC data from which these were derived. So the proper behavior would be to include a link with rel=derived_from to point to the STAC Item from which it came from. But the SLC data have no STAC catalog.

The other link type to consider including is via, which should point to the metadata used to create the STAC metadata. In this case though it's the the tifs themselves that are the source of metadata. But is a lot of that metadata simply copied over from the productInfo.json file? If so, then I think it would be better to link to the productInfo.json file(s) with a via link and leave them off the assets.

Regarding the projection extension info - I find proj:bbox in that extension redundant. Everything we need to know can be determined from the geotransform, so it would be my preference to not include it.

scottyhq · 2021-06-07T20:48:51Z

Thanks for looking this over @matthewhanson ! My plan is to do one more pass late this week or next, re-generate the test catalog with more scenes, then hopefully get this merged.

The productInfo.json files aren't really for these Items, they are for the SLC data from which these were derived. So the proper behavior would be to include a link with rel=derived_from to point to the STAC Item from which it came from. But the SLC data have no STAC catalog.

I'll have to brush up on STAC link types which I don't know much about. Unfortunately for this case a lot of metadata is coming from the processing workflow (GRDs rather than SLCs for this data). I think the key is just to provide the list of GRD frame ids so that people can get at the original files and metadata if needed from ESA.

But the SLC data have no STAC catalog.

What's your opinion on using NASA CMR STAC here? The GRD and SLC frames have a mirrored long term archive at NASA and as a result minimal STAC metadata for the original data does exist at https://cmr.earthdata.nasa.gov/stac/ASF/collections/SENTINEL-1A_DP_GRD_HIGH.v1 . A relevant issue though is needing a straightforward mapping of the granule ids (nasa/cmr-stac#21 (comment)).

gadomski · 2021-06-11T18:42:27Z

Closing in favor of https://github.com/stactools-packages/sentinel1.

scottyhq added 2 commits March 29, 2021 21:14

initial pass at s1 subpackage

0bcadfd

formatting fixes

4d52698

kylebarron reviewed Mar 29, 2021

View reviewed changes

stactools_sentinel1/stactools/sentinel1/rtc_metadata.py Outdated Show resolved Hide resolved

kylebarron reviewed Mar 29, 2021

View reviewed changes

stactools_sentinel1/stactools/sentinel1/rtc_metadata.py Outdated Show resolved Hide resolved

kylebarron reviewed Mar 29, 2021

View reviewed changes

stactools_sentinel1/stactools/sentinel1/rtc_metadata.py Outdated Show resolved Hide resolved

This was referenced Mar 31, 2021

ASTER, Landsat 8 C2 L2, Sentinel 2 L2A #83

Merged

Add method for generating data footprints #85

Closed

modified footprint extraction

8fccbf5

add create_collection command

b26a370

scottyhq mentioned this pull request Apr 5, 2021

Allow more customization of 'stac browse' command #86

Closed

scottyhq added 2 commits April 8, 2021 23:47

update footprint extraction, formatting

0b99e38

Merge branch 'master' into sentinel1

7d678a4

Merge branch 'master' into sentinel1

f12b6ea

gadomski requested changes Apr 28, 2021

View reviewed changes

gadomski added the new subpackage label Apr 28, 2021

gadomski mentioned this pull request Apr 28, 2021

What should go in a subpackage's requirements.txt? #95

Closed

scottyhq and others added 3 commits April 29, 2021 09:14

Apply suggestions from code review

a759469

Co-authored-by: Pete Gadomski <[email protected]>

Merge branch 'master' into sentinel1

b5324c1

option to specify asset to read geotiff metadata from

4ffb716

gadomski mentioned this pull request May 11, 2021

Reorganize packaging #111

Merged

gadomski added this to the v0.1.6 milestone May 11, 2021

gadomski linked an issue May 11, 2021 that may be closed by this pull request

Add Sentinel 1 subpackage #82

Closed

scottyhq requested a review from gadomski May 26, 2021 11:09

Merge branch 'master' into sentinel1

7a1cf27

gadomski approved these changes May 26, 2021

View reviewed changes

bumpy to 0.1.5

05d1e6a

gadomski requested review from matthewhanson and removed request for matthewhanson May 27, 2021 13:03

gadomski requested a review from matthewhanson May 27, 2021 23:36

scottyhq and others added 2 commits May 28, 2021 09:53

note on GRD properties, fix HH -> HV

f81f252

Merge branch 'master' into sentinel1

29d72db

matthewhanson reviewed Jun 3, 2021

View reviewed changes

gadomski removed this from the v0.1.6 milestone Jun 9, 2021

gadomski mentioned this pull request Jun 11, 2021

Move all "new package" pull requests to their own repositories #134

Closed

3 tasks

gadomski closed this Jun 11, 2021

scottyhq mentioned this pull request Jun 11, 2021

Add link to original discussion of this package stactools-packages/sentinel1#1

Closed

This was referenced Jun 17, 2021

Add Sentinel 1 subpackage #82

Closed

Allow more customization of 'stac browse' command stactools-packages/browse#1

Open

scottyhq mentioned this pull request Sep 1, 2021

Merge with stactools-packages/sentinel1 stactools-packages/sentinel1-grd#2

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Sentinel-1 Subpackage #84

Add Sentinel-1 Subpackage #84

scottyhq commented Mar 29, 2021

scottyhq commented Mar 31, 2021

matthewhanson commented Mar 31, 2021

vincentsarago commented Mar 31, 2021 •

edited

Loading

matthewhanson commented Apr 1, 2021

scottyhq commented Apr 1, 2021

scottyhq commented Apr 2, 2021

m-mohr commented Apr 2, 2021 •

edited

Loading

vincentsarago commented Apr 2, 2021

matthewhanson commented Apr 2, 2021

scottyhq commented Apr 9, 2021

scottyhq commented Apr 16, 2021

gadomski left a comment

scottyhq commented May 26, 2021

gadomski left a comment

scottyhq commented May 26, 2021

scottyhq commented May 26, 2021

gadomski commented May 27, 2021

rbavery commented May 27, 2021

rbavery commented May 27, 2021 •

edited

Loading

scottyhq commented May 27, 2021

matthewhanson Jun 3, 2021

matthewhanson commented Jun 3, 2021

scottyhq commented Jun 7, 2021

gadomski commented Jun 11, 2021

Add Sentinel-1 Subpackage #84

Add Sentinel-1 Subpackage #84

Conversation

scottyhq commented Mar 29, 2021

scottyhq commented Mar 31, 2021

matthewhanson commented Mar 31, 2021

vincentsarago commented Mar 31, 2021 • edited Loading

matthewhanson commented Apr 1, 2021

scottyhq commented Apr 1, 2021

scottyhq commented Apr 2, 2021

m-mohr commented Apr 2, 2021 • edited Loading

vincentsarago commented Apr 2, 2021

matthewhanson commented Apr 2, 2021

scottyhq commented Apr 9, 2021

scottyhq commented Apr 16, 2021

gadomski left a comment

Choose a reason for hiding this comment

scottyhq commented May 26, 2021

gadomski left a comment

Choose a reason for hiding this comment

scottyhq commented May 26, 2021

scottyhq commented May 26, 2021

gadomski commented May 27, 2021

rbavery commented May 27, 2021

rbavery commented May 27, 2021 • edited Loading

scottyhq commented May 27, 2021

matthewhanson Jun 3, 2021

Choose a reason for hiding this comment

matthewhanson commented Jun 3, 2021

scottyhq commented Jun 7, 2021

gadomski commented Jun 11, 2021

vincentsarago commented Mar 31, 2021 •

edited

Loading

m-mohr commented Apr 2, 2021 •

edited

Loading

rbavery commented May 27, 2021 •

edited

Loading