Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Do not use SENSING_TIME for Temporal/RangeDateTime field #50

Merged
merged 2 commits into from
Feb 6, 2025

Conversation

ceholden
Copy link
Collaborator

@ceholden ceholden commented Feb 6, 2025

Background

When running HLS-VI jobs for Landsat data products I encountered an error during the STAC metadata generation,

Traceback (most recent call last):
  File "/usr/local/bin/vi_generate_stac_items", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/hls_vi/generate_stac_items.py", line 321, in main
    version=args.version,
  File "/usr/local/lib/python3.6/site-packages/hls_vi/generate_stac_items.py", line 301, in create_item
    item = cmr_to_item(hls_vi_metadata, endpoint, version)
  File "/usr/local/lib/python3.6/site-packages/hls_vi/generate_stac_items.py", line 266, in cmr_to_item
    item_datetime = datetime.datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S.%fZ")
  File "/usr/lib64/python3.6/_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib64/python3.6/_strptime.py", line 362, in _strptime
    (data_string, format))
ValueError: time data '2025-02-04T14:47:18.1297510Z' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'

This was caused by trying to parse the SENSING_TIME from the (non-VI) HLS version of the *cmr.xml. I can't quite grok the origin of the bug here, but the metadata in the HDF has 1 extra digit (7 instead of 6) in the %f part of the datetime stamp. We already fixed this as part of the HLS metadata creation code here,
https://github.com/NASA-IMPACT/hls-metadata/blob/df1133d29a13426728ec2f7db644cd6b6af7e014/metadata_creator/metadata_creator.py#L283C41-L295

So this PR simply avoids the buggy SENSING_TIME definition because that issue is unique to Landsat (Sentinel-2 is fine). This way we don't have to conditionally fix for Landsat what has already been fixed upstream. This value seems safe because we're using datetime.strftime to write the value in hls-landsat-tile, but I could be missing some complication (e.g., that we tried to resolve in #35)

To try to further confirm I pulled ~8,000 (and counting) of the Landsat *cmr.xml from our prod forward production bucket and inspected the Temporal/RangeDateTime/BeginningDateTime field. All of these fields were formatted nicely as %Y-%m-%dT%H:%M:%S.%fZ

How I did it

  • I deleted the code that updated the Temporal/RangeDateTime field in the HLS-VI version of the CMR.xml file because we inherit the basics from the HLS product and don't need to override that.
  • With that removed, I deleted the (very nicely tested) parse_sensing_time code as cleanup

How you can test it

The authoritative way is via the container,

make test

but this is quicker if you have a local dev setup,

pytest tests/tests_vi.py

@ceholden ceholden marked this pull request as ready for review February 6, 2025 20:37
Copy link
Collaborator

@chuckwondo chuckwondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Why did we originally think we needed to do this? (facepalm)

@ceholden ceholden merged commit f3892ba into main Feb 6, 2025
1 check passed
@ceholden ceholden deleted the fix/rangedatetime-avoid-buggy-sensing_time branch February 6, 2025 21:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants