-
-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shift pixel coordinates of tiles using proper affine transformation #150
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Extends xarray with a special accessor for automated projection parsing! Package repository at https://github.com/fmaussion/salem. Also added required dependency joblib.
Patches 863d3be with a relatively easy fix in data_prep.ascii_to_xyz. [pyproj](https://github.com/pyproj4/pyproj) v2.2.0 now defines a CRS from string and does reprojection a bit differently than before in v2.1.3. See also: - [Release notes](https://github.com/pyproj4/pyproj/releases/tag/v2.2.0rel) - [Commits]pyproj4/pyproj@v2.1.3rel...v2.2.0rel)
Bumps [rasterio](https://github.com/mapbox/rasterio) from 1.0.13 to 1.0.23. Supersedes #117. Also show (a part of) failing tests in test_ipynb.ipynb that needs to be resolved in #150. - [Release notes](https://github.com/mapbox/rasterio/releases) - [Changelog](https://github.com/mapbox/rasterio/blob/master/CHANGES.txt) - [Commits](rasterio/rasterio@1.0.13...1.0.23)
Adjusting the bounding box positions of (some) tiles by half a pixel (roughly 500m). Fixed a poorly written unit test for the data_prep.get_window_bounds function introduced in ab0295e right before release v0.4.0, and refactored the function to make it work correctly with newer libraries. Strangely enough, after recreating the geojson tiles, the coordinate changes are only for the Operation Ice Bridge/CRESIS Basler and TO grids, and we still have exactly 2347 tiles! Specifically, we've switched from using xarray.open_rasterio to xarray.open_dataarray as xarray's rasterio backend fails to read NetCDF georeferencing information properly with rasterio >= 1.0.14 (see #84). The projection information is monkey-patched into the xarray grid using salem. Most important bit is to ensure that we go from xarray's centre-based pixel coordinate system to rasterio's corner-based pixel coordinate system (accomplished with dataset.salem.grid.corner_grid) via a manually set Affine transformation.
To make the data_prep.selective_tile more understandable, I've 'simplified' the doctest with a diagonal array, and actually used more geographically correct (corner-based) pixel bounds. Also updated test_data_prep.py to use the v0.7.0 release grid instead of v0.4.0. Issue with rasterio.open not having proper affine transformation on netcdf files solved via the most ludricrous method of all - importing xarray before rasterio... Will submit a bug report after this, but in the meantime, the code can stay mostly intact, phew! Next step is to possibly remove salem, and refactor the selective_tile function a bit more actually, by using xarray.open_rasterio and the xarray.DataArray.sel by xy slice method.
Fragments appear to be the same with 63b3ef1?!! Just reuploading to quilt (new hash is 4d3b5a17f63b35212b5a0a210d9b88059a9ff88ad2d5b2de15d9535f48002e13 at https://quiltdata.com/package/weiji14/deepbedmap), not sure if there might be some minor changes. Gave up on refactoring data_prep.get_window_bounds (salem still seems to be needed...) and data_prep.selective_tile (lazy to figure out resampling of xarray grids). Anyways, time to get on with some neural network model training!
weiji14
added a commit
that referenced
this pull request
Jun 7, 2019
Closes #150 Shift pixel coordinates of tiles using proper affine transformation.
weiji14
added a commit
that referenced
this pull request
Jun 13, 2019
Oh the two weeks of background work just to come to this, pygmt.grdtrack and whatnot! Our srgan_train.get_deepbedmap_test_result function now evaluates the trained neural network model's predicted grid directly in memory instead of on a file! Lots of temporary file related boilerplate code removed, woohoo! There's been plenty of background work to get this in-memory grid to be georeferenced correctly (see #150, 7fd3345, 4a074d9), but it all ends up making the evaluation more accurate, and hopefully faster to run and cleaner to scale up. Note also that the deepbedmap.get_image_and_bounds function has been refactored and renamed to get_image_with_bounds. The previous bounds was using xarray's centre-based pixel coordinates when it should have returned rasterio-style corner-based pixel coordinates used by data_prep.selective_tile. This is resolved using good ol' salem, and there's some extra code to handle getting the bounds for multiple inputs (instead of relying on salem's .extent function). The bounds themselves are now stored as an attribute inside the groundtruth xarray grid. As we are returning an xarray.DataArray grid instead of a numpy.array, we can use xarray's .plot() method to plot merged groundtruth grids (that are not on a regular grid) in a less funny way. There is also an 'indexers' parameter introduced to enable manually getting bounding boxes exactly divisible by 4, a quirky requirement of DeepBedMap... The hardcoded bounding box view of Antarctica used in our deepbedmap.feature integration test has been updated in alignment with all the changes above (pixel offsets, manual crops, etc). New 'weiji14/deebedmap/model/test' tiles have been uploaded to quilt, and the new quilt hash to use is df0d28b24283c642f5dbe1a9baa22b605d8ae02ec1875c2edd067a614e99e5a4. Also patching 4a074d9 to fix data_prep.selective_tile's masking not handling fp16 NaN conversions as the Synthetic HighRes geotiff was acting up, and remove NaN checks after gapfilling as it trips up the big DeepBedMap tiller. What else? Phew!
weiji14
added a commit
that referenced
this pull request
Jun 13, 2019
Oh the two weeks of background work just to come to this, pygmt.grdtrack and whatnot! Our srgan_train.get_deepbedmap_test_result function now evaluates the trained neural network model's predicted grid directly in memory instead of on a file! Lots of temporary file related boilerplate code removed, woohoo! There's been plenty of background work to get this in-memory grid to be georeferenced correctly (see #150, 7fd3345, 4a074d9), but it all ends up making the evaluation more accurate, and hopefully faster to run and cleaner to scale up. Note also that the deepbedmap.get_image_and_bounds function has been refactored and renamed to get_image_with_bounds. The previous bounds was using xarray's centre-based pixel coordinates when it should have returned rasterio-style corner-based pixel coordinates used by data_prep.selective_tile. This is resolved using good ol' salem, and there's some extra code to handle getting the bounds for multiple inputs (instead of relying on salem's .extent function). The bounds themselves are now stored as an attribute inside the groundtruth xarray grid. As we are returning an xarray.DataArray grid instead of a numpy.array, we can use xarray's .plot() method to plot merged groundtruth grids (that are not on a regular grid) in a less funny way. There is also an 'indexers' parameter introduced to enable manually getting bounding boxes exactly divisible by 4, a quirky requirement of DeepBedMap... The hardcoded bounding box view of Antarctica used in our deepbedmap.feature integration test has been updated in alignment with all the changes above (pixel offsets, manual crops, etc). New 'weiji14/deebedmap/model/test' tiles have been uploaded to quilt, and the new quilt hash to use is df0d28b24283c642f5dbe1a9baa22b605d8ae02ec1875c2edd067a614e99e5a4. Also patching 4a074d9 to fix data_prep.selective_tile's masking not handling fp16 NaN conversions as the Synthetic HighRes geotiff was acting up, and remove NaN checks after gapfilling as it trips up the big DeepBedMap tiller. What else? Phew!
4 tasks
weiji14
added a commit
that referenced
this pull request
Sep 2, 2019
Towards even better pixel registration at the expense of more computing cost! Patches #150 properly. The xyz_to_grid function has been giving us gridline node registered grids since forever, because that's `gmt surface`'s default setting. Now converting it to pixel node registration to align better with rasterio and other geospatial packages. Also using simple slicing for our high resolution tiles to avoid interpolating to NaNs at the edges! Total tile counts have increased from 3379 to 4028, O.O, mostly from the Basler grid around the Siple Coast region. New training tiles updated on Quilt with hash af86cf135ffe5ed9f78fc65231e3aa0bfc90f45e33b6bccda9f08f392c090113. Though `surface` does have an `-r` setting to set pixel node registration directly, I looked at it and it gave strange diagonal strips for 2007tx.nc, so nope, we'll use `grdsample` instead, the only fault being we lose data lineage/provenance when checking the grid using `gmt grdinfo`. Hopefully the half pixel ghost doesn't haunt us ever again. Also note to self, to write up wrappers for `gmt blockmedian` and `gmt grdsample` to reduce boilerplate code in xyz_to_grid function. Reason for doing this was because the more precise selective_tile script (since e7936db) was so exact, that it realized our bounding boxes was outside of the (gridline registered) groundtruth grids and gave us NULL values! This problem didn't surface itself until I was trying to train the neural network, and was strangely getting NaN values in the discriminator loss after just one epoch, no matter what hyperparameters I used.
5 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
bug 🪲
Something isn't working
data 🗃️
Pull requests that update input datasets
enhancement ✨
New feature or request
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Details matter, even if it's only half a pixel (especially if that pixel is ~250m in length)! This ambitious pull request deals with the mess that goes with converting between xarray's centre-based pixel coordinates and rasterio's corner-based pixel coordinates.
The problem sort of surfaced itself in #88, but I thought it was a bug introduced from rasterio 1.0.13 to 1.0.14 (which I have tried several times unsuccessfully to pin down and fix behind the scenes). Turns out that the unit test for data_prep.get_window_bounds code here was wrong all along! One of the bounds' output
maxy
(36.5) was greater than the y-range (1-33)!!Also notice that the extra 0.5 in 36.5 is actually not 'correct' for rasterio's corner-based pixel coordinate system. A
minx
of 4.5 andmaxx
of 36.5 means there is a half-pixel shift to the right of where we want our bounding box to lie. This extra 0.5 is because we are reading the netcdf raster using xarray, that uses a centre-based pixel coordinate system, and then passing the task off to rasterio...So the raster's origin pixel reference needs to be updated when we switch between xarray and rasterio, which can be done by setting the correct affine transformation. Ideally, xarray/netcdf grids would have some way of storing projection information, but since it's a complicated work-in-progress matter, we'll use salem in the meantime to do the xarray-to-rasterio coordinate shift.
Fixes #84 (and #88, #90, #98, #102, #117...).
Might help with resolving #58.
References: