-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add RasterIO backend #1260
Add RasterIO backend #1260
Changes from 25 commits
1643ce5
067dedb
7906cfd
2e1b528
532c5d3
8bc3da3
77cc0ca
776cbd9
eb739de
3a394ae
061b8fd
c0962fa
7275ffa
51a60af
5155634
09196ee
e2b6786
228a5a3
4b57d1c
2d21b4b
c2cb927
f86507e
e1a5b31
7cb8baf
48c7268
70bd03a
4d49195
3f18144
3e3e6fb
1ae2d9b
955f6b9
7d8fe4d
223ce0c
6cf2ce9
2cd0386
9193a2b
4bf4b6a
c778948
4299957
fcdd894
1ca6e38
d5c964e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,6 +15,7 @@ dependencies: | |
- scipy | ||
- seaborn | ||
- toolz | ||
- rasterio | ||
- pip: | ||
- coveralls | ||
- pytest-cov |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,6 +15,7 @@ dependencies: | |
- scipy | ||
- seaborn | ||
- toolz | ||
- rasterio | ||
- pip: | ||
- coveralls | ||
- pytest-cov |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -318,6 +318,33 @@ def maybe_decode_store(store, lock=False): | |
return maybe_decode_store(store) | ||
|
||
|
||
def open_rasterio(filename, add_latlon=False): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. would we prefer for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Now that I've played around a little bit, I think keeping the default to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should the crs argument be passed along to
or
|
||
"""Open a file with RasterIO (experimental). | ||
|
||
This should work with any file that rasterio can open (typically: | ||
geoTIFF). The x and y coordinates are generated automatically out of the | ||
file's geoinformation. | ||
|
||
Parameters | ||
---------- | ||
filename : str | ||
path to the file to open | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here and below: please capitalize first letter and include period at end. |
||
add_latlon : bool, optional | ||
if the file ships with valid geoinformation, longitudes and latitudes | ||
can be computed and added to the dataset as non-dimension coordinates | ||
|
||
Returns | ||
------- | ||
dataset : Dataset | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why return a |
||
The newly created dataset. | ||
""" | ||
ds = conventions.decode_cf(backends.RasterioDataStore(filename)) | ||
if add_latlon: | ||
from ..core.utils import add_latlon_coords_from_crs | ||
ds = add_latlon_coords_from_crs(ds) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we detect an empty CRS here? Or are we happy with the current behavior (shown below)
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure if the CRSError is as clear as we might like. This could be caught and rethrown with a slightly more informative message. |
||
return ds | ||
|
||
|
||
def open_dataarray(*args, **kwargs): | ||
"""Open an DataArray from a netCDF file containing a single data variable. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,131 @@ | ||
import numpy as np | ||
|
||
try: | ||
import rasterio | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Put this import inside the function where it's used below. That way you get a sensible error message if/when rasterio is not installed. |
||
except ImportError: | ||
rasterio = False | ||
|
||
from .. import Variable | ||
from ..core.utils import FrozenOrderedDict, Frozen, NDArrayMixin, is_scalar | ||
from ..core import indexing | ||
from ..core.pycompat import OrderedDict, suppress | ||
|
||
from .common import AbstractDataStore | ||
|
||
_rio_varname = 'raster' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Use all capitals for module level constants, e.g., |
||
|
||
_error_mess = ('The kind of indexing operation you are trying to do is not ' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is pedantic, but most abbreviations I see for message is |
||
'valid on RasterIO files. Try to load your data with ds.load()' | ||
'first.') | ||
|
||
|
||
class RasterioArrayWrapper(NDArrayMixin): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You need to set an In this case, I would just use the base classes |
||
def __init__(self, ds): | ||
self.ds = ds | ||
self._shape = self.ds.count, self.ds.height, self.ds.width | ||
self._ndims = len(self.shape) | ||
|
||
@property | ||
def dtype(self): | ||
return np.dtype(self.ds.dtypes[0]) | ||
|
||
@property | ||
def shape(self): | ||
return self._shape | ||
|
||
def __getitem__(self, key): | ||
|
||
# make our job a bit easier | ||
key = indexing.canonicalize_indexer(key, self._ndims) | ||
|
||
# bands cannot be windowed but they can be listed | ||
bands, n = key[0], self.shape[0] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Put this on two lines. Use more descriptive variable names, e.g., |
||
if isinstance(bands, slice): | ||
start = bands.start if bands.start is not None else 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Use |
||
stop = bands.stop if bands.stop is not None else n | ||
if bands.step is not None and bands.step != 1: | ||
raise IndexError(_error_mess) | ||
bands = np.arange(start, stop) | ||
# be sure we give out a list | ||
bands = (np.asarray(bands) + 1).tolist() | ||
|
||
# but other dims can only be windowed | ||
window = [] | ||
squeeze_axis = [] | ||
for i, (k, n) in enumerate(zip(key[1:], self.shape[1:])): | ||
if isinstance(k, slice): | ||
start = k.start if k.start is not None else 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. again use |
||
stop = k.stop if k.stop is not None else n | ||
if k.step is not None and k.step != 1: | ||
raise IndexError(_error_mess) | ||
else: | ||
if is_scalar(k): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you can combine this with the previous line into |
||
# windowed operations will always return an array | ||
# we will have to squeeze it later | ||
squeeze_axis.append(i+1) | ||
k = np.asarray(k).flatten() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For clarity, I would make the scalar case separate and not use |
||
start = k[0] | ||
stop = k[-1] + 1 | ||
if (stop - start) != len(k): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is not guaranteed that |
||
raise IndexError(_error_mess) | ||
window.append((start, stop)) | ||
|
||
out = self.ds.read(bands, window=window) | ||
if squeeze_axis: | ||
out = np.squeeze(out, axis=squeeze_axis) | ||
return out | ||
|
||
|
||
class RasterioDataStore(AbstractDataStore): | ||
"""Store for accessing datasets via Rasterio | ||
""" | ||
def __init__(self, filename, mode='r'): | ||
|
||
with rasterio.Env(): | ||
self.ds = rasterio.open(filename, mode=mode) | ||
|
||
# Get coords | ||
nx, ny = self.ds.width, self.ds.height | ||
dx, dy = self.ds.res[0], -self.ds.res[1] | ||
x0 = self.ds.bounds.right if dx < 0 else self.ds.bounds.left | ||
y0 = self.ds.bounds.top if dy < 0 else self.ds.bounds.bottom | ||
x = np.linspace(start=x0, num=nx, stop=(x0 + (nx - 1) * dx)) | ||
y = np.linspace(start=y0, num=ny, stop=(y0 + (ny - 1) * dy)) | ||
|
||
self._vars = OrderedDict() | ||
self._vars['y'] = Variable(('y',), y) | ||
self._vars['x'] = Variable(('x',), x) | ||
|
||
# Get dims | ||
if self.ds.count >= 1: | ||
self.dims = ('band', 'y', 'x') | ||
self._vars['band'] = Variable(('band',), | ||
np.atleast_1d(self.ds.indexes)) | ||
else: | ||
raise ValueError('Unknown dims') | ||
|
||
self._attrs = OrderedDict() | ||
with suppress(AttributeError): | ||
for attr_name in ['crs']: | ||
self._attrs[attr_name] = getattr(self.ds, attr_name) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we document the type of the "crs" attribute? |
||
|
||
# Get data | ||
self._vars[_rio_varname] = self.open_store_variable(_rio_varname) | ||
|
||
def open_store_variable(self, var): | ||
if var != _rio_varname: | ||
raise ValueError('Rasterio variables are named %s' % _rio_varname) | ||
data = indexing.LazilyIndexedArray(RasterioArrayWrapper(self.ds)) | ||
return Variable(self.dims, data, self._attrs) | ||
|
||
def get_variables(self): | ||
return FrozenOrderedDict(self._vars) | ||
|
||
def get_attrs(self): | ||
return Frozen(self._attrs) | ||
|
||
def get_dimensions(self): | ||
return Frozen(self.dims) | ||
|
||
def close(self): | ||
self.ds.close() |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -489,3 +489,46 @@ def ensure_us_time_resolution(val): | |
elif np.issubdtype(val.dtype, np.timedelta64): | ||
val = val.astype('timedelta64[us]') | ||
return val | ||
|
||
|
||
def add_latlon_coords_from_crs(ds, crs=None): | ||
"""Computes the longitudes and latitudes out of the x and y coordinates | ||
of a dataset and a coordinate reference system (crs). If crs isn't provided | ||
it will look for "crs" in the dataset's attributes. | ||
|
||
Needs rasterIO to be installe. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. installed |
||
|
||
Note that this function could be generalized to other coordinates or | ||
for all datasets with a valid crs. | ||
""" | ||
|
||
from .. import DataArray | ||
|
||
try: | ||
from rasterio.warp import transform | ||
except ImportError: | ||
raise ImportError('add_latlon_coords_from_crs needs RasterIO.') | ||
|
||
if crs is None: | ||
if 'crs' in ds.attrs: | ||
crs = ds.attrs['crs'] | ||
else: | ||
raise ValueError('crs not found') | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. could check for an empty CRS here (see prior comment) |
||
ny, nx = len(ds['y']), len(ds['x']) | ||
x, y = np.meshgrid(ds['x'], ds['y']) | ||
# Rasterio works with 1D arrays | ||
xc, yc = transform(crs, {'init': 'EPSG:4326'}, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. silly question, but is the dst transform always There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Or would we want to pass in the |
||
x.flatten(), y.flatten()) | ||
xc = np.asarray(xc).reshape((ny, nx)) | ||
yc = np.asarray(yc).reshape((ny, nx)) | ||
dims = ('y', 'x') | ||
ds['lon'] = DataArray(xc, dims=dims, | ||
attrs={'units': 'degrees_east', | ||
'long_name': 'longitude', | ||
'standard_name': 'longitude'}) | ||
ds['lat'] = DataArray(yc, dims=dims, | ||
attrs={'units': 'degrees_north', | ||
'long_name': 'latitude', | ||
'standard_name': 'latitude'}) | ||
return ds.set_coords(['lon', 'lat']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how picky reviews normally are here, but this could be swapped with
save_mfdataset
to maintain alphabetical order (pep8)