Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERROR 1: Point outside of projection domain #893

Closed
FlorisCalkoen opened this issue Nov 9, 2022 · 5 comments
Closed

ERROR 1: Point outside of projection domain #893

FlorisCalkoen opened this issue Nov 9, 2022 · 5 comments
Labels
datasets Geospatial or benchmark datasets

Comments

@FlorisCalkoen
Copy link

FlorisCalkoen commented Nov 9, 2022

Description

I'm running also into ERROR 1: Point outside of projection domain when using creating a RasterDataset. The error occurs during a call torasterio.vrt.WarpedVRT.

Bit of context: I'm warping a collection of S2 tif files. For many I'm able to warp them without exception, but for some this exception occurs. See snippets in steps to reproduce to reproduce.

Unfortunately I'm not very familar with GDAL debugging. I hope the traceback below is sufficient to indicate where the issue might be coming from. I think it's related to rasterio._err.CPLE_AppDefinedError: Point outside of projection domain which points to /miniconda3/envs/torchgeo/lib/python3.10/site-packages/rasterio/_err.cpython-310-darwin.so on my machine?. While debugging I came accross this issue which might be related.

Tracking the warning I found that the message is raised in rasterio._warp which points on my machine I think to /Users/calkoen/miniconda3/envs/torchgeo/lib/python3.10/site-packages/rasterio/_warp.cpython-310-darwin.so:

The warning is raised here:

self._log(INFO, msg, args)  # where self is rasterio._warp
print(self)  # <Logger rasterio._warp (WARNING)> 
print(INFO)  # 20
print(msg) # Ignoring error: err=%r
print(args)  # (CPLE_AppDefinedError(3, 1, 'Point outside of projection domain'),)

The exception can be triggered by another exception:

Traceback (most recent call last):
  File "rasterio/_warp.pyx", line 768, in rasterio._warp.auto_create_warped_vrt
  File "rasterio/_err.pyx", line 178, in rasterio._err.exc_wrap
rasterio._err.CPLE_AppDefinedError: Point outside of projection domain

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name '_log' is not defined

Steps to reproduce

To download the data:

mkdir -p $HOME/tmp/coasttypes
cd $HOME/tmp/coasttypes
wget -O BOX_028_044_110_T18FWG_20200207T141731_B02_10m.tif https://s3.eu-central-1.amazonaws.com/floris.calkoen.open.data/coasttypes/BOX_028_044_110_T18FWG_20200207T141731_B02_10m.tif
wget -O BOX_029_371_99_T20FPH_20221002T135641_B02_10m.tif https://s3.eu-central-1.amazonaws.com/floris.calkoen.open.data/coasttypes/BOX_029_371_99_T20FPH_20221002T135641_B02_10m.tif
wget -O BOX_166_003_31_T34SEG_20220816T091559_B02_10m.tif https://s3.eu-central-1.amazonaws.com/floris.calkoen.open.data/coasttypes/BOX_166_003_31_T34SEG_20220816T091559_B02_10m.tif

To reproduce and visualize the plot using Python

import pathlib

import matplotlib.pyplot as plt
import rasterio  # rasterio.__version__=='1.3.3'
import rioxarray
import xarray as xr
from torchgeo.datasets import RasterDataset  # torchgeo.__version__=='0.4.0.dev0'

data_dir = pathlib.Path.home().joinpath("tmp", "coasttypes")


class Sentinel2(RasterDataset):
    filename_glob = "BOX_*_T*_B02_10m.tif"
    filename_regex = r"^(?P<transect>BOX_\d{3}_\d{3}_\d{1,3})_.{6}_(?P<date>\d{8}T\d{6})_(?P<band>B0[\d])"
    date_format = "%Y%m%dT%H%M%S"
    is_image = True
    separate_files = True
    all_bands = ["B02", "B03", "B04", "B08"]
    rgb_bands = ["B04", "B03", "B02"]


# when only file is present in the directory the warning doesn't happen.
Sentinel2(str(data_dir))

for fp in data_dir.iterdir():
    xr.open_dataset(
        fp,
        engine="rasterio",
    )["band_data"].squeeze().plot()
    plt.show()

Version

0.4.0.dev0

@adamjstewart adamjstewart added the datasets Geospatial or benchmark datasets label Nov 9, 2022
@adamjstewart
Copy link
Collaborator

I'm able to reproduce this issue, although the error message is different.

My first guess is that this is an issue with the way that our GeoDatasets currently require all data to be in a single CRS. Because all of your data is in UTM, and the UTM zones are so drastically different (18S, 20S, 34N), it may not be possible to warp an image from one of these UTM zones to another.

Unfortunately, #409 won't help with this. In order to build a single R-tree (which doesn't understand things like projection), all data must be stored in a single CRS. If we can't even project a bounding box from one CRS to another, it becomes impossible to create an R-tree using the CRS of one file.

The best workaround I can think of at the moment is to explicitly specify a CRS that covers the entire Earth:

from rasterio.crs import CRS

Sentinel2(str(data_dir), crs=CRS.from_epsg(4326))

This worked for me. I'll keep this in mind when working on #409, thanks for the bug report!

@FlorisCalkoen
Copy link
Author

FlorisCalkoen commented Nov 9, 2022

Great! I can confirm that also for me this workaround fixes the error. Thank you for your swift response.

@FlorisCalkoen
Copy link
Author

Small follow up.. do you know how this affects the samplers and dataset objects? Because now I have a RunTimeError

import pathlib

import matplotlib.pyplot as plt
import rasterio  # rasterio.__version__=='1.3.3'
import rioxarray
import xarray as xr
from rasterio import CRS
from torch.utils.data import DataLoader
from torchgeo.datasets import RasterDataset, stack_samples, unbind_samples  # torchgeo.__version__=='0.4.0.dev0'
from torchgeo.samplers import RandomGeoSampler  # torchgeo.__version__=='0.4.0.dev0'

data_dir = pathlib.Path.home().joinpath("tmp", "coasttypes")


class Sentinel2(RasterDataset):
    filename_glob = "BOX_*_T*_B02_10m.tif"
    filename_regex = r"^(?P<transect>BOX_\d{3}_\d{3}_\d{1,3})_.{6}_(?P<date>\d{8}T\d{6})_(?P<band>B0[\d])"
    date_format = "%Y%m%dT%H%M%S"
    is_image = True
    separate_files = True
    all_bands = ["B02", "B03", "B04", "B08"]
    rgb_bands = ["B04", "B03", "B02"]


# when only file is present in the directory the warning doesn't happen.
dataset = Sentinel2(str(data_dir), crs=CRS.from_epsg(4326))

sampler = RandomGeoSampler(dataset, size=50, length=100)
dataloader = DataLoader(dataset, sampler=sampler, collate_fn=stack_samples)
next(iter(dataloader))

Traceback:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In [2], line 30
     28 sampler = RandomGeoSampler(dataset, size=50, length=100)
     29 dataloader = DataLoader(dataset, sampler=sampler, collate_fn=stack_samples)
---> 30 next(iter(dataloader))

File ~/miniconda3/envs/torchgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py:628, in _BaseDataLoaderIter.__next__(self)
    625 if self._sampler_iter is None:
    626     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    627     self._reset()  # type: ignore[call-arg]
--> 628 data = self._next_data()
    629 self._num_yielded += 1
    630 if self._dataset_kind == _DatasetKind.Iterable and \
    631         self._IterableDataset_len_called is not None and \
    632         self._num_yielded > self._IterableDataset_len_called:

File ~/miniconda3/envs/torchgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py:670, in _SingleProcessDataLoaderIter._next_data(self)
    669 def _next_data(self):
--> 670     index = self._next_index()  # may raise StopIteration
    671     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    672     if self._pin_memory:

File ~/miniconda3/envs/torchgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py:618, in _BaseDataLoaderIter._next_index(self)
    617 def _next_index(self):
--> 618     return next(self._sampler_iter)

File ~/miniconda3/envs/torchgeo/lib/python3.10/site-packages/torch/utils/data/sampler.py:254, in BatchSampler.__iter__(self)
    252 batch = [0] * self.batch_size
    253 idx_in_batch = 0
--> 254 for idx in self.sampler:
    255     batch[idx_in_batch] = idx
    256     idx_in_batch += 1

File ~/miniconda3/envs/torchgeo/lib/python3.10/site-packages/torchgeo/samplers/single.py:144, in RandomGeoSampler.__iter__(self)
    137 """Return the index of a dataset.
    138 
    139 Returns:
    140     (minx, maxx, miny, maxy, mint, maxt) coordinates to index a dataset
    141 """
    142 for _ in range(len(self)):
    143     # Choose a random tile, weighted by area
--> 144     idx = torch.multinomial(self.areas, 1)
    145     hit = self.hits[idx]
    146     bounds = BoundingBox(*hit.bounds)

RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

@FlorisCalkoen
Copy link
Author

Following up on the RuntimeError above. This is caused by setting crs=CRS.from_epsg(4326) because the sampler requires the property RandomGeoSampler.areas, which cannot be calculated for a coordinate system (EPSG:4326), but can be calculated for a projection (EPSG:3857). So crs=CRS.from_epsg(3857) fixes it.

@calebrob6
Copy link
Member

Thanks for following up @FlorisCalkoen. I'm going to close this, but feel free to reach out if anything else comes up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasets Geospatial or benchmark datasets
Projects
None yet
Development

No branches or pull requests

3 participants