Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding OSCD dataset #233

Merged
merged 77 commits into from
Nov 19, 2021
Merged

Adding OSCD dataset #233

merged 77 commits into from
Nov 19, 2021

Conversation

iejMac
Copy link
Contributor

@iejMac iejMac commented Nov 10, 2021

No description provided.

@ghost
Copy link

ghost commented Nov 10, 2021

CLA assistant check
All CLA requirements met.

@adamjstewart adamjstewart added the datasets Geospatial or benchmark datasets label Nov 10, 2021
Copy link
Collaborator

@adamjstewart adamjstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @iejMac, thanks for the contribution!

Can you add docs for this dataset in docs/api/datasets.rst? Let me know if you have any questions about any of the remaining TODOs in the file.

test_oscd.py Outdated Show resolved Hide resolved
torchgeo/datasets/oscd.py Outdated Show resolved Hide resolved
torchgeo/datasets/oscd.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@adamjstewart adamjstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the style tests are failing. Most of these can be solved by simply running black . and isort ., but see https://torchgeo.readthedocs.io/en/latest/user/contributing.html#linters for a full set of instructions on this.

tests/datasets/test_oscd.py Show resolved Hide resolved
@iejMac iejMac marked this pull request as draft November 12, 2021 00:57
@iejMac
Copy link
Contributor Author

iejMac commented Nov 19, 2021

I've slimmed down the dummy test dataset to only files that are used to create a dataset of length 1.
Should I add the OSCDDataModule in this PR or maybe this should be in a separate "OSCD Benchmark" PR after this one is merged? It seems it might be easier to implement it in that context

@iejMac iejMac marked this pull request as ready for review November 19, 2021 09:39
@isaaccorley
Copy link
Collaborator

Yeah a separate PR should be fine.

adamjstewart
adamjstewart previously approved these changes Nov 19, 2021
@adamjstewart
Copy link
Collaborator

This looks good to me, just need to fix the mypy/pytest issues.

@calebrob6 calebrob6 merged commit 0b8a846 into microsoft:main Nov 19, 2021
@adamjstewart adamjstewart added this to the 0.2.0 milestone Nov 20, 2021
),
}

md5 = "7383412da7ece1dca1c12dc92ac77f09"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dataset downloads 3 different zip files but uses the same MD5 for all of them. This is a bug, right? Going to try to fix this in #329 since I need to regenerate the data anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm really curious about is the fact that the tests still pass even with the wrong checksum. Even if I change the zip file entirely it still passes the tests.

@adamjstewart adamjstewart added utilities Utilities for working with geospatial data and removed utilities Utilities for working with geospatial data labels Jan 2, 2022
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
* OSCD: initial template

* updating download pattern

* package: including OSCD in torchgeo.datasets

* download: adapting download method to OSCD dataset + adding simple test for debugging

* _load_files method: temporary implementation

* OCSD: minimum working example, needs plenty improvement

* adding OSCD to docs

* Moving test to appropriate location

* OSCD: remove sort_bands and use utils.sort_sentinel2_bands

* Using rasterio instead of tifffile

* remove useless import

* style changes

* fix: style

* Developing tests for OSCD dataset

* Updating dataset description

* change name

* style fixes

* fixing mypy errors

* style fixes

* cast to string to fix typing errors

* style change

* isort fix

* remove TODO

* adding dataset for testing

* change len

* check if sum is concatdataset

* isort fix

* fixing some issues + correct md5 in dataset

* closing rasterio file handles

* removing some TODO's

* transitioning to fake data

* mypy fix attempt

* set fake data md5

* flake8 fix

* starting plot method

* updating plot method

* no predictions for now

* fixing style errors

* add testing for plot

* making some changes to fake testing data

* full coverage

* Use RGB channels in the plot function

* adding shape tests in test_getitem

* remove features and add to description

* fixing some things

* transitioning to authors dataset link

* No need to change file names + adapt test dataset

* adapting tests to new data format

* Update docs/api/datasets.rst

Co-authored-by: Adam J. Stewart <[email protected]>

* closing plot at end of terst

* add versionadded

* style fixes + indentation fixes

* style fixes

* forgot the .zip

* fix zipfile name

* temporary fix for flake8

* Add link to docs

* forgot to adjust this

* changing flake8 solve

* slimming down the test dataset

* removing imgs_x files which aren't needed for current testing but might be in the future

* Revert "removing imgs_x files which aren't needed for current testing but might be in the future"

This reverts commit cfbf26c.

* nevermind, this was the issue

* trying to remove these once again

* adding band choosing functionality

* removing double code

* removing more double code

* flake8 fix

* adding one more training sample to dummy dataset and testing split

* typing numpy array

* back to this

* Fixing tests and mypy

Co-authored-by: Caleb Robinson <[email protected]>
Co-authored-by: Adam J. Stewart <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasets Geospatial or benchmark datasets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants