Add CaBuAr dataset #2235

DarthReca · 2024-08-19T10:38:10Z

This PR extends the ChaBuD dataset introduced in #1259.

It is based on data presented in both "CaBuAr: California Burned Areas dataset for delineation" and the ChaBuD challenge. The train and validation splits are taken from CaBuAr, while the test split is from ChaBuD.

The files are hosted on HuggingFace.

These are some samples taken from the dataset:

DarthReca · 2024-08-19T10:39:22Z

@microsoft-github-policy-service agree

adamjstewart · 2024-08-19T12:49:26Z

Missing tests for the new datamodule. Usually we put these in the trainer tests, but we don't yet have a trainer for change detection (feel free to work on this if you want). For now, you'll have to create a new tests/datamodules test.

DarthReca · 2024-08-19T13:56:49Z

I have added the datamodule test. Patch coverage is OK, but project coverage is decreasing for some reason.

Ruff mentions some issues with the cabuar dataset file, but no changes were made, and the last run was ok.

For the datasets test, should I add another importorskip to the datamodule test?

adamjstewart · 2024-08-19T14:57:44Z

- ruff: I just merged #2218 (Ruff: enable ruff-specific rules), which adds extra checks for certain coding patterns to avoid. You can rebase/merge main to get the tests to fail locally. I can comment directly on the lines with the suggested changes if that helps.
- datasets: Copy the pytest.importorskip to your datamodules tests as well.
- coverage: No idea what's going on here, probably a random bug. If you still see the issue after you push a couple more commits, I'll investigate.
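For readers unfamiliar with the `pytest.importorskip` pattern mentioned above, here is a minimal sketch. The module name is a placeholder (a standard-library module, so the snippet runs anywhere); the thread does not name the optional dependency, so substitute the real one. The test function name is also hypothetical.

```python
import pytest

# Placeholder only: the thread does not say which optional package is needed.
# A standard-library module is used here so the sketch runs without extras.
OPTIONAL_DEPENDENCY = "json"


def test_optional_dependency_available() -> None:
    # importorskip returns the imported module, or skips the test
    # (instead of failing it) when the module cannot be imported.
    module = pytest.importorskip(OPTIONAL_DEPENDENCY)
    assert hasattr(module, "loads")
```

The same call can also be placed at module level in a test file, in which case every test in that file is skipped when the import fails.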

torchgeo/datasets/cabuar.py

DarthReca · 2024-08-19T15:49:25Z

Everything fine. Do you suggest keeping both implementations (ChaBuD and CaBuAr)?

adamjstewart · 2024-08-19T15:54:35Z

I see no reason not to keep both. Even if the newer dataset is more useful, someone might want to reproduce the results on the previous dataset for comparison against older papers.

docs/api/datasets.rst

docs/api/non_geo_datasets.csv

tests/data/cabuar/data.py

tests/datamodules/test_cabuar.py

tests/datasets/test_cabuar.py

torchgeo/datamodules/cabuar.py

torchgeo/datasets/cabuar.py

adamjstewart · 2024-08-21T15:13:03Z

torchgeo/datasets/cabuar.py

+
+    `CaBuAr <https://huggingface.co/datasets/DarthReca/california_burned_areas>`__
+    is a dataset for Change detection for Burned area Delineation and part of
+    the splits are used for the ChaBuD ECML-PKDD 2023 Discovery Challenge.


Depending on how much code is the same, it may be possible to either subclass ChaBuD:

    from .chabud import ChaBuD

    class CaBuAr(ChaBuD): ...

or create a shared abstract base class. We could either have both datasets in a single file or a different dataset in each file. It's borderline since it's more like a ChaBuD v2, but it has a unique name.
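To make the two options concrete, here is a rough, self-contained sketch. All class names, the `splits` attribute, the split values, and the `describe` method are hypothetical illustrations, not the actual torchgeo classes or API; real datasets would share loading and plotting logic rather than a string description.

```python
from abc import ABC, abstractmethod


class BurnedAreaDataset(ABC):
    """Hypothetical shared base class; not part of the actual library."""

    # Subclasses declare which splits they accept (illustrative values only).
    splits: tuple[str, ...] = ()

    def __init__(self, split: str) -> None:
        if split not in self.splits:
            raise ValueError(f"split must be one of {self.splits}, got {split!r}")
        self.split = split

    @abstractmethod
    def describe(self) -> str:
        """Return a short human-readable description of the dataset."""


class ChaBuDSketch(BurnedAreaDataset):
    """Option 2: a concrete dataset built on the shared abstract base."""

    splits = ("train", "val")

    def describe(self) -> str:
        return f"ChaBuD ({self.split})"


class CaBuArSketch(ChaBuDSketch):
    """Option 1: subclass the older dataset and override only what differs."""

    splits = ("train", "val", "test")

    def describe(self) -> str:
        return f"CaBuAr ({self.split})"
```

Either layout keeps validation and common logic in one place; which one fits better depends on how much code the two datasets really share.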

DarthReca · 2024-08-27

Originally the data was proposed in CaBuAr; we then extended the dataset for the challenge. However, the challenge page no longer works.

DarthReca · 2024-08-27

I can probably do the reverse if that seems appropriate: make ChaBuD an extension of CaBuAr, since the latter already deals with more files and subsets.

adamjstewart · 2024-08-27T13:37:13Z

@DarthReca do you have time to address these change requests? Trying to finalize the 0.6.0 release this week.

DarthReca · 2024-08-27T14:09:29Z

I will start working today to resolve any issues.

adamjstewart

Everything looks fine to me now. Can you add an example figure created by the plot method to the PR description? This will help validate that the plot method works correctly. Feel free to make one dataset a subclass of the other if you want, but I'm also fine keeping things the way they are. Would like to merge this by the end of the day so we can prepare the release.

DarthReca · 2024-08-28T13:32:48Z

I have added some plots. If this is fine for now, everything should be ready.

DarthReca added 2 commits August 17, 2024 19:24

🆕 Added CaBuAr dataset

4ddbf94

🆕 Added CaBuAr datamodule

82f1c35

github-actions bot added the documentation, datasets, testing, and datamodules labels Aug 19, 2024

🔨 Added CaBuAr datamodule test

182ebd9

adamjstewart reviewed Aug 19, 2024

torchgeo/datasets/cabuar.py (outdated, resolved)

adamjstewart reviewed Aug 19, 2024

torchgeo/datasets/cabuar.py (outdated, resolved)

adamjstewart reviewed Aug 19, 2024

torchgeo/datasets/cabuar.py (outdated, resolved)

DarthReca added 2 commits August 19, 2024 17:15

Merge branch 'main' into main

d932fc3

🔨 Corrected CaBuAr typing and datamodule test

438b318

DarthReca marked this pull request as ready for review August 19, 2024 15:46

DarthReca marked this pull request as draft August 19, 2024 15:47

DarthReca marked this pull request as ready for review August 19, 2024 15:59

adamjstewart requested changes Aug 21, 2024

adamjstewart added this to the 0.6.0 milestone Aug 21, 2024

DarthReca added 2 commits August 27, 2024 14:56

🔨 updated test, corrected docs, minor fixes to dataset and datamodule

a3d9d23

🔨 CaBuAr test fixes

7303c48

adamjstewart approved these changes Aug 28, 2024

adamjstewart merged commit ccc314c into microsoft:main Aug 28, 2024
19 checks passed
