Add eurocrops data module. #1869
Hi @favyen2, could you use the new splitting function
Let's rename everything from eurocrops_sentinel2 to sentinel2_eurocrops
I can't get this to work without some changes from #1889, such as setting the Sentinel-2 test data size to 128 instead of 36, so I will wait for that to be merged first, which will make this easier.
#1889 has now been merged. Feel free to rebase and copy and paste whatever you want from that data module.
…ocrops_datamodule
It is based on NAIPChesapeakeDataModule, which splits the dataset's bounding box into 1/2 train, 1/4 val, and 1/4 test.
This may not be the best way to train an actual model. I think it is more natural either to split by country, or to randomly assign each large grid cell (e.g. 4096x4096 pixels) to train/val/test and then sample within those grid cells. But I wasn't sure how to split by country, since VectorDataset automatically detects all the files, or how to assign large grid cells, since there is no sampler that can take multiple large bounding boxes and sample patches within them.
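To make the grid-cell idea concrete, here is a minimal standalone sketch of what I have in mind. It does not use any TorchGeo API: `Box`, `assign_cells`, and `sample_patch` are hypothetical names made up for illustration, and the 1/2, 1/4, 1/4 fractions and 4096 cell size are just the numbers mentioned above.

```python
import random
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box in pixel (or CRS) units. Hypothetical helper type."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def assign_cells(bounds, cell_size, fractions=(0.5, 0.25, 0.25), seed=0):
    """Tile `bounds` into square cells and randomly assign each to a split."""
    cells = []
    y = bounds.miny
    while y < bounds.maxy:
        x = bounds.minx
        while x < bounds.maxx:
            # Clip cells on the far edges so they stay inside the bounds.
            cells.append(Box(x, y,
                             min(x + cell_size, bounds.maxx),
                             min(y + cell_size, bounds.maxy)))
            x += cell_size
        y += cell_size
    random.Random(seed).shuffle(cells)
    n_train = int(len(cells) * fractions[0])
    n_val = int(len(cells) * fractions[1])
    return {
        "train": cells[:n_train],
        "val": cells[n_train:n_train + n_val],
        "test": cells[n_train + n_val:],  # whatever remains
    }


def sample_patch(cells, patch_size, rng):
    """Pick a random cell, then a random patch lying fully inside it."""
    # Skip clipped edge cells that are too small to hold a patch.
    big_enough = [c for c in cells
                  if c.maxx - c.minx >= patch_size
                  and c.maxy - c.miny >= patch_size]
    cell = rng.choice(big_enough)
    x = rng.uniform(cell.minx, cell.maxx - patch_size)
    y = rng.uniform(cell.miny, cell.maxy - patch_size)
    return Box(x, y, x + patch_size, y + patch_size)


# A 16384x16384 extent tiled into 4096x4096 cells gives 16 cells:
# 8 train, 4 val, 4 test.
splits = assign_cells(Box(0, 0, 16384, 16384), cell_size=4096)
patch = sample_patch(splits["train"], patch_size=256, rng=random.Random(1))
```

Because every patch is drawn from inside a single cell, train, val, and test patches never overlap spatially. A real implementation would still need a sampler that accepts a list of boxes like this, which is the missing piece described above.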