Scale-MAE model #2057
Conversation
Also see changes in https://github.com/microsoft/torchgeo/pull/2052/files
Some minor renaming suggestions to make things more consistent with DOFA, and some major documentation improvement suggestions. I'm willing to help with both if needed.
Two thoughts:
- I wonder if we should move both of these to the new "Sensor-Agnostic" section because they technically work for (RGB-only) imagery from any sensor
- Since both of these have evaluation results on fMoW, can we add additional columns with those performance metrics (assuming they are comparable)? If we move them to "Sensor-Agnostic", we may need two tables: one for models evaluated on GEO-Bench and one for models evaluated on fMoW.
We may also want to add a short summary or table of which sensor-agnostic models provide which features. For example, DOFA enables explicit dynamic spectral band support (via model arch) and implicit dynamic resolution support (via training data), while Scale-MAE has no dynamic spectral band support (RGB-only) but explicit dynamic resolution support (via model arch). Not sure about GASSL, maybe only implicit dynamic resolution (via training data)? It's worth mentioning that neither has dynamic temporal resolution support (maybe Satlas does?). I'm planning on highlighting this in our release notes, so I can also write something up if needed. Something like:
"The following pre-trained models offer dynamic spatial (resolution), temporal (time span), and/or spectral (wavelength) support, either via their training data (implicit) or via their model architecture (explicit):"
| Model     | Spatial  | Temporal | Spectral |
|-----------|----------|----------|----------|
| DOFA      | implicit | -        | explicit |
| GASSL     | implicit | -        | -        |
| Scale-MAE | explicit | -        | -        |
We could also optionally specify the range of resolutions/time spans/wavelengths that each model was pre-trained on. I just want to give users more guidance on which model to choose.
Should we save this for a different PR?
I'm fine with that, just don't let me forget before the release.
I'll open a PR after this one so we don't forget to finish it.
Finally getting around to adding this.
Adds Scale-MAE model (ViT encoder only) and pretrained weights.
I've verified that this reproduces KNN performance at different resolutions for UCMerced, but I'll repeat the check for other datasets.
@RitwikGupta let me know if this looks good. I cleaned up some of the code a bit so it works out of the box with our trainers (this required setting the resolution when initializing the model instead of passing it dynamically, but I think it should still be fine).
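For reference, usage after this lands might look roughly like the sketch below. The factory name `scalemae_large_patch16`, the weights enum `ScaleMAELarge16_Weights.FMOW_RGB`, and the `res` keyword are assumptions based on the description above and on how DOFA is exposed, not confirmed API:

```python
# Minimal sketch (names are assumptions, not confirmed by this PR):
# build the Scale-MAE ViT-Large encoder with the resolution fixed at init time,
# so it drops into the existing trainers without any forward-pass API changes.
import torch

from torchgeo.models import ScaleMAELarge16_Weights, scalemae_large_patch16  # assumed names

model = scalemae_large_patch16(
    weights=ScaleMAELarge16_Weights.FMOW_RGB,  # assumed pretrained-weights enum member
    res=0.3,  # assumed: ground sample distance (m) of the input imagery, set once at init
)
model.eval()

# Scale-MAE is RGB-only; a batch of 224x224 crops at the resolution given above.
x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    out = model(x)  # encoder output used for e.g. KNN evaluation
```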
@calebrob6 lmk if you want to team up on this one.