
CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation

Ziyang Gong1 ∗, Zhixiang Wei2 ∗, Di Wang3 ∗, Xianzheng Ma3, Hongruixuan Chen4, Yuru Jia5,6, Yupeng Deng1, Zhenming Ji1 †, Xiangwei Zhu1 †, Naoto Yokoya4, Jing Zhang3, Bo Du3, Liangpei Zhang3

1 Sun Yat-sen University, 2 University of Science and Technology of China, 3 Wuhan University,

4 The University of Tokyo, 5 KU Leuven, 6 KTH Royal Institute of Technology

∗ Equal contribution, † Corresponding author



🔥🔥🔥 News

  • [2024/11/14] We have released the benchmark collection. Click here for illustrations of the benchmarks!

  • [2024/11/06] Most of the checkpoints have been uploaded; you can access them via the Hugging Face badges.

  • For environment setup and inference steps, please refer to the installation instructions below. The inference code and weights are coming soon.

  • The benchmark collection from the paper is being released; you can access it here.

  • 🎉🎉🎉 CrossEarth is the first VFM for Remote Sensing Domain Generalization (RSDG) semantic segmentation. We have just released the arXiv paper of CrossEarth; you can access it here.

📑 Table of Contents

Visualization

In the radar figure:

  • CrossEarth achieves state-of-the-art (SOTA) performance on 23 evaluation benchmarks spanning diverse segmentation scenes, demonstrating strong generalizability.

In the UMAP figures:

  • CrossEarth extracts features that cluster closely for the same class across different domains, forming well-defined groups in feature space and demonstrating its ability to learn robust, domain-invariant features.

  • Moreover, CrossEarth features exhibit high inter-class separability, forming unique clusters for each class and underscoring the model's strong representational ability to distinguish different categories.

Environment Requirements:

# Create and activate the conda environment
conda create -n CrossEarth -y
conda activate CrossEarth
# PyTorch 2.0.1 built against CUDA 11.7
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia -y
# OpenMMLab stack (mmengine, mmcv, mmsegmentation, mmdetection)
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0"
# Memory-efficient attention
pip install "xformers==0.0.20"
# Remaining project dependencies
pip install -r requirements.txt
pip install future tensorboard
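
Once installation finishes, a quick import check (an optional sanity check, not part of the repo's official setup) can confirm that the core packages resolve and the GPU is visible:

# Optional sanity check: confirm the core packages import and CUDA is visible.
import torch
import mmcv
import mmengine
import mmseg

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmcv:", mmcv.__version__)
print("mmengine:", mmengine.__version__)
print("mmsegmentation:", mmseg.__version__)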

Inference steps:

First, download the model weights from Hugging Face or Baidu Netdisk via the badges above. Note that the dinov2_converted.pth and dinov2_converted_1024x1024.pth checkpoints are required for inference; please download them and place them in the CrossEarth/checkpoints folder.
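
If you prefer to script the Hugging Face download, a sketch along the following lines should work with the huggingface_hub package; the repo_id below is a placeholder rather than the project's actual repository name, so substitute the repository linked in the badges:

# Sketch only: fetch the required backbone checkpoints into ./checkpoints.
# The repo_id is a placeholder -- replace it with the repository from the badges.
from huggingface_hub import hf_hub_download

for fname in ("dinov2_converted.pth", "dinov2_converted_1024x1024.pth"):
    hf_hub_download(
        repo_id="<hf-user>/CrossEarth",  # placeholder repo id
        filename=fname,
        local_dir="checkpoints",         # i.e. CrossEarth/checkpoints
    )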

Second, update the file paths in the experiment config files (configs/_base_/datasets/xxx.py and configs/CrossEarth_dinov2/xxx.py), then run the following command to perform inference (taking 512x512 inference as an example):

python tools/test.py configs/CrossEarth_dinov2/CrossEarth_dinov2_mask2former_512x512_bs1x4.py ./checkpoints/xxx.pth

Note that the save path for pseudo labels is set in the experiment config file. When testing CrossEarth on different benchmarks, you also need to change the number of classes in the CrossEarth_dinov2_mask2former.py file.
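
For orientation, the snippet below illustrates the kind of fields these edits touch, using standard mmsegmentation config conventions; the exact keys and values in CrossEarth's configs may differ, so treat it as a sketch rather than the project's actual configuration:

# Sketch of typical mmsegmentation config fields (exact keys may differ).

# configs/_base_/datasets/xxx.py: point data_root at your local dataset.
data_root = '/path/to/your/dataset'
train_dataloader = dict(
    dataset=dict(
        data_root=data_root,
        data_prefix=dict(img_path='img_dir/train',
                         seg_map_path='ann_dir/train')))

# CrossEarth_dinov2_mask2former.py: match num_classes to the benchmark.
model = dict(
    decode_head=dict(
        num_classes=6))  # e.g. 6 classes for ISPRS Potsdam/Vaihingen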

Training steps:

Coming soon.

Model Weights and Configs

| Dataset | Benchmark | Model | Config | Log |
| --- | --- | --- | --- | --- |
| ISPRS Potsdam and Vaihingen | P(i)2V | Potsdam(i)-source.pth | dg_potIRRG2RGB_512x512.py | - |
| - | P(i)2P(r) | Potsdam(i)-source.pth | dg_potIRRG2RGB_512x512.py | - |
| - | P(r)2P(i) | Potsdam(r)-source.pth | dg_potRGB2IRRG_512x512.py | - |
| - | P(r)2V | Potsdam(r)-source.pth | dg_potRGB2IRRG_512x512.py | - |
| - | V2P(i) | Vaihingen-source.pth | dg_vai2potIRRG_512x512.py | - |
| - | V2P(r) | Vaihingen-source.pth | dg_vai2potIRRG_512x512.py | - |
| LoveDA | Urban2Rural | Coming Soon | dg_loveda_rural2urban_1024x1024.py | Coming Soon |
| - | Rural2Urban | Coming Soon | dg_loveda_urban2rural_1024x1024.py | Coming Soon |
| WHU Building | A2S | WHU-Building-A2S.pth | dg_building_aerial2satellite.py | Coming Soon |
| - | S2A | WHU-Building-S2A.pth | dg_building_satellite2aerial.py | Coming Soon |
| DeepGlobe and Massachusetts | D2M | D2M-Rein/D2M-MTP | dg_deepglobe2massachusetts_1024x1024.py | Coming Soon |
| ISPRS Potsdam and RescueNet | P(r)2Res | Potsdam(r)2RescueNet.pth | dg_potsdamIRRG2rescue_512x512.py | Coming Soon |
| - | P(i)2Res | Potsdam(i)2RescueNet.pth | dg_potsdamRGB2rescue_512x512.py | Coming Soon |
| CASID | Sub2Sub | casid-sub-source.pth | dg_casid_subms_1024x1024.py | Coming Soon |
| - | Sub2Tem | casid-sub-source.pth | dg_casid_subms_1024x1024.py | - |
| - | Sub2Tms | casid-sub-source.pth | dg_casid_subms_1024x1024.py | - |
| - | Sub2Trf | casid-sub-source.pth | dg_casid_subms_1024x1024.py | - |
| - | Tem2Sub | casid-tem-source.pth | dg_casid_temms_1024x1024.py | Coming Soon |
| - | Tem2Tem | casid-tem-source.pth | dg_casid_temms_1024x1024.py | - |
| - | Tem2Tms | casid-tem-source.pth | dg_casid_temms_1024x1024.py | - |
| - | Tem2Trf | casid-tem-source.pth | dg_casid_temms_1024x1024.py | - |
| - | Tms2Sub | casid-tms-source.pth | dg_casid_troms_1024x1024.py | Coming Soon |
| - | Tms2Tem | casid-tms-source.pth | dg_casid_troms_1024x1024.py | - |
| - | Tms2Trf | casid-tms-source.pth | dg_casid_troms_1024x1024.py | - |
| - | Trf2Sub | casid-trf-source.pth | dg_casid_trorf_1024x1024.py | Coming Soon |
| - | Trf2Tem | casid-trf-source.pth | dg_casid_trorf_1024x1024.py | - |
| - | Trf2Tms | casid-trf-source.pth | dg_casid_trorf_1024x1024.py | - |
| - | Trf2Trf | casid-trf-source.pth | dg_casid_trorf_1024x1024.py | - |

Citation

If you find CrossEarth helpful, please consider giving this repo a ⭐ and citing:

@article{crossearth,
  title={CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation},
  author={Gong, Ziyang and Wei, Zhixiang and Wang, Di and Ma, Xianzheng and Chen, Hongruixuan and Jia, Yuru and Deng, Yupeng and Ji, Zhenming and Zhu, Xiangwei and Yokoya, Naoto and Zhang, Jing and Du, Bo and Zhang, Liangpei},
  journal={arXiv preprint arXiv:2410.22629},
  year={2024}
}

Other Related Works
