
UNIP

This repository contains the official PyTorch implementation of the ICLR 2025 paper "UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation".

🎉 News

  • 2025.02.19 🔥 The pre-trained models are released!
  • 2025.02.05 🔥 The source code is publicly available!
  • 2025.01.23 🎉 Congratulations! Our paper has been accepted by ICLR 2025.

📖 Introduction

(Figure: overall UNIP architecture)

In this work, we first benchmark the infrared semantic segmentation performance of various pre-training methods and reveal several phenomena distinct from the RGB domain. A layerwise analysis of pre-trained attention maps then uncovers that: (1) there are three typical attention patterns (local, hybrid, and global); (2) the pre-training task notably influences the distribution of these patterns across layers; (3) the hybrid pattern is crucial for semantic segmentation, as it attends to both nearby and foreground elements; (4) texture bias impedes model generalization in infrared tasks. Building on these insights, we propose UNIP, a UNified Infrared Pre-training framework, to enhance pre-trained model performance. The framework uses hybrid-attention distillation (NMI-HAD) as the pre-training target, a large-scale mixed dataset (InfMix) for pre-training, and a last-layer feature pyramid network (LL-FPN) for fine-tuning.
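As a rough illustration of the layerwise analysis described above (not the paper's exact metric), one can gauge how local or global a layer's attention is by its mean attention distance over the patch-token grid; the thresholds below are placeholder values:

```python
# Minimal sketch (not the paper's metric): score how local or global a layer's
# attention is via its mean attention distance over the patch-token grid.
import torch

def mean_attention_distance(attn: torch.Tensor, grid_size: int) -> torch.Tensor:
    """attn: (num_heads, N, N) attention probabilities over N = grid_size**2 patch tokens."""
    ys, xs = torch.meshgrid(
        torch.arange(grid_size), torch.arange(grid_size), indexing="ij"
    )
    coords = torch.stack([ys, xs], dim=-1).reshape(-1, 2).float()  # (N, 2) patch coordinates
    dist = torch.cdist(coords, coords)                             # (N, N) pairwise patch distances
    # Expected distance from each query to the keys it attends to,
    # averaged over queries and heads.
    return (attn * dist).sum(dim=-1).mean()

def classify_layer(attn: torch.Tensor, grid_size: int,
                   local_thr: float = 3.0, global_thr: float = 7.0) -> str:
    # Thresholds are placeholders, not values from the paper.
    d = mean_attention_distance(attn, grid_size).item()
    if d < local_thr:
        return "local"
    if d > global_thr:
        return "global"
    return "hybrid"  # attends to both nearby and distant (foreground) tokens
```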

(Figure: hybrid-attention distillation and query attention maps)
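The figure above depicts the distillation of hybrid attention. As a rough, generic sketch of attention-map distillation (not the paper's NMI-HAD objective), matching a student layer's attention to a hybrid-pattern teacher layer could look like the following:

```python
# Generic attention-distillation sketch -- NOT the paper's NMI-HAD loss.
# Assumes student and teacher attention maps share head count and token count.
import torch

def attention_distill_loss(student_attn: torch.Tensor,
                           teacher_attn: torch.Tensor,
                           eps: float = 1e-8) -> torch.Tensor:
    """student_attn, teacher_attn: (B, num_heads, N, N) attention probabilities."""
    # KL(teacher || student) per query row, averaged over batch, heads, and queries.
    log_s = torch.log(student_attn + eps)
    log_t = torch.log(teacher_attn + eps)
    return (teacher_attn * (log_t - log_s)).sum(dim=-1).mean()
```

This is only meant to convey the general shape of such an objective; refer to the paper and the code in UNIP_pretraining for the actual formulation.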

Experimental results show that UNIP outperforms various pre-training methods by up to 13.5% in average mIoU on three infrared segmentation tasks under both fine-tuning and linear probing protocols. UNIP-S achieves performance on par with MAE-L while requiring only 1/10 of the computational cost. Furthermore, UNIP significantly surpasses state-of-the-art (SOTA) infrared and RGB segmentation methods and shows broad potential for application to other modalities, such as RGB and depth.

(Figure: benchmark comparison of pre-training methods)

🛠️ Usage

Pre-training

  1. Create a conda environment and install packages.
# create environment
conda create -n unip_pre python=3.9
conda activate unip_pre
# install pytorch and torchvision
conda install pytorch==2.2.0 torchvision==0.17.0 pytorch-cuda=12.1 -c pytorch -c nvidia
# install other packages
pip install -r UNIP_pretraining/requirements.txt
  2. Training
  • UNIP-S (MAE-L)
cd UNIP_pretraining
sh train_scripts/mae-l_distill_unip-s.sh
  • UNIP-S (DINO-B)
cd UNIP_pretraining
sh train_scripts/dino-b_distill_unip-s.sh
  • UNIP-S (iBOT-L)
cd UNIP_pretraining
sh train_scripts/ibot-l_distill_unip-s.sh
  • UNIP-B (MAE-L)
cd UNIP_pretraining
sh train_scripts/mae-l_distill_unip-b.sh
  • UNIP-T (MAE-L)
cd UNIP_pretraining
sh train_scripts/mae-l_distill_unip-t.sh

Note: Please download the pre-trained checkpoint of each teacher model from its official repository, and set model_path, log_dir, and output_dir in the training scripts accordingly (see the sketch below).
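For example, a script might be configured with paths like the following; the variable names come from the note above, while the file names and directories are placeholders for your local setup:

```bash
# Hypothetical paths -- adjust to where you stored the teacher checkpoint
# and where you want logs and outputs to go.
model_path=/path/to/mae_pretrain_vit_large.pth   # teacher checkpoint (e.g., MAE-L)
log_dir=./logs/mae-l_distill_unip-s
output_dir=./output/mae-l_distill_unip-s
```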

Semantic Segmentation

  1. Create a conda environment and install packages.
# create environment
conda create -n unip_seg python=3.8
conda activate unip_seg
# install pytorch and torchvision
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
# install the mmsegmentation library
pip install mmcv-full==1.3.0 mmsegmentation==0.11.0
# install other packages
pip install -r UNIP_segmentation/requirements.txt
# install apex
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--global-option=--cpp_ext" --config-settings "--global-option=--cuda_ext" ./
  2. Training
  • UNIP-S
cd UNIP_segmentation
sh train_scripts/train_vit_small.sh
  • UNIP-B
cd UNIP_segmentation
sh train_scripts/train_vit_base.sh
  • UNIP-T
cd UNIP_segmentation
sh train_scripts/train_vit_tiny.sh
  3. Pre-trained Models

The pre-trained models distilled from MAE-Large and iBOT-Large are available for download via Google Drive or Hugging Face.
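As a minimal sketch of how a downloaded checkpoint might be loaded for inspection (the file name, the "model" wrapping key, and the timm ViT-Small backbone variant below are assumptions, not confirmed details of the release):

```python
# Minimal loading sketch -- file name, wrapping key, and backbone variant are
# assumptions; check the released checkpoint for the actual layout.
import timm
import torch

ckpt = torch.load("unip_small.pth", map_location="cpu")  # hypothetical file name
state_dict = ckpt.get("model", ckpt)                     # unwrap if weights are nested

model = timm.create_model("vit_small_patch16_224", pretrained=False)
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```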

Citation

If you find this repository helpful, please consider giving it a star and citing:

@inproceedings{
  zhang2025unip,
  title={{UNIP}: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation},
  author={Tao Zhang and Jinyong Wen and Zhen Chen and Kun Ding and Shiming Xiang and Chunhong Pan},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=Xq7gwsnhPT}
}

Acknowledgements

This codebase is built upon the MAE repository, the iBOT repository, the mmsegmentation repository, and the PAD repository. Thanks for their contributions.
