
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation

Official PyTorch implementation of SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation

SegMAN

Main Results

Installation and data preparation

Step 1: Create a new environment

conda create -n segman python=3.10
conda activate segman

pip install torch==2.1.2 torchvision==0.16.2

Step 2: Install MMSegmentation v0.30.0 by following the installation guidelines and prepare segmentation datasets by following data preparation. The following installation commands work:

pip install -U openmim
mim install mmcv-full
cd segmentation
pip install -v -e .

To support torch>=2.1.0, you also need to replace Line 75 of /miniconda3/envs/segman/lib/python3.10/site-packages/mmcv/parallel/_functions.py with the following (make sure `from packaging import version` is imported at the top of that file, since the patched line uses it):

if version.parse(torch.__version__) >= version.parse('2.1.0'):
    streams = [_get_stream(torch.device("cuda", device)) for device in target_gpus]
else:
    streams = [_get_stream(device) for device in target_gpus]
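The logic of the patched line is just a version gate: newer torch expects a torch.device argument to _get_stream, older torch expects a plain device index. As an illustration only (the real patch uses packaging.version; this stdlib-only sketch compares the major/minor components the same way):

```python
# Stdlib-only sketch of the version check in the patched line above.
# The actual patch uses packaging.version.parse; this variant compares
# only the (major, minor) components, which is what the gate cares about.
def needs_new_stream_api(torch_version: str) -> bool:
    """True when torch >= 2.1.0, i.e. _get_stream expects a torch.device."""
    major, minor = (int(x) for x in torch_version.split(".")[:2])
    return (major, minor) >= (2, 1)
```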

Step 3: Install dependencies using the following commands.

To install Natten, you should modify the following with your PyTorch and CUDA versions accordingly.

pip install natten==0.17.3+torch210cu121 -f https://shi-labs.com/natten/wheels/

The Selective Scan 2D kernel can be installed with:

cd kernels/selective_scan && pip install .

Install other requirements:

pip install -r requirements.txt
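After the steps above, it can help to confirm that the installed packages are actually visible to the interpreter before training. A small hypothetical helper (the package names checked here are assumptions based on the install commands; adjust to your environment):

```python
# Hypothetical sanity check: report which of the packages installed above
# are visible to the current interpreter. Standard library only.
from importlib.metadata import PackageNotFoundError, version


def installed_version(pkg: str):
    """Return the installed version string for pkg, or None if missing."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None


for pkg in ("torch", "torchvision", "mmcv-full", "natten"):
    print(f"{pkg}: {installed_version(pkg) or 'NOT INSTALLED'}")
```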

Training

Download the ImageNet-1k pretrained weights here and put them in a folder pretrained/. Navigate to the segmentation directory:

cd segmentation

Scripts to reproduce our paper results are provided in ./scripts. Example training script for SegMAN-B on ADE20K:

# Single-gpu
python tools/train.py local_configs/segman/base/segman_b_ade.py --work-dir outputs/EXP_NAME

# Multi-gpu
bash tools/dist_train.sh local_configs/segman/base/segman_b_ade.py <GPU_NUM> --work-dir outputs/EXP_NAME

Evaluation

Download trained weights for segmentation models on Google Drive. Navigate to the segmentation directory:

cd segmentation

Example for evaluating SegMAN-B on ADE20K:

# Single-gpu
python tools/test.py local_configs/segman/base/segman_b_ade.py /path/to/checkpoint_file

# Multi-gpu
bash tools/dist_test.sh local_configs/segman/base/segman_b_ade.py /path/to/checkpoint_file <GPU_NUM>

Encoder Pre-training

We provide scripts for pre-training the encoder from scratch.

Step 1: Download ImageNet-1k and use this script to extract it.

Step 2: Start training with

bash scripts/train_segman-s.sh

Acknowledgements

Our implementation is based on MMSegmentation, Natten, VMamba, and SegFormer. We are grateful to their authors.

Citation

@article{SegMAN,
      title={SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation}, 
      author={Yunxiang Fu and Meng Lou and Yizhou Yu},
      journal={arXiv preprint arXiv:2412.11890},
      year={2024}
}
