xwmaxwma/TinyViM

TinyViM

TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba

Xiaowen Ma, Zhenliang Ni, Xinghao Chen

Huawei Noah’s Ark Lab

[Paper Link]

🔥 News

  • 2024/11/29: Code is open.
  • 2024/11/27: TinyViM is available on arXiv.

📷 Introduction

We build TinyViM, a series of tiny hybrid vision Mamba models, by integrating mobile-friendly convolution with an efficient Laplace mixer. TinyViM achieves strong performance on several downstream tasks, including image classification, semantic segmentation, object detection, and instance segmentation. In particular, TinyViM outperforms convolution-, Transformer-, and Mamba-based models of similar scale, and its throughput is about 2-3 times higher than that of other Mamba-based models.
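
This README does not spell out how the Laplace mixer or the frequency decoupling in the title works. As a rough illustration of the general idea only, and not the TinyViM implementation, the sketch below performs a Laplacian-pyramid-style split of a 1D signal: a moving average stands in for the low-frequency part, and the residual is treated as the high-frequency part. The function names `box_blur` and `decouple` and the `radius` parameter are hypothetical.

```python
def box_blur(signal, radius=1):
    """Moving average (window truncated at the edges); a simple low-pass filter."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def decouple(signal, radius=1):
    """Split a signal into a smooth low-frequency part and a high-frequency residual."""
    low = box_blur(signal, radius)
    high = [x - l for x, l in zip(signal, low)]
    return low, high


signal = [0.0, 0.0, 1.0, 0.0, 0.0, 4.0, 4.0, 4.0]
low, high = decouple(signal)

# The two branches sum back to the input, so nothing is lost by the split.
recon = [l + h for l, h in zip(low, high)]
```

In a split of this kind, the two parts can be processed by different operators and then recombined; which operator TinyViM applies to which part is described in the paper rather than here.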

🏆 Performance

1️⃣ Classification

Model      Type       Params (M)  GMACs  Throughput (im/s)  Top-1 (%)
TinyViM-S  CNN-Mamba  5.6         0.9    2563               79.2
TinyViM-B  CNN-Mamba  11.0        1.5    1851               81.2
TinyViM-L  CNN-Mamba  31.7        4.7    843                83.3
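
The throughput column is reported in images per second (im/s). The internals of `speed_gpu.py` are not shown in this README, so the following is only a generic sketch of how such a figure is commonly measured: run a few warm-up batches, time a fixed number of forward passes, and divide the number of images by the elapsed time. `measure_throughput` and `fake_forward` are hypothetical names, and the stand-in workload simply sleeps instead of running a model.

```python
import time


def measure_throughput(forward, batch_size, warmup=3, iters=10):
    """Return images per second for a callable that processes one batch."""
    for _ in range(warmup):
        forward()  # warm-up runs are excluded from the timing
    # On a GPU, kernels launch asynchronously, so the device should be
    # synchronized (e.g. torch.cuda.synchronize()) before reading each timestamp.
    start = time.perf_counter()
    for _ in range(iters):
        forward()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def fake_forward():
    # Hypothetical stand-in for a model forward pass on one batch.
    time.sleep(0.001)


ims = measure_throughput(fake_forward, batch_size=2048)
print(f"{ims:.0f} im/s")
```

Measured numbers depend on the hardware, batch size, and input resolution, so they are only comparable when those settings match.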

2️⃣ Detection & Instance Segmentation

Model      Head       AP-box  AP-mask
TinyViM-B  Mask RCNN  42.3    38.7
TinyViM-L  Mask RCNN  44.5    40.7

3️⃣ Semantic Segmentation

Model      Head  Throughput  mIoU
TinyViM-B  FPN   180         41.9
TinyViM-L  FPN   111         44.2

📚 Use example

  • Environment

    conda create --name tinyvim python=3.9.11 -y
    conda activate tinyvim
    conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.7 -c pytorch -c nvidia
    pip install timm==0.5.4

    Please refer to VMamba for installing selective_scan_cuda.

    Please refer to mmdetection-2.28.2 and mmsegmentation-0.30.0 for environments and data preparation of detection and segmentation, respectively.

  • Train

    bash train.sh
  • Test

    bash eval.sh
  • Speed

    python speed_gpu.py --model TinyViM_S --resolution 224 --batch 2048
  • Detection & Instance Segmentation

    cd detection
    bash train.sh # for train
    bash eval.sh # for eval
  • Semantic Segmentation

    cd segmentation
    bash train.sh # for train
    bash eval.sh # for eval
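
Before running the scripts above, it can help to confirm that the installed packages match the versions pinned in the Environment step and that `selective_scan_cuda` is importable. The helper below is a minimal sketch using only the Python standard library; `check_environment` is a hypothetical name and not part of this repository, and it assumes each package exposes standard distribution metadata.

```python
from importlib import metadata, util

# Versions pinned in the Environment step.
EXPECTED = {
    "torch": "2.0.0",
    "torchvision": "0.15.0",
    "torchaudio": "2.0.0",
    "timm": "0.5.4",
}


def check_environment(expected=EXPECTED, extension="selective_scan_cuda"):
    """Map each package name to (installed_version_or_None, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Some builds append a local tag such as "+cu117"; compare the base version.
        ok = found is not None and found.split("+")[0] == wanted
        report[name] = (found, ok)
    # The compiled extension is only checked for importability, not version.
    report[extension] = (None, util.find_spec(extension) is not None)
    return report


for name, (found, ok) in check_environment().items():
    print(f"{name}: {'ok' if ok else 'missing or mismatched'} ({found})")
```

A mismatch does not necessarily mean the code will fail, only that the environment differs from the one listed above.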

🌟 Citation

If you find our work useful, please consider giving the repository a 🌟 and citing the paper:

@misc{tinyvim,
      title={TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba}, 
      author={Xiaowen Ma and Zhenliang Ni and Xinghao Chen},
      year={2024},
      eprint={2411.17473},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17473}, 
}

💡 Acknowledgment

Thanks to the following open-source repositories: EfficientFormer, SwiftFormer, RepViT, mmsegmentation, and mmdetection.

About

Official PyTorch Implementation of TinyViM