Introduction

This repository contains the implementation for the CVPR 2024 paper

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

Demo

The following GIF animations compare interactive segmentation results of SAM and our FocSAM. Notably, FocSAM shows markedly more stable performance, with significantly less fluctuation in IoU than SAM, across various datasets.

Installation

For detailed installation instructions, please refer to INSTALL.

Alternatively, ensure Python 3.11.0 is set up in your environment, then install all dependencies by running the following command in your terminal:

bash scripts/install.sh
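
If you prefer to set up the environment yourself, here is a minimal sketch using conda (assuming conda is installed; the environment name focsam is arbitrary):

# Create and activate a Python 3.11.0 environment, then install dependencies
conda create -n focsam python=3.11.0 -y
conda activate focsam
bash scripts/install.sh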

Dataset Preparation

For detailed dataset preparation instructions, please refer to DATASETS.

Model Weights Download and Conversion

SAM Pre-trained Weights

  • Download: Acquire the pretrained SAM-ViT-H weights and save them to pretrain/sam_vit_h_4b8939.pth.
  • Conversion: Convert the downloaded weights using the command below:
python tools/model_converters/samvit2mmclickseg.py pretrain/sam_vit_h_4b8939.pth
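
For convenience, both steps can be scripted end to end. This is a sketch: the URL below is the official Segment Anything release of the SAM-ViT-H checkpoint, and the converter is assumed to take the downloaded checkpoint path as its only argument:

# Download the official SAM-ViT-H checkpoint and convert it
mkdir -p pretrain
wget -P pretrain https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
python tools/model_converters/samvit2mmclickseg.py pretrain/sam_vit_h_4b8939.pth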

FocSAM Pre-trained Weights

  • Download: Obtain the pretrained FocSAM-ViT-H weights and unzip them into work_dirs/focsam/focsam_vit_huge_eval.
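
A minimal sketch of the unpacking step (the archive name focsam_vit_huge_eval.zip is hypothetical; substitute the actual name of the downloaded file):

# Unzip the downloaded FocSAM weights into the expected directory
mkdir -p work_dirs/focsam/focsam_vit_huge_eval
unzip focsam_vit_huge_eval.zip -d work_dirs/focsam/focsam_vit_huge_eval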

Evaluating the Model

  • Single GPU (Example for DAVIS dataset):
export PYTHONPATH=.
python tools/test_no_viz.py configs/_base_/eval_davis.py work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth
  • Multi-GPU:
bash tools/dist_test.sh configs/_base_/eval_davis.py work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth 4
  • CPU (Not recommended):
export PYTHONPATH=.
CUDA_VISIBLE_DEVICES= python tools/test_no_viz.py configs/_base_/eval_davis.py work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth
  • Evaluating on Other Datasets: Replace the config file as needed (a loop covering all datasets is sketched after this list):
configs/_base_/eval_sbd.py  # for SBD
configs/_base_/eval_grabcut.py  # for GrabCut 
configs/_base_/eval_berkeley.py  # for Berkeley
configs/_base_/eval_mvtec.py  # for MVTec
configs/_base_/eval_cod10k.py  # for COD10K
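
To evaluate all benchmarks in one pass on a single GPU, a simple loop over the config names works (a sketch, reusing the checkpoint path above):

# Evaluate the same checkpoint on every supported dataset
export PYTHONPATH=.
for dataset in davis sbd grabcut berkeley mvtec cod10k; do
    python tools/test_no_viz.py configs/_base_/eval_${dataset}.py \
        work_dirs/focsam/focsam_vit_huge_eval/iter_160000.pth
done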

Training the Model

Training SAM Decoder

  • Single GPU:
export PYTHONPATH=.
python tools/train.py configs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k.py
  • Multi-GPU:
bash tools/dist_train.sh configs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k.py 4
  • CPU (Not recommended):
export PYTHONPATH=.
CUDA_VISIBLE_DEVICES= python tools/train.py configs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k.py
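
If a decoder run is interrupted, it can usually be restarted from the latest checkpoint (a sketch assuming tools/train.py exposes the standard MMEngine --resume flag):

# Resume training from the most recent checkpoint in the work directory
export PYTHONPATH=.
python tools/train.py configs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k.py --resume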

Training FocSAM Refiner

  • Important Prerequisite: Train the SAM decoder first. This step produces the required checkpoint work_dirs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k/iter_320000.pth, which is essential for training the FocSAM refiner (a guard script that checks for this file is sketched after this list).

  • Single GPU:

export PYTHONPATH=.
python tools/train.py configs/focsam/coco_lvis/train_colaug_coco_lvis_1024x1024_160k.py
  • Multi-GPU:
bash tools/dist_train.sh configs/focsam/coco_lvis/train_colaug_coco_lvis_1024x1024_160k.py 4
  • CPU (Not recommended):
export PYTHONPATH=.
CUDA_VISIBLE_DEVICES= python tools/train.py configs/focsam/coco_lvis/train_colaug_coco_lvis_1024x1024_160k.py
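
Because refiner training depends on the decoder checkpoint, a small guard script can verify the prerequisite before launching (a sketch using the paths above):

# Abort early if the SAM decoder checkpoint from the previous stage is missing
CKPT=work_dirs/sam/coco_lvis/train_colaug_coco_lvis_1024x1024_320k/iter_320000.pth
if [ ! -f "$CKPT" ]; then
    echo "Missing $CKPT - train the SAM decoder first." >&2
    exit 1
fi
export PYTHONPATH=.
python tools/train.py configs/focsam/coco_lvis/train_colaug_coco_lvis_1024x1024_160k.py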

License

This project is licensed under the MIT License - see the LICENSE file for details.
