This repo is a PyTorch implementation of the paper: Progressive Feature Self-Reinforcement for Weakly Supervised Semantic Segmentation.
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
The augmented annotations are from the SBD dataset. A download link for the augmented annotations is available at DropBox. After downloading SegmentationClassAug.zip, unzip it and move it to VOCdevkit/VOC2012/.
VOCdevkit
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject
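Once the data is in place, it is easy to verify the layout above before training. A minimal sketch (the helper name and return convention are ours, not part of the repo):

```python
from pathlib import Path

# Subdirectories expected under VOCdevkit/VOC2012, as shown above.
EXPECTED = ["Annotations", "ImageSets", "JPEGImages",
            "SegmentationClass", "SegmentationClassAug", "SegmentationObject"]

def check_voc_layout(root):
    """Return the list of expected subdirectories missing under root/VOC2012."""
    base = Path(root) / "VOC2012"
    return [d for d in EXPECTED if not (base / d).is_dir()]
```

An empty return value means the VOC2012 tree is complete; in particular, a non-empty result containing "SegmentationClassAug" means the augmented annotations were not moved into place.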
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
To generate VOC-style segmentation labels for COCO, you can use the scripts provided in this repo, or simply download the generated masks from Google Drive.
COCO
├── JPEGImages
│   ├── train2014
│   └── val2014
└── SegmentationClass
    ├── train2014
    └── val2014
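With this layout, images and their generated masks are matched by filename stem. A small sketch of how such pairs could be enumerated (the function is illustrative and not part of the repo):

```python
from pathlib import Path

def find_pairs(root):
    """Yield (image, mask) path pairs whose filename stems match,
    assuming the COCO directory layout shown above."""
    root = Path(root)
    for split in ("train2014", "val2014"):
        img_dir = root / "JPEGImages" / split
        mask_dir = root / "SegmentationClass" / split
        for img in sorted(img_dir.glob("*.jpg")):
            mask = mask_dir / (img.stem + ".png")
            if mask.exists():  # skip images without a generated mask
                yield img, mask
```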
Please refer to requirements.txt for the required dependencies.
Our implementation incorporates a regularization term for segmentation. Please download and compile the Python extension.
The encoder is vit_base_patch16_224 pretrained on ImageNet. Download the weights to ./pretrained/.
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 train_voc.py --data_folder [VOCdevkit/VOC2012]
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 train_coco.py --data_folder [COCO]
Arguments most relevant to this project:
--cls_depth number of aggregation modules
--out_dim dimension of the projector output
--momentum EMA update parameter for teacher
--use_mim whether to enable masking
--block_size masking block size, must be a multiple of ViT patch size
--mask_ratio masking ratio
--w_class FSR loss weight for the aggregated token
--w_patch FSR loss weight for masked patch tokens
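To make the constraint on --block_size concrete, block-wise masking can be sketched as follows: blocks are sampled on a coarse grid and then expanded to the ViT patch grid, which only lines up when the block size is a multiple of the patch size. All names and default values below are illustrative, not the repo's actual implementation:

```python
import random

def block_mask(img_size=448, patch_size=16, block_size=64, mask_ratio=0.75):
    """Sample a block-wise boolean mask over the ViT patch grid.

    Returns a grid x grid nested list where True marks a masked patch.
    Illustrative sketch only; defaults are assumptions, not the repo's values.
    """
    assert block_size % patch_size == 0, "--block_size must be a multiple of the ViT patch size"
    blocks_per_side = img_size // block_size        # coarse masking grid
    patches_per_block = block_size // patch_size    # patches covered per block side
    n_blocks = blocks_per_side ** 2
    n_masked = int(n_blocks * mask_ratio)           # --mask_ratio controls this count
    masked_blocks = set(random.sample(range(n_blocks), n_masked))
    # Expand each block decision to every patch it covers.
    grid = img_size // patch_size
    return [[(r // patches_per_block) * blocks_per_side + (c // patches_per_block)
             in masked_blocks
             for c in range(grid)] for r in range(grid)]
```

Because each masked block covers a patches_per_block x patches_per_block square of patches, the fraction of masked patches equals the fraction of masked blocks.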
infer_*.py will apply dense CRF to the predicted segmentation labels.
python infer_voc.py --checkpoint [PATH_TO_CHECKPOINT] --data_folder [VOCdevkit/VOC2012] --infer_set [val | test] --save_cam [True | False]
python infer_coco.py --checkpoint [PATH_TO_CHECKPOINT] --data_folder [COCO] --infer_set val --save_cam [True | False]
This repo is built upon ToCo. Our work is greatly inspired by DINO. Many thanks for their brilliant work!