# [AAAI2025] MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation
We propose MoRe to effectively tackle the artifact issue that arises when generating Localization Attention Maps (LAM) from class-patch attention in WSSS.
- If you find this work helpful, please give us a 🌟 to stay tuned for updates!
- Dec. 10th, 2024: MoRe is accepted by AAAI 2025.
- All code is available. 🔥🔥🔥
Weakly Supervised Semantic Segmentation (WSSS) with image-level labels typically uses Class Activation Maps (CAM) to achieve dense predictions. Recently, Vision Transformer (ViT) has provided an alternative to generate localization maps from class-patch attention. However, due to insufficient constraints on modeling such attention, we observe that the Localization Attention Maps (LAM) often struggle with the artifact issue, i.e., patch regions with minimal semantic relevance are falsely activated by class tokens. In this work, we propose MoRe to address this issue and further explore the potential of LAM. Our findings suggest that imposing additional regularization on class-patch attention is necessary. To this end, we first view the attention as a novel directed graph and propose the Graph Category Representation module to implicitly regularize the interaction among class-patch entities. It ensures that class tokens dynamically condense the related patch information and suppress unrelated artifacts at a graph level. Second, motivated by the observation that CAM from classification weights maintains smooth localization of objects, we devise the Localization-informed Regularization module to explicitly regularize the class-patch attention. It directly mines the token relations from CAM and further supervises the consistency between class and patch tokens in a learnable manner. Extensive experiments are conducted on PASCAL VOC and MS COCO, validating that MoRe effectively addresses the artifact issue and achieves state-of-the-art performance, surpassing recent single-stage and even multi-stage methods.
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
The augmented annotations are from the SBD dataset and can be downloaded from DropBox. After downloading SegmentationClassAug.zip, unzip it and move the extracted folder to VOCdevkit/VOC2012/.
VOCdevkit/
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject
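As a quick sanity check, the layout above can be verified with a small helper of our own (not part of this repo; the function name is ours):

```python
from pathlib import Path

# Subdirectories expected under VOCdevkit/VOC2012 after adding SegmentationClassAug.
VOC_SUBDIRS = [
    "Annotations", "ImageSets", "JPEGImages",
    "SegmentationClass", "SegmentationClassAug", "SegmentationObject",
]

def missing_voc_dirs(root):
    """Return the expected VOC2012 subdirectories that are missing under `root`."""
    base = Path(root) / "VOCdevkit" / "VOC2012"
    return [d for d in VOC_SUBDIRS if not (base / d).is_dir()]
```

Calling `missing_voc_dirs(".")` from the directory containing VOCdevkit/ should return an empty list once the dataset is in place.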
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
To generate VOC-style segmentation labels for COCO, you could use the scripts provided in this repo, or simply download the generated masks from Google Drive.
COCO/
├── JPEGImages
│   ├── train2014
│   └── val2014
└── SegmentationClass
    ├── train2014
    └── val2014
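Under the layout above, images and masks are paired by basename. The following helper (ours, purely illustrative and not part of this repo) lists images in a split that lack a corresponding VOC-style mask:

```python
from pathlib import Path

def unmatched_images(coco_root, split="train2014"):
    """Return stems of .jpg images in JPEGImages/<split> that have
    no matching .png mask in SegmentationClass/<split>."""
    root = Path(coco_root)
    images = {p.stem for p in (root / "JPEGImages" / split).glob("*.jpg")}
    masks = {p.stem for p in (root / "SegmentationClass" / split).glob("*.png")}
    return sorted(images - masks)
```

An empty result for both train2014 and val2014 indicates the generated masks cover every image.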
Please refer to the requirements.txt.
We incorporate a regularization loss for segmentation; please refer to the instructions for this Python extension.
### train voc
bash run_train.sh scripts/train_voc.py [gpu_device] [gpu_number] [master_port] train_voc
### train coco
bash run_train.sh scripts/train_coco.py [gpu_device] [gpu_number] [master_port] train_coco
### eval voc seg and LAM
bash run_evaluate_voc.sh [gpu_device] [gpu_number] [checkpoint_path]
### eval coco seg
bash run_evaluate_seg_coco.sh tools/infer_seg_coco.py [gpu_device] [gpu_number] [infer_set] [checkpoint_path]
Semantic segmentation performance on PASCAL VOC and MS COCO. Logs and weights are available now.
Dataset | Backbone | Val | Test | Log |
---|---|---|---|---|
PASCAL VOC | ViT-B | 76.4 | 75.0 | log |
MS COCO | ViT-B | 47.4 | - | log |
Please cite our work if you find it helpful to your research. 💕
@article{yang2024more,
  title={MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation},
  author={Yang, Zhiwei and Meng, Yucong and Fu, Kexue and Wang, Shuo and Song, Zhijian},
  journal={arXiv preprint arXiv:2412.11076},
  year={2024}
}
If you have any questions, please feel free to contact the author at [email protected].
This repo is built upon MCTformer Series and SeCo. Many thanks to their brilliant works!!!