Haozhi Cao1, Yuecong Xu2, Jianfei Yang3*, Pengyu Yin1, Shenghai Yuan1, Lihua Xie1
1: Centre for Advanced Robotics Technology Innovation (CARTIN), Nanyang Technological University
2: Department of Electrical and Computer Engineering, National University of Singapore
3: School of EEE, Nanyang Technological University
📄 [arXiv] | 🎬 [Video] | 📖 [IEEE Xplore]
MoPA is a multi-modal unsupervised domain adaptation (MM-UDA) method for 3D semantic segmentation. It aims to alleviate the imbalanced class-wise performance on Rare Objects (ROs) and the lack of 2D dense supervision signals through Valid Ground-based Insertion (VGI) and Segment Anything Mask consistency (SAM consistency). The overall structure is as follows.
Specifically, VGI inserts additional ROs collected from the wild, together with their ground truth, to guide the recognition of ROs during the UDA process without introducing artificial artifacts, while SAM consistency leverages image masks from the Segment Anything Model to encourage mask-wise prediction consistency.
To ease environment setup, we recommend using Docker and the NVIDIA Container Toolkit. With Docker installed, you can either build the Docker image for MoPA locally from this Dockerfile by running `docker build -t mopa docker/`, or pull our pre-built image from Docker Hub with `docker pull aroncao49/mopa:latest`.
You can then run a container using the Docker image. Before running our code in the container, some prerequisites need to be installed. To do so, go to this repo folder and run `bash install.sh`.
Remark: you may ignore the ERROR message saying that the werkzeug version is not compatible with open3d.
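For the container step above, the following is a minimal sketch of how the `docker run` command could be composed, assuming the NVIDIA Container Toolkit is set up on the host. The helper name `mopa_docker_cmd` and the mount point `/workspace/mopa` are illustrative assumptions, not names required by this repo; adjust them to your setup.

```shell
# Sketch: compose a `docker run` command for the MoPA image.
# Assumption: /workspace/mopa is a hypothetical mount point inside the container.
mopa_docker_cmd() (
  image="${1:-aroncao49/mopa:latest}"   # image tag from Docker Hub, or "mopa" if built locally
  repo_dir="${2:-$PWD}"                 # host path of this repo folder
  # --gpus all exposes the host GPUs (needs the NVIDIA Container Toolkit).
  # -it --rm gives an interactive shell and removes the container on exit.
  printf '%s\n' "docker run --gpus all -it --rm -v $repo_dir:/workspace/mopa -w /workspace/mopa $image bash"
)

# Print the command for review before running it yourself.
mopa_docker_cmd
```

The function only prints the command, so you can check the paths first and then paste it into your terminal. Add further `-v` mounts for your dataset folders as needed.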
To install Patchwork++ for ground identification, run the following commands:
# Make sure you are in this repo folder
$ mkdir -p mopa/third_party && cd mopa/third_party
$ git clone https://github.com/url-kaist/patchwork-plusplus
$ cd patchwork-plusplus && make pyinstall
Please refer to DATA_PREPARE.md for the data preparation and pre-processing details.
Here we provide our pre-trained checkpoints for testing:
Method | USA→Singapore 2D | USA→Singapore 3D | USA→Singapore xM | USA→Singapore ckpt | Day→Night 2D | Day→Night 3D | Day→Night xM | Day→Night ckpt | A2D2→KITTI 2D | A2D2→KITTI 3D | A2D2→KITTI xM | A2D2→KITTI ckpt |
---|---|---|---|---|---|---|---|---|---|---|---|---|
xMUDA | 58.5 | 51.2 | 61.0 | link | 47.7 | 42.1 | 52.3 | link | 42.6 | 44.9 | 47.2 | link |
MoPA+PL | 61.8 | 57.8 | 64.5 | link | 51.9 | 46.9 | 54.8 | link | 49.1 | 56.2 | 54.1 | link |
MoPA+PLx2 | 62.1 | 56.8 | 64.0 | link | 51.7 | 44.6 | 55.3 | link | 50.0 | 56.8 | 54.7 | link |
Note: During our refactoring, we found the same reproduction issue as in vanilla xMUDA (as in this issue), where the performance fluctuates across runs. This happens much more frequently on the NuScenes benchmarks (especially Day→Night), so we suggest using our provided checkpoints for performance validation.
Before training and testing, we suggest creating an output directory to store the logs and checkpoints, and linking that folder to `mopa/exp`.
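The linking step can be sketched as follows, assuming a symbolic link is what is wanted here. The helper name `link_exp_dir` and the example path are hypothetical; the sketch also assumes `mopa/exp` does not already exist as a regular directory.

```shell
# Sketch: create an output directory and expose it as mopa/exp via a symlink.
# Usage: link_exp_dir OUT_DIR [REPO_DIR]   (REPO_DIR defaults to the current folder)
link_exp_dir() (
  out_dir="$1"
  repo_dir="${2:-.}"
  mkdir -p "$out_dir" || exit 1
  # Resolve to an absolute path so the link stays valid from any working directory.
  abs_out="$(cd "$out_dir" && pwd)" || exit 1
  # -s: symbolic link, -f: replace an existing link, -n: do not follow an existing link.
  ln -sfn "$abs_out" "$repo_dir/mopa/exp"
)

# Example (hypothetical path), run from this repo folder:
#   link_exp_dir /data/mopa_output
```

Logs and checkpoints written under `mopa/exp` then land in the output directory, which can live on a larger disk.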
To conduct testing on, for example, A2D2→KITTI, simply download and extract the checkpoints or prepare your own trained networks, and use the following command:
$ CUDA_VISIBLE_DEVICES=0 python mopa/test/test.py \
    --cfg=configs/a2d2_semantic_kitti/xmuda_pl_pcmm_ema.yaml \
    --model_prefix=/path/to/checkpoint/dir \
    --ckpt2d=/2d_checkpoint/name \
    --ckpt3d=/3d_checkpoint/name
The class-wise results will be stored as a *.xls file in your checkpoint folder.
To generate pseudo-labels for training, include the extra arguments in the command:
$ CUDA_VISIBLE_DEVICES=0 python mopa/test/test.py \
    --cfg=configs/a2d2_semantic_kitti/xmuda_pl_pcmm_ema.yaml \
    --pselab_dir=DIR_NAME \
    VAL.BATCH_SIZE 1 DATASET_TARGET.TEST "('train',)"
The pseudo-labels will be stored in the folder `ps_label/DIR_NAME` under the dataset root directory.
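Before starting training, it may help to confirm that the pseudo-labels were actually written. The following is a minimal sketch; the helper name `check_pselab_dir` is hypothetical, and no assumption is made about the file names or layout inside the folder.

```shell
# Sketch: confirm that pseudo-labels exist under DATASET_ROOT/ps_label/DIR_NAME.
# Usage: check_pselab_dir DATASET_ROOT DIR_NAME
check_pselab_dir() (
  dataset_root="$1"
  dir_name="$2"
  target="$dataset_root/ps_label/$dir_name"
  if [ ! -d "$target" ]; then
    echo "missing: $target" >&2
    exit 1
  fi
  # Count regular files at any depth; tr strips the padding some wc versions add.
  n="$(find "$target" -type f | wc -l | tr -d ' ')"
  if [ "$n" -eq 0 ]; then
    echo "empty: $target" >&2
    exit 1
  fi
  echo "$n pseudo-label file(s) in $target"
)
```

A non-zero exit status means the folder is missing or empty, in which case the pseudo-label generation command should be re-run before training.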
To conduct training with MoPA on, for example, A2D2→KITTI, simply use the following command:
$ CUDA_VISIBLE_DEVICES=0 python mopa/train/train_mopa.py \
    --cfg=configs/a2d2_semantic_kitti/xmuda_pl_pcmm_ema.yaml \
    DATASET_TARGET.SemanticKITTISCN.ps_label_dir DIR_NAME
You can also change those arguments in the config files directly.
- [2024.08] Our new MM-TTA paper for 3D segmentation has been accepted by ECCV 2024! Code will also be released soon. Check our project site for more details!
- [2024.08] Release training/testing details and all checkpoints. We may further release the ROs we extracted if permitted.
- [2024.05] Release installation, prerequisite details, and data preparation procedures.
- [2024.03] We are now refactoring our code and evaluating its feasibility. Code will be available shortly.
- [2024.01] Our paper is accepted by ICRA 2024! Check our paper on arxiv here.
For any further questions, please contact Haozhi Cao ([email protected])
We greatly appreciate the contributions of the following public repos:
@inproceedings{cao2024mopa,
title={Mopa: Multi-modal prior aided domain adaptation for 3d semantic segmentation},
author={Cao, Haozhi and Xu, Yuecong and Yang, Jianfei and Yin, Pengyu and Yuan, Shenghai and Xie, Lihua},
booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
pages={9463--9470},
year={2024},
organization={IEEE}
}