Authors: Francesco Pezone, Sergio Barbarossa, Giuseppe Caire
SQ-GAN (Semantic Masked VQ-GAN) is a novel approach that leverages generative models for optimizing image compression in task-oriented communications. Our model selectively encodes semantically significant features using:
- Semantic segmentation for identifying key regions.
- Semantic-Conditioned Adaptive Mask Module (SAMM) for adaptive feature encoding.
- A semantic-based compression approach that outperforms JPEG2000 and BPG, particularly at extremely low bit rates.
Create and activate the conda environment:
conda env create -f environment.yml
conda activate sqgan_env
SQ-GAN is trained on the Cityscapes dataset. To prepare the dataset:
- Follow the dataset setup instructions from SPADE.
- Update `DefaultDataPath.Cityscapes.root` in `data/default.py` with the correct dataset path.
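Before pointing `DefaultDataPath.Cityscapes.root` at a folder, it can help to confirm that the folder looks like a Cityscapes download. The sketch below is not part of the repository: the helper name `missing_cityscapes_dirs` is hypothetical, and it assumes the standard Cityscapes layout with `leftImg8bit` and `gtFine` at the top level, which is what the SPADE setup uses.

```python
from pathlib import Path

# Top-level folders of a standard Cityscapes download (assumed layout;
# adjust the tuple if your copy is organised differently).
EXPECTED_SUBDIRS = ("leftImg8bit", "gtFine")


def missing_cityscapes_dirs(root):
    """Return the expected sub-folders that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty list from `missing_cityscapes_dirs("/path/to/cityscapes")` suggests the path is a reasonable value for the dataset root.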
SQ-GAN follows a 3-step training approach: the `ssm` and `img` subnetworks are trained individually (the first two commands below), and the full model is then trained jointly with `--mode all`.
python3 train.py --mode ssm \
--gpus -1 \
--base config/sqgan_cityscapes.yml \
--max_epochs 150
python3 train.py --mode img \
--gpus -1 \
--base config/sqgan_cityscapes.yml \
--max_epochs 150
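Before the final joint training, each trained checkpoint must be split into its components so that the different parts can be loaded together without issues. Resuming from a checkpoint with `--split_path` writes the parts to `Final_ckpt_parts/`: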
python3 train.py --mode ssm \
--gpus -1 \
--base config/sqgan_cityscapes.yml \
--max_epochs 150 \
--resume_from_checkpoint /path/to/G_s_checkpoint.ckpt \
--split_path Final_ckpt_parts/
python3 train.py --mode img \
--gpus -1 \
--base config/sqgan_cityscapes.yml \
--max_epochs 150 \
--resume_from_checkpoint /path/to/G_x_checkpoint.ckpt \
--split_path Final_ckpt_parts/
Then run the final joint training from the split parts:
python3 train.py --mode all \
--gpus -1 \
--base config/sqgan_cityscapes.yml \
--max_epochs 150 \
--ckpt_path_parts Final_ckpt_parts/
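The five commands above have to run in a fixed order, so it may be convenient to generate them from one place. The following is a minimal sketch, not a script shipped with the repository: the helper names `train_cmd` and `build_pipeline` are hypothetical, and only the flags shown above are used. As written it only prints the commands.

```python
import shlex

CONFIG = "config/sqgan_cityscapes.yml"
PARTS_DIR = "Final_ckpt_parts/"


def train_cmd(mode, extra=()):
    """Build one train.py call using only the flags shown above."""
    return ["python3", "train.py", "--mode", mode, "--gpus", "-1",
            "--base", CONFIG, "--max_epochs", "150", *extra]


def build_pipeline(g_s_ckpt, g_x_ckpt):
    """Return the five commands in the order they should be run."""
    return [
        # Steps 1 and 2: train each subnetwork on its own.
        train_cmd("ssm"),
        train_cmd("img"),
        # Split both checkpoints into parts for the joint stage.
        train_cmd("ssm", ["--resume_from_checkpoint", g_s_ckpt,
                          "--split_path", PARTS_DIR]),
        train_cmd("img", ["--resume_from_checkpoint", g_x_ckpt,
                          "--split_path", PARTS_DIR]),
        # Step 3: joint training from the split parts.
        train_cmd("all", ["--ckpt_path_parts", PARTS_DIR]),
    ]


# Print the commands for inspection; nothing is executed here.
for cmd in build_pipeline("/path/to/G_s_checkpoint.ckpt",
                          "/path/to/G_x_checkpoint.ckpt"):
    print(shlex.join(cmd))
```

To actually launch a stage, each list can be passed to `subprocess.run(cmd, check=True)`, which stops the sequence if a stage fails.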
Pre-trained models for the Cityscapes dataset are available for download:
Dataset | Checkpoint Link |
---|---|
Cityscapes | 📥 Download |
- Batch size: in `config/sqgan_cityscapes.yml`, set the dataset batch size to 1.
- Semantic Segmentation Model: SQ-GAN uses InternImage for generating segmentation maps.
- Modify Paths: in `sample.py`, update `internimage_path` with the correct InternImage path. If using another segmentation model, modify lines 185-186 accordingly.
Option 1: Evaluate from split submodels
python3 sample.py --mode all \
--base config/sqgan_cityscapes.yml \
--ckpt_path_parts Final_ckpt_parts/ \
--gpus -1
Option 2: Evaluate from a single checkpoint
python3 sample.py --mode all \
--base config/sqgan_cityscapes.yml \
--resume_from_checkpoint /path/to/merged_checkpoint.ckpt \
--gpus -1
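The two options differ only in where the weights come from, so exactly one of `--ckpt_path_parts` and `--resume_from_checkpoint` is passed. A small sketch of that choice follows; the helper name `sample_cmd` is hypothetical and is not part of the repository.

```python
def sample_cmd(ckpt_path_parts=None, resume_from_checkpoint=None):
    """Build a sample.py call for Option 1 (split parts) or Option 2 (single checkpoint)."""
    # Exactly one weight source must be given.
    if (ckpt_path_parts is None) == (resume_from_checkpoint is None):
        raise ValueError(
            "pass exactly one of ckpt_path_parts or resume_from_checkpoint")
    cmd = ["python3", "sample.py", "--mode", "all",
           "--base", "config/sqgan_cityscapes.yml"]
    if ckpt_path_parts is not None:
        cmd += ["--ckpt_path_parts", ckpt_path_parts]
    else:
        cmd += ["--resume_from_checkpoint", resume_from_checkpoint]
    return cmd + ["--gpus", "-1"]
```

For example, `sample_cmd(ckpt_path_parts="Final_ckpt_parts/")` reproduces Option 1 as an argument list that can be passed to `subprocess.run`.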
Results are stored in the `Result/` folder. Subfolders are created for each combination of masking fractions, and the performance metrics are saved in `metrics.csv`.
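To compare runs, the metrics files can be gathered into one structure. The sketch below assumes that each `metrics.csv` has a header row and sits somewhere under `Result/`; the column names are not documented here, so rows are kept as generic dictionaries. The helper name `collect_metrics` is hypothetical.

```python
import csv
from pathlib import Path


def collect_metrics(result_dir="Result"):
    """Map each folder that holds a metrics.csv to its rows (as dicts)."""
    collected = {}
    # Search recursively, since the exact subfolder depth is an assumption.
    for csv_path in sorted(Path(result_dir).rglob("metrics.csv")):
        with csv_path.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
        key = csv_path.parent.relative_to(result_dir).as_posix()
        collected[key] = rows
    return collected
```

The returned keys are folder paths relative to `Result/`, so each masking-fraction combination can be looked up by its subfolder name.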
The code is based on MQ-VAE.
If you find this work useful, please cite our paper:
@article{SQ-GAN,
title={SQ-GAN: Semantic Image Communication Using Masked Vector Quantization},
author={Francesco Pezone and Sergio Barbarossa and Giuseppe Caire},
journal={arXiv preprint arXiv:2502.09520},
year={2025}
}