Code for the paper "IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models"
TLDR: IQA-Adapter is a tool that combines Image Quality/Aesthetics Assessment (IQA/IAA) models with image-generation models and enables quality-aware generation with diffusion-based models. It allows conditioning image generators on target quality/aesthetics scores.
IQA-Adapter builds upon the IP-Adapter architecture.
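To give an intuition for the adapter idea, here is an illustrative NumPy sketch (not the repository's actual code): in the spirit of IP-Adapter, a target quality score is projected into a few extra "quality tokens" that the UNet's cross-attention layers can attend to alongside the text tokens. All dimensions and names below are hypothetical, and the real adapter attends to these tokens via decoupled cross-attention rather than simple concatenation.

```python
import numpy as np

rng = np.random.default_rng(0)

score_dim = 1        # scalar quality target (2 for quality + aesthetics)
num_tokens = 4       # number of pseudo-tokens the adapter produces
hidden_dim = 2048    # cross-attention context width (illustrative)

# A learned linear projection would map the score to num_tokens embeddings;
# here the weights are just random placeholders.
W = rng.normal(size=(score_dim, num_tokens * hidden_dim)) * 0.02

def quality_tokens(score: float) -> np.ndarray:
    """Map a normalized target quality score to adapter tokens."""
    s = np.array([score], dtype=np.float64)
    return (s @ W).reshape(num_tokens, hidden_dim)

text_context = rng.normal(size=(77, hidden_dim))  # stand-in text embeddings
context = np.concatenate([text_context, quality_tokens(0.9)], axis=0)
print(context.shape)  # (81, 2048)
```

Raising or lowering the target score passed to the projection shifts the conditioning signal, which is what enables steering generation toward higher- or lower-quality outputs.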
TODO list:
- Release code for IQA-Adapter inference and training for SDXL base model (in progress)
- Release weights for IQA-Adapters trained with different IQA/IAA models (in progress)
- Create project page
- Release code for experiments
Demonstration of guidance on quality (y-axis) and aesthetics (x-axis) scores:
First, clone this repository:
```shell
git clone https://github.com/X1716/IQA-Adapter.git
```
Next, create a virtual environment, e.g., with Anaconda:
```shell
conda create --name iqa_adapter python=3.12.2
conda activate iqa_adapter
```
Install PyTorch suitable for your CUDA/ROCm/MPS device:
```shell
# for CUDA 12.1
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
```
Newer Python and PyTorch versions should also work.
Install the other requirements for this project:
```shell
pip install -r requirements.txt
```
To test a pretrained IQA-Adapter, check out the `demo_adapter.ipynb` Jupyter notebook. The weights for the IQA-Adapter can be downloaded from here (Google Drive).
`train_iqa_adapter.py` can be used to train or fine-tune an IQA-Adapter. We trained it on a SLURM cluster with `slurm_train_script.sh`; a training job can be dispatched with:
```shell
sbatch slurm_train_script.sh
```
Note that this script must be adapted to your particular cluster setup (e.g., paths to input/output directories and the Pyxis container must be specified). As provided, it is configured for distributed training with 5 nodes and 8 GPUs per node.
If you find this work useful for your research, please cite us as follows:
```bibtex
@misc{iqaadapter,
      title={IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models},
      author={Khaled Abud and Sergey Lavrushkin and Alexey Kirillov and Dmitriy Vatolin},
      year={2024},
      eprint={2412.01794},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.01794},
}
```