This repo has the code for the paper "How far can we go with ImageNet for Text-to-Image generation?"
The core idea: text-to-image generation models typically rely on vast, loosely curated datasets, prioritizing quantity over quality. We propose a different approach that leverages strategic data augmentation of a small, well-curated dataset to enhance model performance. We show that this method improves the quality of the generated images on several benchmarks.
Paper on arXiv: coming soon
Project website: coming soon
To install, first create a virtual environment with Python 3.9 or later and run

```shell
pip install -e .
```
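The environment setup above can be sketched as follows; the environment name `.venv` is just a common convention, not something the repo mandates:

```shell
# Create a virtual environment (any name works; .venv is a common choice)
python3 -m venv .venv
# Activate it (POSIX shells; on Windows use .venv\Scripts\activate instead)
. .venv/bin/activate
# Confirm the interpreter meets the minimum version (3.9)
python -c 'import sys; assert sys.version_info >= (3, 9)'
```

After activation, run `pip install -e .` from the repository root.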
If you want to use the training pipeline (see training/README.md):

```shell
pip install ".[train]"
```
Depending on your CUDA version, be careful when installing torch: make sure you install a build that matches your CUDA toolkit.
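For example, PyTorch publishes per-CUDA wheel indexes that can be selected with pip's `--index-url` flag. The `cu121` tag below is an assumption for illustration; replace it with the tag matching your installed CUDA toolkit (see pytorch.org for the current list):

```shell
# Install torch from the wheel index matching your CUDA version.
# "cu121" (CUDA 12.1) is only an example tag -- adjust it to your setup.
pip install torch --index-url https://download.pytorch.org/whl/cu121
```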
See data_augmentations/README.md for details on the data augmentations.
If you use this repo in your experiments, please acknowledge us by citing the following paper: