3DIS: DEPTH-DRIVEN DECOUPLED INSTANCE SYNTHESIS FOR TEXT-TO-IMAGE GENERATION

[Project Page] [3DIS Paper] [3DIS-FLUX Paper] [Huggingface Page]

🔥🔥🔥 News

2025-01-22: Our paper 3DIS has been accepted to ICLR 2025.
2025-01-27: We have released the code for rendering with the SD1.x model to meet the needs of more researchers.

To Do List

Code
Pretrained weights
3DIS GUI
More Demos

Installation

Conda environment setup

conda create -n 3DIS python=3.10 -y
conda activate 3DIS
pip install -r requirement.txt
pip install -e .
cd segment-anything-2
pip install -e . --no-deps
cd ..

Checkpoints 🚀

Step 1: Download the checkpoint of the fine-tuned Text-to-Depth model, unet_0901.ckpt.

Step 2: Download the checkpoint of the trained Layout-to-Depth Adapter, layout_adapter.ckpt.

You can also get our pretrained weights from Huggingface🤗.

Step 3: Download the SAM2 checkpoint, sam2_hiera_large.pt.

Step 4: Put all three files under the 'pretrained_weights' folder.

├── pretrained_weights
│   ├── unet_0901.ckpt
│   ├── layout_adapter.ckpt
│   ├── sam2_hiera_large.pt
├── threeDIS
│   ├── ...
├── scripts
│   ├── ...
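Before running any demo, it can help to confirm that the three files are where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoints` is hypothetical, and it assumes you run it from the repository root so that `pretrained_weights` resolves correctly.

```python
from pathlib import Path

# File names taken from the checkpoint steps above.
EXPECTED_CHECKPOINTS = (
    "unet_0901.ckpt",
    "layout_adapter.ckpt",
    "sam2_hiera_large.pt",
)


def missing_checkpoints(weights_dir="pretrained_weights"):
    """Return the expected checkpoint names that are not files in weights_dir."""
    root = Path(weights_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All expected checkpoints found.")
```

The check only looks at file names; it does not verify that a download completed or that a file is a valid checkpoint.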

Layout-to-Depth Generation 🎨

Single Image Example

You can quickly run inference for layout-to-depth generation with either of the following commands:

python scripts/inference_layout2depth_demo0.py

python scripts/inference_layout2depth_demo1.py

Rendering Generated Scene with Various Models 🌈

Rendering with FLUX ✨

You can quickly run inference for FLUX rendering with any of the following commands:

python scripts/inference_flux_rendering_sam_demo0.py  --width=768 --height=1024 --i2i=4 --use_sam_enhance

python scripts/inference_flux_rendering_sam_demo1.py  --use_sam_enhance --res=512 --i2i=4

python scripts/inference_flux_rendering_sam_demo2.py  --use_sam_enhance --res=768 --i2i=3
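If you want to run several of these demos in one go, the invocations above can be written as data and turned into argument lists. This is a sketch only: `build_demo_command` and `FLUX_DEMOS` are hypothetical names, not part of 3DIS, and the flags are copied verbatim from the commands above without assuming anything about their meaning.

```python
import shlex
import sys


def build_demo_command(script, **options):
    """Build an argv list for a demo script.

    Options set to True become bare flags (e.g. --use_sam_enhance),
    options set to False or None are skipped, and everything else
    becomes --name=value.
    """
    argv = [sys.executable, script]
    for name, value in options.items():
        if value is True:
            argv.append(f"--{name}")
        elif value is False or value is None:
            continue
        else:
            argv.append(f"--{name}={value}")
    return argv


# The three FLUX demo invocations listed above, expressed as data.
FLUX_DEMOS = [
    ("scripts/inference_flux_rendering_sam_demo0.py",
     dict(width=768, height=1024, i2i=4, use_sam_enhance=True)),
    ("scripts/inference_flux_rendering_sam_demo1.py",
     dict(use_sam_enhance=True, res=512, i2i=4)),
    ("scripts/inference_flux_rendering_sam_demo2.py",
     dict(use_sam_enhance=True, res=768, i2i=3)),
]


if __name__ == "__main__":
    for script, options in FLUX_DEMOS:
        # Print the command instead of running it.
        print(shlex.join(build_demo_command(script, **options)))
```

The script only prints the commands. To execute one, pass the list to `subprocess.run(argv, check=True)` from the repository root.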

Rendering with SD1.x 🖼️

You can quickly run inference for SD1.x rendering using the following command:

python scripts/inference_sd1_rendering_sam_demo0.py  --control_CN  --fft

Because the base SD1.x model has limited generation capability, you can also try other base models from Civitai, such as CetusMix or RV60B1, to get better results.

python scripts/inference_sd1_rendering_sam_demo1.py  --control_CN  --fft

More demos are coming soon!

End-to-end Layout-to-Image Generation 📐

You can quickly run inference for end-to-end layout-to-image generation using the following command:

python scripts/inference_layout2image_demo0.py --use_sam_enhance

Rendering Real Scene Depth Maps 📚

You can also apply our method to render the scene depth map extracted from a real-world image:

python scripts/inference_flux_rendering_sam_demo3.py  --height=512 --width=768 --i2i=4 --use_sam_enhance

python scripts/inference_flux_rendering_sam_demo5.py  --height=768 --width=640 --i2i=2

Rendering with LoRA 📚

Rendering with the Miku LoRA:

python scripts/inference_flux_rendering_sam_demo4.py  --height=1024 --width=768 --i2i=2

Create with 3DIS GUI ⭐️

Use the following commands to create a scene depth map with the 3DIS GUI:

cd 3dis_gui
python layout2depth_app.py --port=3421

Use the following commands to render the scene depth map with the 3DIS GUI using FLUX:

cd 3dis_gui
python flux_rendering_app.py --port=3477
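Both apps take a `--port` argument, so a launch will presumably fail if that port is already taken. A quick pre-check is sketched below, under the assumption that the apps listen on the local machine (the host is not stated here) and with `port_is_free` as a hypothetical helper that is not part of 3DIS:

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, most commonly because the port is in use.
            return False
    return True


if __name__ == "__main__":
    for port in (3421, 3477):  # ports used in the commands above
        print(port, "free" if port_is_free(port) else "in use")
```

If a port is reported as in use, pass a different value to `--port` when launching the app.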

Citation

If you find this repository useful, please cite it using the following BibTeX entries.

@article{zhou20243dis,
  title={3DIS: Depth-driven decoupled instance synthesis for text-to-image generation},
  author={Zhou, Dewei and Xie, Ji and Yang, Zongxin and Yang, Yi},
  journal={arXiv preprint arXiv:2410.12669},
  year={2024}
}

@article{zhou20253disflux,
  title={3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering},
  author={Zhou, Dewei and Xie, Ji and Yang, Zongxin and Yang, Yi},
  journal={arXiv preprint arXiv:2501.05131},
  year={2025}
}
