Here, we propose ConsDreamer, an innovative framework designed to address the Janus problem in text-to-3D generation by introducing: 1) a View Disentanglement Module and 2) a novel similarity-based partial order loss.
Recent advances in zero-shot text-to-3D generation have revolutionized 3D content creation by enabling direct synthesis from textual descriptions. While state-of-the-art (SOTA) methods leverage 3D Gaussian Splatting with score distillation to enhance multi-view rendering through pre-trained text-to-image (T2I) models, they suffer from inherent view biases in T2I priors that lead to inconsistent 3D generation, particularly manifesting as the multi-face Janus problem, where objects exhibit conflicting features across views. To address this fundamental challenge, we propose ConsDreamer, a novel framework that mitigates view bias by refining both the conditional and unconditional terms in the score distillation process:
- View Disentanglement Module (VDM): Eliminates viewpoint biases in conditional prompts by decoupling irrelevant view components and injecting precise camera parameters
- Similarity-based partial order loss: Enforces geometric consistency in the unconditional term by aligning cosine similarities with azimuth relationships
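To make the idea of aligning cosine similarities with azimuth relationships more concrete, here is a minimal sketch in plain Python. It is not the loss used in ConsDreamer: the function names, the choice of a single reference view, the hinge form, and the `margin` parameter are all assumptions made for illustration. The sketch only encodes the ordering intuition that a view closer in azimuth to a reference view should be at least as similar to it as a view farther away.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two azimuths, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def partial_order_loss(features, azimuths_deg, ref_index=0, margin=0.0):
    """Hypothetical hinge penalty on violated similarity orderings.

    For every pair of non-reference views (i, j) where i is closer in azimuth
    to the reference than j, penalize the case where j is more similar to the
    reference than i. Returns the mean penalty over all such pairs.
    """
    ref_feat = features[ref_index]
    ref_az = azimuths_deg[ref_index]
    others = [k for k in range(len(features)) if k != ref_index]
    sims = {k: cosine_similarity(ref_feat, features[k]) for k in others}
    dists = {k: angular_distance(ref_az, azimuths_deg[k]) for k in others}

    total, pairs = 0.0, 0
    for i in others:
        for j in others:
            if dists[i] < dists[j]:
                total += max(0.0, sims[j] - sims[i] + margin)
                pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    azimuths = [0.0, 45.0, 90.0, 180.0]
    # Toy features that rotate smoothly with the camera azimuth.
    consistent = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in azimuths]
    # Janus-like case: the back view (180 degrees) repeats the front-view feature.
    janus = consistent[:3] + [consistent[0]]
    print("consistent:", partial_order_loss(consistent, azimuths))
    print("janus-like:", partial_order_loss(janus, azimuths))
```

In this toy setup the smoothly rotating features incur no penalty, while the Janus-like case is penalized because the back view is as similar to the front view as the front view is to itself. How the actual loss is attached to the unconditional term of score distillation is not covered by this sketch.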
Extensive experiments validate that ConsDreamer effectively mitigates the multi-face Janus problem in text-to-3D generation, surpassing existing methods in both quality and consistency.
The implementation of ConsDreamer is mainly based on Python 3.9.16, CUDA 11.7, and PyTorch 2.0.1.
The repository contains submodules, so clone it together with its submodules from either of the following URLs before installing the dependencies:
https://github.com/GAInuist/ConsDreamer.git
or
git@github.com:GAInuist/ConsDreamer.git
conda create -n ConsDreamer python=3.9.16 cudatoolkit=11.8
conda activate ConsDreamer
pip install -r requirements.txt
pip install submodules/diff-gaussian-rasterization/
pip install submodules/simple-knn/
cd CLIP_vit
pip install -e .
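After installation, a quick sanity check of the environment can save a failed training run. The script below is a small sketch and not part of the repository: the helper names `check_python` and `describe_torch` are hypothetical, and the expected versions are simply the ones stated above. It only reports what it finds and does not try to repair anything.

```python
import importlib.util
import sys

# Versions taken from the requirements stated in this README.
EXPECTED_PYTHON = (3, 9)
EXPECTED_TORCH = "2.0.1"


def check_python(version_info=sys.version_info):
    """Return True if the interpreter's major.minor version matches 3.9."""
    return tuple(version_info[:2]) == EXPECTED_PYTHON


def describe_torch():
    """Return PyTorch and CUDA details, or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch": torch.__version__,            # installed PyTorch version string
        "cuda_build": torch.version.cuda,      # CUDA version PyTorch was built with
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print("Python matches 3.9:", check_python())
    info = describe_torch()
    if info is None:
        print("PyTorch is not installed in this environment.")
    else:
        print("PyTorch:", info["torch"], "(expected", EXPECTED_TORCH + ")")
        print("CUDA build:", info["cuda_build"])
        print("CUDA available:", info["cuda_available"])
```

Run it inside the activated ConsDreamer environment. If CUDA is reported as unavailable, the rasterization submodules are unlikely to work.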
python ConsDreamer_train.py --opt <path to config file>
or you can try:
bash Run.sh
Parts of our code are based on several excellent research works and open-source projects:
- EnVision-Research/LucidDreamer
- graphdeco-inria/gaussian-splatting
- graphdeco-inria/diff-gaussian-rasterization
- ashawkey/stable-dreamfusion
- openai/point-e
- openai/CLIP
We thank the authors for their excellent work and great contributions to the 3D generation area.