This repository provides a Streamlit application for training your own Stable Diffusion XL (SDXL) models. The interface allows you to configure training parameters, manage datasets, generate images, and upload datasets to the Hugging Face Hub.
- Interactive Streamlit UI for configuring model training parameters.
- Support for custom datasets with image-caption pairs.
- Image generation at specified intervals during training.
- Capability to generate images from prompts using the trained model.
- Upload datasets to the Hugging Face Hub.
- Python 3.8 or higher.
- CUDA-enabled GPU for training and inference.
- Streamlit
- PyTorch
- Hugging Face Transformers
- Datasets
- Hugging Face Hub
- Diffusers library (must be installed separately; see Installation)
- A Hugging Face account and access token
- Clone the repository:

  ```bash
  git clone https://github.com/birdhouses/SDXL-Trainer.git
  cd SDXL-Trainer
  ```
- Install the required Python packages:

  ```bash
  pip install streamlit torch transformers datasets huggingface_hub
  ```
- Install the `diffusers` library separately:

  The `diffusers` library is a core component for running diffusion models like SDXL. It is not included as a dependency in this repository and must be installed separately.

  ```bash
  pip install diffusers
  ```

  Alternatively, install it from source for the latest features:

  ```bash
  git clone https://github.com/huggingface/diffusers.git
  cd diffusers
  pip install -e .
  ```

  For detailed installation instructions and troubleshooting, refer to the Diffusers GitHub Repository.
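To confirm the installation, you can check that the library imports and report its version; this is just a quick, optional sanity check:

```python
# Optional sanity check: confirm that diffusers is importable and show its version.
import diffusers

print(diffusers.__version__)
```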
- Ensure CUDA is properly configured:
  - Verify that your system has a CUDA-enabled GPU.
  - Install the appropriate NVIDIA drivers and CUDA toolkit compatible with PyTorch.
  - For guidance, refer to the PyTorch CUDA Installation Guide.
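Before starting a run, it can help to confirm that PyTorch actually sees your GPU. A minimal check using standard PyTorch calls:

```python
# Check that PyTorch was built with CUDA support and can see a GPU.
import torch

if torch.cuda.is_available():
    print("CUDA is available:", torch.cuda.get_device_name(0))
else:
    print("CUDA is not available; training and inference will not work as intended.")
```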
- Sign up for a Hugging Face account at https://huggingface.co.
- Navigate to your access tokens page.
- Create a new token with the necessary permissions (write access) and save it securely.
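The app will ask for this token, but you can also authenticate ahead of time with the `huggingface_hub` client. A minimal sketch; the token value is a placeholder for your own:

```python
# Authenticate with the Hugging Face Hub so downloads and uploads are authorized.
from huggingface_hub import login

login(token="hf_...")  # placeholder: use your own access token
```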
- Dataset Structure:
  Organize your dataset in a directory where each image has an accompanying `.txt` file containing its caption (a quick consistency check is sketched after this list). For example:

  ```
  dataset/
  ├── image1.jpg
  ├── image1.txt
  ├── image2.jpg
  ├── image2.txt
  └── ...
  ```
- Automatic Annotation:
  If you need to automatically generate annotations (the `.txt` files), use the ImageAnnotator tool.
- Image Scraping:
  To scrape images for your dataset, use the image_scraper tool.
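Before training, it can help to verify that every image has a matching caption file. This is an illustrative sketch, not part of the repository; it assumes the layout shown above and a hypothetical `dataset/` path:

```python
# Verify that each image in the dataset directory has a matching .txt caption file.
from pathlib import Path

dataset_dir = Path("dataset")  # adjust to your dataset path
image_exts = {".jpg", ".jpeg", ".png", ".webp"}

missing = [
    img.name
    for img in sorted(dataset_dir.iterdir())
    if img.suffix.lower() in image_exts and not img.with_suffix(".txt").exists()
]

if missing:
    print(f"{len(missing)} image(s) have no caption file:", missing)
else:
    print("Every image has a caption file.")
```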
- Training Script Path:
  You need to point to the `train_text_to_image_sdxl.py` script from the `diffusers` library. This script is essential for training the SDXL model (an illustrative invocation is sketched after this list).
- Locate the Script:
  If you've installed `diffusers` from source, the training script is typically located at:

  ```
  /path/to/diffusers/examples/text_to_image/train_text_to_image_sdxl.py
  ```
- Modify the Script (Optional):
  Ensure that the training script is compatible with your training parameters and setup. You might need to adjust paths or parameters within the script.
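For reference, `diffusers` example scripts are typically launched through `accelerate`. The sketch below shows roughly what such an invocation looks like; the exact flags depend on the script version and your configuration, and every path and value here is a placeholder:

```python
# Illustrative launch of the diffusers SDXL training script via accelerate.
# All paths and hyperparameters are placeholders; SDXL-Trainer builds its own
# command from the values you enter in the interface.
import subprocess

cmd = [
    "accelerate", "launch",
    "/path/to/diffusers/examples/text_to_image/train_text_to_image_sdxl.py",
    "--pretrained_model_name_or_path", "stabilityai/stable-diffusion-xl-base-1.0",
    "--pretrained_vae_model_name_or_path", "madebyollin/sdxl-vae-fp16-fix",
    "--train_data_dir", "dataset",
    "--resolution", "1024",
    "--train_batch_size", "1",
    "--learning_rate", "1e-6",
    "--max_train_steps", "1000",
    "--validation_prompt", "a photo of a cat",
    "--output_dir", "output",
]
subprocess.run(cmd, check=True)
```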
Start the Streamlit app from the terminal:

```bash
streamlit run app.py
```
In the Streamlit interface:
- Model Path: Path to your SDXL model (e.g., `stabilityai/stable-diffusion-xl-base-1.0`).
- Dataset Path: Path to your dataset directory.
- VAE Model Path: Path to your VAE model (e.g., `madebyollin/sdxl-vae-fp16-fix`).
- Output Directory: Path where outputs and checkpoints will be saved.
- Training Script Path: Path to your `train_text_to_image_sdxl.py` script from the `diffusers` library.
- Validation Prompt: Prompt used for generating validation images during training.
- Training Parameters: Set resolution, batch size, learning rate, etc.
- Hugging Face Token: Enter your Hugging Face access token when prompted.
- Click Create Dataset and Train Model to begin training.
- The application will:
  - Load your dataset and create a `metadata.csv` file if it doesn't exist (see the sketch below).
  - Initialize the diffusion pipeline using your specified model.
  - Start the training process by invoking the training script.
  - Generate images at specified intervals during training.
- Monitor training progress and generated images in the interface.
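For reference, `metadata.csv` pairs each image file name with its caption. Below is a minimal sketch of how such a file could be built from the image/`.txt` pairs; it is illustrative only (the app performs this step itself), and the column names follow the common Hugging Face `imagefolder` convention:

```python
# Build a metadata.csv (file_name,text) from image/.txt caption pairs.
import csv
from pathlib import Path

dataset_dir = Path("dataset")  # adjust to your dataset path
image_exts = {".jpg", ".jpeg", ".png", ".webp"}

rows = []
for img in sorted(dataset_dir.iterdir()):
    if img.suffix.lower() in image_exts:
        caption = img.with_suffix(".txt").read_text(encoding="utf-8").strip()
        rows.append({"file_name": img.name, "text": caption})

with open(dataset_dir / "metadata.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["file_name", "text"])
    writer.writeheader()
    writer.writerows(rows)
```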
- Use the Generate Image section to generate images from prompts using the trained model.
- Enter a prompt and click Generate Image.
- The generated image will be displayed and saved in the output directory.
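Outside the interface, the same kind of prompt-based generation can be done directly with the `diffusers` pipeline API. A minimal sketch; the model path and prompt are placeholders:

```python
# Generate an image from a prompt with a trained (or base) SDXL checkpoint.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "output",  # placeholder: path to your trained model or a Hub model id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(prompt="a photo of a cat wearing a spacesuit").images[0]
image.save("generated.png")
```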
- Provide the path to your dataset and the desired dataset name on Hugging Face Hub.
- Enter your Hugging Face access token.
- Click Upload Dataset To HuggingFace to upload your dataset.
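For reference, a dataset laid out as described above (with a `metadata.csv`) can also be pushed to the Hub with the `datasets` library. A minimal sketch; the repository id, path, and token are placeholders:

```python
# Load the local image/caption dataset and push it to the Hugging Face Hub.
from datasets import load_dataset

dataset = load_dataset("imagefolder", data_dir="dataset")  # placeholder path
dataset.push_to_hub("your-username/your-dataset-name", token="hf_...")  # placeholders
```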
- Diffusers Documentation: Diffusers GitHub Repository
- Automatic Annotation: ImageAnnotator
- Image Scraping: image_scraper
- Diffusers Library:
  The `diffusers` library is not included as a dependency in this repository. You must install it separately as per the instructions above. This library provides the necessary tools and scripts for diffusion models like SDXL.
- Training Script Path:
  Ensure that the `script_path` provided in the application points to the `train_text_to_image_sdxl.py` script from the `diffusers` library. This script is essential for training and must be correctly referenced.
- Hardware Requirements:
  Training diffusion models is resource-intensive. Ensure your hardware meets the requirements (a powerful GPU with sufficient VRAM).
- Paths and Permissions:
  Make sure all paths provided in the interface are correct and accessible, and you have the necessary read/write permissions.
- CUDA Errors:
  - Update your GPU drivers and verify CUDA compatibility with PyTorch.
  - Ensure that PyTorch is installed with CUDA support.
- Missing Dependencies:
  - Install missing packages using `pip install -r requirements.txt`.
- Authentication Errors:
  - Confirm your Hugging Face token is correct and has the necessary permissions.
  - Re-enter your token if authentication fails.
- Diffusers Installation Issues:
  - If you encounter issues with `diffusers`, refer to their GitHub Repository for installation guides and troubleshooting.
- Streamlit Interface:
  Provides an interactive UI for configuring training parameters and managing datasets.
- Dataset Preparation:
  The application creates a `metadata.csv` file pairing images with their captions if it doesn't exist.
- Training Process:
  Utilizes the `train_text_to_image_sdxl.py` script from the `diffusers` library to train the model based on your configurations.
- Image Generation:
  Generates images at specified intervals during training and allows for prompt-based image generation using the trained model.
- Uploading to Hugging Face Hub:
  Allows you to upload your prepared dataset to the Hugging Face Hub for sharing or further training.
Contributions are welcome! Please open an issue or submit a pull request.