
Diffusion Models Overview


Model-Specific Technical Information for Diffusion Models in OneTrainer

This wiki page provides detailed technical information about the diffusion models supported by OneTrainer, specifically SD1.5, SDXL, and Flux. The information is aimed at advanced users.

SD1.5

Base Architecture

SD1.5 uses a UNet architecture with an encoder-decoder structure, based on a hierarchy of denoising autoencoders.
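
As a quick orientation, the block layout can be inspected with Hugging Face diffusers. This is a minimal sketch, not OneTrainer code; the repository id is only an example and may differ from the checkpoint you actually train on.

from diffusers import UNet2DConditionModel

# Load only the UNet of an SD1.5-style checkpoint (example repo id).
unet = UNet2DConditionModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet"
)

# Encoder (down), bottleneck (mid) and decoder (up) block types of the UNet.
print(unet.config.down_block_types)
print(unet.config.mid_block_type)
print(unet.config.up_block_types)
print(unet.config.cross_attention_dim)  # 768: hidden size of the CLIP ViT-L text encoder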

Training Resolution

The final training resolution was effectively 512x512.

Tokenization and Max Tokens

  • Uses the CLIP tokenizer
  • Max tokens per caption in OneTrainer: 75 (CLIP's fixed context length of 77 minus the start and end tokens; see the sketch below)
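
A minimal sketch of where the 75-token limit comes from: CLIP's context length is fixed at 77 tokens, two of which are taken by the start and end tokens. This uses the transformers library directly, not OneTrainer; the model id is an assumption.

from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
print(tokenizer.model_max_length)  # 77, including <|startoftext|> and <|endoftext|>

# Captions longer than the budget are truncated when padded to the fixed length.
out = tokenizer(
    "a long caption " * 40,
    padding="max_length",
    truncation=True,
    max_length=tokenizer.model_max_length,
)
print(len(out["input_ids"]))  # 77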

LoRA Full Set of Blocks / Layer Keys

A working example of a custom layer set for SD1.5 LoRA training is:

down_blocks.1.attentions.0,down_blocks.1.attentions.1,down_blocks.2.attentions.0,down_blocks.2.attentions.1,mid_block.attentions.0
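
These keys are prefixes of module names in the diffusers UNet. A minimal sketch (standalone, with the same example repo id as above) to check which attention modules such a custom layer set would cover:

from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet"
)

layer_set = [
    "down_blocks.1.attentions.0",
    "down_blocks.1.attentions.1",
    "down_blocks.2.attentions.0",
    "down_blocks.2.attentions.1",
    "mid_block.attentions.0",
]

# Collect UNet submodules whose names start with one of the chosen prefixes;
# these are the blocks a LoRA restricted to this layer set would attach to.
matched = [
    name for name, _ in unet.named_modules()
    if any(name.startswith(prefix) for prefix in layer_set)
]
print(len(matched), matched[:3])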

The complete set of blocks for SD1.5 can be referenced here or here.

VAE Compression

  • Compression factor: 8x8 (8 times per spatial dimension; see the sketch after this list)
  • VAE trained on 256px x 256px resolution
  • Number of channels: 4
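
A minimal sketch (same example repo id as above, diffusers rather than OneTrainer) showing the 8x spatial compression and the 4 latent channels:

import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="vae"
)

image = torch.randn(1, 3, 512, 512)  # stand-in for a normalized RGB image
with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample()
print(latents.shape)  # torch.Size([1, 4, 64, 64]): 512 / 8 = 64 per spatial dimension, 4 channels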

Paper: https://arxiv.org/pdf/2112.10752

Stable Diffusion XL (SDXL)

Base Architecture

SDXL uses an enhanced UNet architecture that is significantly larger than SD1.5's (roughly 2.6B UNet parameters versus about 0.86B), conditioned on two text encoders.

Training Resolution

SDXL is trained at higher resolutions, effectively 1024x1024.

Tokenization and Max Tokens

  • Uses two CLIP text encoders (CLIP ViT-L & OpenCLIP ViT-bigG)
  • Max tokens per caption in OneTrainer: 75, as with SD1.5 (see the sketch below)
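
A minimal sketch (the repository id is an assumption) showing that an SDXL checkpoint ships two tokenizers, one per text encoder, both with the same 77-token context:

from transformers import CLIPTokenizer

# tokenizer feeds CLIP ViT-L, tokenizer_2 feeds OpenCLIP ViT-bigG.
tok_1 = CLIPTokenizer.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="tokenizer"
)
tok_2 = CLIPTokenizer.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="tokenizer_2"
)
print(tok_1.model_max_length, tok_2.model_max_length)  # 77 77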

VAE Compression

  • Compression factor: 8x8 (8 times per dimension)
  • VAE trained on 256px x 256px resolution
  • Uses the same VAE architecture as SD1.5, but retrained with a larger batch size and with EMA weight averaging enabled

Paper: https://arxiv.org/pdf/2307.01952

FLUX

Placeholder. There is little public technical information and no published paper for Flux.

Training Resolution

Unknown; at least the same as or higher than SDXL.

Tokenization and Max Tokens

  • Same as SDXL: 75 tokens max in OneTrainer; anything longer is truncated.
  • Text encoders: CLIP ViT-L/14 and T5 (v1.1-XXL); see the sketch below
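
A minimal sketch of the two Flux tokenizers, again via transformers rather than OneTrainer. The repository id and its subfolder layout are assumptions (the FLUX.1 weights are gated and require accepting the license on Hugging Face):

from transformers import CLIPTokenizer, T5TokenizerFast

tok_clip = CLIPTokenizer.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="tokenizer"
)
tok_t5 = T5TokenizerFast.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="tokenizer_2"
)
print(tok_clip.model_max_length)  # 77 -> 75 usable caption tokens
print(tok_t5.model_max_length)    # considerably longer; OneTrainer still caps captions at 75 tokens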

LoRA Full Set of Blocks / Layer Keys

Flux uses the following LoRA layers:

[
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2",
    "down_blocks.0.attentions.1.transformer_blocks.0.attn1",
    "down_blocks.0.attentions.1.transformer_blocks.0.attn2",
    "down_blocks.1.attentions.0.transformer_blocks.0.attn1",
    "down_blocks.1.attentions.0.transformer_blocks.0.attn2",
    "down_blocks.1.attentions.1.transformer_blocks.0.attn1",
    "down_blocks.1.attentions.1.transformer_blocks.0.attn2",
    "down_blocks.2.attentions.0.transformer_blocks.0.attn1",
    "down_blocks.2.attentions.0.transformer_blocks.0.attn2",
    "down_blocks.2.attentions.1.transformer_blocks.0.attn1",
    "down_blocks.2.attentions.1.transformer_blocks.0.attn2",
    "up_blocks.1.attentions.0.transformer_blocks.0.attn1",
    "up_blocks.1.attentions.0.transformer_blocks.0.attn2",
    "up_blocks.1.attentions.1.transformer_blocks.0.attn1",
    "up_blocks.1.attentions.1.transformer_blocks.0.attn2",
    "up_blocks.1.attentions.2.transformer_blocks.0.attn1",
    "up_blocks.1.attentions.2.transformer_blocks.0.attn2",
    "up_blocks.2.attentions.0.transformer_blocks.0.attn1",
    "up_blocks.2.attentions.0.transformer_blocks.0.attn2",
    "up_blocks.2.attentions.1.transformer_blocks.0.attn1",
    "up_blocks.2.attentions.1.transformer_blocks.0.attn2",
    "up_blocks.2.attentions.2.transformer_blocks.0.attn1",
    "up_blocks.2.attentions.2.transformer_blocks.0.attn2",
    "up_blocks.3.attentions.0.transformer_blocks.0.attn1",
    "up_blocks.3.attentions.0.transformer_blocks.0.attn2",
    "up_blocks.3.attentions.1.transformer_blocks.0.attn1",
    "up_blocks.3.attentions.1.transformer_blocks.0.attn2",
    "up_blocks.3.attentions.2.transformer_blocks.0.attn1",
    "up_blocks.3.attentions.2.transformer_blocks.0.attn2",
    "mid_block.attentions.0.transformer_blocks.0.attn1",
    "mid_block.attentions.0.transformer_blocks.0.attn2"
]

The full set of layers can be seen here.
