Paper: https://arxiv.org/pdf/2303.11073.pdf
This is an unofficial implementation and, like the paper, is based on the google/ddpm-ema-celebahq-256 model. Not all details are correct, especially for section 2. All code can be found in h-space-directions.ipynb.
The results below show the changes caused by modifying the semantic latent space of a diffusion model at the bottom (bottleneck) of the UNet, also called the h-space.
The center image is the unmodified generated image, from which the principal components (PCs) are computed. Each row shows the change caused by the corresponding PC when it is added to or subtracted from the h-space activations.
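As a rough illustration of the PC step, here is a minimal NumPy sketch. It assumes the PCs are obtained by PCA over a collection of flattened h-space activations; how the notebook actually gathers them is not shown here. The function names, the random stand-in data, and the `(8, 4, 4)` shape are hypothetical and not taken from h-space-directions.ipynb.

```python
import numpy as np

def hspace_principal_components(h_samples, n_components=3):
    """PCA over flattened activations of shape (num_samples, ...)."""
    flat = h_samples.reshape(h_samples.shape[0], -1)
    mean = flat.mean(axis=0)
    centered = flat - mean
    # Rows of vt are orthonormal directions, sorted by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components], singular_values[:n_components]

def shift_along_component(h, component, scale):
    """Add (scale > 0) or subtract (scale < 0) one PC from a single h tensor."""
    return h + scale * component.reshape(h.shape)

# Random stand-in data; real activations would come from the UNet bottleneck.
rng = np.random.default_rng(0)
h_samples = rng.normal(size=(64, 8, 4, 4))
mean, components, svals = hspace_principal_components(h_samples)

h_plus = shift_along_component(h_samples[0], components[0], 2.0)
h_minus = shift_along_component(h_samples[0], components[0], -2.0)
```

Decoding `h_plus` and `h_minus` instead of the original activation would give the two sides of one row; the scale of 2.0 is arbitrary.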
Here, the changes are generated by searching for the directions that cause the largest change in the output. The first row shows the image from which the directions are computed. The second and third rows show different randomly generated images to which the same direction from the first image is applied.
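One way to read "directions that cause the largest change in the output" is, to first order, as the top right singular vectors of the Jacobian of the output with respect to the h-space. The sketch below shows this on a tiny toy function using a finite-difference Jacobian. It is an assumption about the idea, not the notebook's code: `toy_decoder`, `numerical_jacobian`, and `top_directions` are hypothetical names, and for the real model the Jacobian is far too large to build explicitly, so an iterative method would be needed instead.

```python
import numpy as np

def numerical_jacobian(f, h, eps=1e-5):
    """Central-difference Jacobian of f at h (only feasible for tiny inputs)."""
    out0 = f(h)
    jac = np.empty((out0.size, h.size))
    for i in range(h.size):
        step = np.zeros_like(h)
        step[i] = eps
        jac[:, i] = (f(h + step) - f(h - step)) / (2 * eps)
    return jac

def top_directions(f, h, k=1):
    """Unit directions in h that change f(h) the most, to first order."""
    jac = numerical_jacobian(f, h)
    _, strengths, vt = np.linalg.svd(jac, full_matrices=False)
    return vt[:k], strengths[:k]

# Toy stand-in for "h-space -> image": a fixed linear map (hypothetical).
A = np.array([[3.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0]])

def toy_decoder(h):
    return A @ h

h0 = np.array([0.2, -0.1, 0.4])
directions, strengths = top_directions(toy_decoder, h0, k=2)

# The same direction can then be applied to a different sample.
h_other = np.array([-0.3, 0.5, 0.1])
h_other_edited = h_other + 1.0 * directions[0]
```

The last two lines mirror the second and third rows: a direction found for one image is added unchanged to the h-space of another.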
One can also find meaningful directions in the h-space in a supervised manner by collecting two groups of examples: one where the desired attribute exists and one where it doesn't. The direction is then the difference between the h-space means of the two groups (the mean of the group with the attribute minus the mean of the group without it). The example below is for "smiling".
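The difference-of-means step can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, assuming the h-space activations of both groups have already been collected into arrays; `supervised_direction`, the array shapes, and the edit strength are hypothetical.

```python
import numpy as np

def supervised_direction(h_with, h_without):
    """Mean h of the group with the attribute minus mean h of the group without."""
    return h_with.mean(axis=0) - h_without.mean(axis=0)

# Synthetic stand-in: the "attribute" shifts the first channel by 1.
rng = np.random.default_rng(0)
true_shift = np.zeros((8, 4, 4))
true_shift[0] = 1.0
h_without = rng.normal(size=(200, 8, 4, 4))
h_with = rng.normal(size=(200, 8, 4, 4)) + true_shift

direction = supervised_direction(h_with, h_without)

# Apply the direction to a sample that lacks the attribute.
strength = 1.5  # arbitrary edit strength
h_edited = h_without[0] + strength * direction
```

With real data, `h_with` would hold activations from smiling faces and `h_without` from non-smiling ones; a negative `strength` would push in the opposite direction.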