Finetuning taesd but getting dim and less saturated images #22
Another question: since I currently finetune only the taesd decoder and keep the encoder frozen, should I also train the taesd encoder on my datasets?
Thank you so much for your helpful suggestions and quick reply! I observe the degradation even during training, so I will recheck my training loss (1.a). As for the low-res MSE loss (2.), the taesd decoder seems to be treated as a conditional GAN, i.e. a generative model, so a pixel-wise reconstruction loss may be unnecessary. I will focus on the GAN part and try to improve it. I really appreciate your suggestions on how to locate the problem. Thank you!
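For reference, here is a minimal numpy sketch of what I understand by a low-res MSE auxiliary loss: average-pool both the decoded image and the target before comparing them, so only coarse structure (not high-frequency detail) is penalized. The pooling factor `k=8` is my own assumption, not something from taesd's training code:

```python
import numpy as np

def avg_pool(img, k):
    # img: (H, W, C) float array; average-pool spatially by factor k
    h, w, c = img.shape
    img = img[:h - h % k, :w - w % k]  # crop so H, W divide evenly
    return img.reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))

def lowres_mse(pred, target, k=8):
    # MSE between downsampled prediction and downsampled target
    return float(((avg_pool(pred, k) - avg_pool(target, k)) ** 2).mean())
```

The idea would be to add `lowres_mse` with a small weight alongside the GAN loss, so the adversarial term handles texture while the pooled MSE keeps global color/brightness anchored.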
Hi madebyollin!
Thank you so much for your awesome work and kind reply.
Sorry to bother you again. While finetuning the taesd decoder, I found that the output images become a little dim and less saturated. I suspect the reason may be the different datasets or data-processing strategies I've used; an example is shown below.
For datasets, I use the same laion ae dataset, where I also find the images tend to be less saturated, since many of them have a white background and a single object in the center. The images are resized to 512x512. I adopted the color augmentation you suggested, but the output is still less saturated than taesd's. It would be kind of you to give some suggestions on dataset choice or data augmentation; would it help to add more colorful datasets like [danbooru2021]?
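In case it helps to be concrete, this is the kind of saturation jitter I am applying, written as a minimal numpy sketch. Using the per-pixel channel mean as a grayscale proxy and clipping to [0, 1] are my own choices, not taesd's actual augmentation:

```python
import numpy as np

def jitter_saturation(img, factor):
    # img: float array in [0, 1], shape (H, W, 3)
    # factor = 0 -> grayscale, 1 -> unchanged, > 1 -> oversaturated
    gray = img.mean(axis=-1, keepdims=True)  # crude luma proxy
    out = gray + factor * (img - gray)       # interpolate/extrapolate around gray
    return np.clip(out, 0.0, 1.0)
```

During training I would draw `factor` from a range around 1 (e.g. 0.7-1.5) so the decoder sees both muted and vivid versions of the same content.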
Besides, my training strategy is to train the taesd decoder with an LPIPS loss and a GAN loss. Which of the two do you think matters more for output quality? Could the LPIPS loss affect the saturation of the images? If so, how about using the GAN loss only?
It also seems that taesd version 1.2 was trained from the weights of the previous version. Could you please share some details about that finetuning? For example, did you initialize the discriminator from the previous version, and why did you remove the LPIPS loss in version 1.2?
I sincerely appreciate your wonderful work and enthusiasm. It would be really great if you could give me some suggestions for finetuning taesd. Many thanks.
(left: taesd, right: my finetuned version)