[Bug]: Switching to another model and then back leads to different image (SDXL) #12619

Closed
1 task done
djdookie opened this issue Aug 17, 2023 · 47 comments
Labels
bug Report of a confirmed bug

Comments

@djdookie

djdookie commented Aug 17, 2023

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What happened?

If I generate an image with sdxl_base_1.0_0.9vae and one self-trained LoRa (strength 0.8),
then switch to another custom SDXL 1.0 model and generate another image with the exact same prompt,
and then switch back to the first model and generate an image with the exact same parameters again, the result looks different/broken.

This only happens if I use a LoRa.

When I completely restart the webUI I can generate the correct first image again.

So for me it seems that something is not unloaded/replaced in memory correctly.

Edit: It looks like the LoRa is applied with strength 1.0 (instead of 0.8) after switching model and switching back.
And it is always applied even if I set it to strength 0.0 or remove it from prompt!

Steps to reproduce the problem

  1. Go to ....
  2. Press ....
  3. ...

What should have happened?

the last image should be the same as the first
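
The reproduction steps in the template were left unfilled, but the description above amounts to three generations: model A, model B, then model A again with identical parameters. A minimal sketch of that sequence as a script, assuming the webUI was started with `--api` and is reachable at `http://127.0.0.1:7860`. The checkpoint titles, LoRa name, prompt, and seed are placeholders chosen for illustration, not the reporter's actual values:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # assumption: local webUI launched with --api


def build_repro_plan(model_a, model_b, prompt, seed, steps=20):
    """Return the ordered API calls: generate on A, on B, then on A again."""
    plan = []
    for checkpoint in (model_a, model_b, model_a):
        # Switch the active checkpoint, then generate with identical parameters.
        plan.append(("/sdapi/v1/options", {"sd_model_checkpoint": checkpoint}))
        plan.append(("/sdapi/v1/txt2img", {"prompt": prompt, "seed": seed, "steps": steps}))
    return plan


def post(path, payload):
    """POST a JSON payload to the webUI API and return the decoded response."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def run_plan(plan):
    """Execute the plan and return the first image (base64) of each generation."""
    images = []
    for path, payload in plan:
        result = post(path, payload)
        if path.endswith("txt2img"):
            images.append(result["images"][0])
    return images
```

Against a running instance, something like `first, second, third = run_plan(build_repro_plan("model A title", "model B title", "a photo <lora:my_lora:0.8>", 12345))` would give the three images to compare. With the sdp optimization the first and third may not be bit-identical even without the bug, so a visual or numeric comparison is more useful than strict equality.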

Version or Commit where the problem happens

v1.5.1, also tried current dev branch build, same issue

What Python version are you running on ?

Python 3.11.x (above, no supported yet)

What platforms do you use to access the UI ?

Windows

What device are you running WebUI on?

Nvidia GPUs (RTX 20 above)

Cross attention optimization

sdp

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

no

List of extensions

openpose-editor, sd-webui-controlnet, sd-webui-openpose-editor, sd-webui-refiner, sd-webui-roop-nsfw, ultimate-upscale-for-automatic1111

Console logs

nothing relevant

Additional information

No response

djdookie added the bug-report label (Report of a bug, yet to be confirmed) on Aug 17, 2023
@w-e-w
Collaborator

w-e-w commented Aug 17, 2023

Cannot reproduce the issue: "sdxl 1.0 lora 0.8" and "switch back to sdxl 1.0 lora 0.8" are identical.

| sdxl 1.0 lora 1.0 | sdxl 1.0 lora 0.8 | other xl model lora 0.8 | switch back to sdxl 1.0 lora 0.8 |
| - | - | - | - |
| [image 20230817-131523-451281-1028715279] | [image 20230817-131538-178935-1028715279] | [image 20230817-131559-443582-1028715279] | [image 20230817-131621-062188-1028715279] |

[video: 2023-08-17.13_15_10_906.chrome.mp4]

w-e-w added the cannot-reproduce label (I can't reproduce this, so I can't fix it. Add steps for reproduction and remove this tag.) on Aug 17, 2023
@djdookie
Author

djdookie commented Aug 17, 2023

Thanks for testing.
Could this be related to my Python 3.11.x version?
Or to the self-trained LoRa itself (trained with the kohya_ss repo)? Or to the model I switch to (dynavisionXL)?

@w-e-w
Collaborator

w-e-w commented Aug 17, 2023

I would try disabling third-party extensions and testing again.
I'd say the chance of this being caused by a model is low (assuming the model is trained correctly).
If you want, you can upload your model or give me links so I can test with it and see if it is the cause,
but I think it's most likely caused either by some extension or by an accidental human error.

@djdookie
Author

djdookie commented Aug 17, 2023 via email

@w-e-w
Collaborator

w-e-w commented Aug 17, 2023

When you have time, also test with a completely new install,
with every setting other than the models left at default.

@TomTomGit86

I'm having this same issue too! It's really frustrating when an image you want to reproduce, or go back to and change slightly by tweaking the prompt, now yields completely different results.
I have had similar thoughts: it seems to be holding on to the previous model's data when switching between models.
Also, the Lora strength seems to have changed or disappeared.
Funnily enough, I have used DynavisionXL too, but it isn't the model I'm currently using and experiencing the issue with.

I'm now using the models https://civitai.com/models/119279
and https://civitai.com/models/123307/sdvn7-realartxl

I also created a merge of these two, as I thought the desirable images had come about from an accidental bugged merging of the two models. I'm getting some good images with the merged model, but I still have the issue where I go back to an image/seed/config to reproduce it, only to find a completely different output.

I'm using a Lora:
https://civitai.com/models/129533?modelVersionId=141997

I tend to create a txt2img image and then, having found a good starting image, use the loopback wave script in img2img to create a set of image frames to animate.
Sometimes the set of images isn't what I wanted, so I go back to the txt2img phase to create a new starting image and tweak some negative prompts to eliminate some of the undesirable img2img frames, only to get out-of-line results. So I revert to exactly the same prompt, seed, etc. to check I'm not going mad, and there it is: two different images created with the same prompt, seed, and identical settings.

Hope we can get to the bottom of why this may be happening.

I'm running an RTX 2070 Max-Q (8 GB VRAM)
and 16 GB of system RAM.
I bought an upgrade to 64 GB of system RAM, which should arrive today, as I noticed I'm always at the limit.

And I've never been able to load the base SDXL model successfully; it seems to crash with an out-of-memory error.

I can only seem to load custom SDXL models.

I will do some more testing and post any updates here.

Likewise, if you need any more info or data from me to help fix this issue, I'm happy to help.

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

Make sure that what you're seeing is not caused by SDP.
As SDP is not deterministic, the results will not be the same.
Generally speaking, the difference is barely noticeable to the human eye, but it's not pixel-perfect.
However, under certain conditions the variance can be amplified.

So if what you're seeing is SDP non-determinism, then it's not a bug.
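
One way to tell SDP noise apart from a real regression is to measure how far apart two outputs are rather than eyeballing them. A minimal sketch that works on raw pixel values (flat sequences of 0-255 integers, however they were decoded). The function names and the 2.0 threshold are arbitrary assumptions for illustration, not values from the webUI:

```python
def mean_abs_diff(pixels_a, pixels_b):
    """Mean absolute per-channel difference between two equally sized images."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixel values")
    if not pixels_a:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))
    return total / len(pixels_a)


def classify_difference(pixels_a, pixels_b, noise_threshold=2.0):
    """Label a pair as identical, minor noise, or a large difference.

    noise_threshold is a hypothetical cut-off on the 0-255 scale.
    """
    diff = mean_abs_diff(pixels_a, pixels_b)
    if diff == 0:
        return "identical"
    return "minor noise" if diff <= noise_threshold else "large difference"
```

If Pillow is available, `Image.open(path).convert("RGB").tobytes()` yields a byte string that can be passed in directly. A sensible threshold would have to be found empirically, for example by generating the same image twice without switching models.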

@TomTomGit86

TomTomGit86 commented Aug 18, 2023

> Make sure that what you're seeing is not caused by SDP. As SDP is not deterministic, the results will not be the same. Generally speaking, the difference is barely noticeable to the human eye, but it's not pixel-perfect. However, under certain conditions the variance can be amplified.
>
> So if what you're seeing is SDP non-determinism, then it's not a bug.

--opt-sdp-attention

I have started using the above argument, as suggested, to speed up image generation.

But the differences in the images are very noticeable,
as if the Lora were missing or its weight had changed. I will upload some example images with exactly the same generation data when I get back home so you can judge for yourself. I may upload the model and images to Google Drive so others can generate the same images and see what outputs they get. I also plan on trying a fresh install and rerunning the generations in question to see what the results are.

@djdookie
Author

djdookie commented Aug 18, 2023

I don't think this is an SDP problem, since SDP leads to only minor changes in the generated image.
After restarting the webUI I get back to mostly the same picture.

This is what I am talking about:

| 1. Initial image with SDXL base and my lora:0.8 | 2. Switch to dynaVisionXL and my lora:0.8 | 3. Switch back to SDXL base and my lora:0.8 |
| - | - | - |
| [image 1] | [image 2] | [image 3] |

I tried some more things now:

  • reverted back to the recommended Python version 3.10.6.
  • disabled all extra extensions.
  • reinstalled webui
  • used sdxl_vae_fp16_fix

Problem still persists, 100% reproducible with my own LoRa!

But I tested some other LoRas from civitai and wasn't able to reproduce the issue with other LoRas yet.
What could be wrong with my self-trained LoRa to trigger that problem?

I tried the same in vladmandic's webui fork and was able to get black images this way (in the VAE decode phase), even with the sdxl_vae_fp16_fix. (I trained my LoRa with the no-half-VAE parameter to prevent NaNs.)
There I triggered this error; possibly this could help to narrow down the issue?

[screenshot of the error]

It's possible I am doing something wrong in my LoRa training, but I don't know what this could be to trigger such issues in the webUI.

@TomTomGit86

TomTomGit86 commented Aug 18, 2023

[image: 02279-85899033-(Art3mis_1 5), 1girl, full length photo, facing camera, dynamic pose, lora_Art3mis_sdxlr_0 5, Adam Rex, punk, ready player one]
[image: 02320-85899033-(Art3mis_1 5), 1girl, full length photo, facing camera, dynamic pose, lora_Art3mis_sdxlr_0 5, Adam Rex, punk, ready player one]

Hi again, here are the two example images using the exact same prompt/model/settings etc. As you can see, they are rather more different than I would expect from what I now know can happen when using arguments like xformers.
I'm now thinking this is perhaps an issue with the Lora itself. I can't say 100%, but I don't think I switched models at any point during the generations; I did, however, change the Lora weight and possibly restarted auto1111.
If I remember correctly, it was only after restarting auto1111 that I got the much different image, whereas before that I could replicate it.

@TomTomGit86

@djdookie
You could see if you can recreate the problem with the Lora I'm using; that might then point to the problem being with the Loras:
https://civitai.com/models/129533?modelVersionId=141997
(V1.0 of the Lora)
I have just seen there's an update to the Lora I'm using, maybe because its author noticed the same issue we've had. Possibly leave a comment on Civitai and see if the author has any tips/help for making your Loras.

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

damn I see it

[image: 20230819-020803-781104-85899033-Art3mis 1 5 1girl full length photo facing camera dynamic pose lora Art3mis sdxlr resized 256 bf16 0 5 Adam - image_hash]
[image: 20230819-020918-460115-85899033-Art3mis 1 5 1girl full length photo facing camera dynamic pose lora Art3mis sdxlr resized 256 bf16 0 5 Adam]

w-e-w added the bug label (Report of a confirmed bug) and removed the cannot-reproduce label (I can't reproduce this, so I can't fix it. Add steps for reproduction and remove this tag.) on Aug 18, 2023
@djdookie
Author

I made progress and found a way to fix this!!

I did more tests: I upgraded my graphics card driver, tried another browser, and tried different settings.
Since I was also playing around with the newest refiner settings, I changed two options to speed up the in-process refiner swapping:

(1) Maximum number of checkpoints loaded at the same time: 1 -> 2
(2) VAE Checkpoints to cache in RAM: 0 -> 1

Step (1) made the issue disappear!

@djdookie
Author

So I guess it has to do with how the webUI loads/reloads the checkpoints.
The model I switch to (dynaVisionXL) has a baked-in VAE, which could matter.

@djdookie
Author

@w-e-w @TomTomGit86 Can you tell me what your setting is for "Maximum number of checkpoints loaded at the same time"?

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

Maximum number of checkpoints loaded at the same time 1

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

1 and 3 should be identical, but they are not.
[image: 20230819-023348-570755-85899033-Art3mis 1 5 1girl full length photo facing camera dynamic pose lora Art3mis sdxlr resized 256 bf16 0 5 Adam]

@TomTomGit86

It's nowhere near as different as in my experience, though. Where do I find the setting for the maximum number of checkpoints? Am I being stupid? I'm assuming you mean in auto1111; I've opened the all-settings view and searched for it but can't seem to find it, though maybe I'm not looking properly. Excited at the fact we might have a fix! lol

@djdookie
Author

djdookie commented Aug 18, 2023

> where do i find the setting for maximum number of checkpoints

Settings -> Stable Diffusion

[screenshot]

@TomTomGit86

[screenshot]

Erm, mine looks like this!

@djdookie
Author

djdookie commented Aug 18, 2023

Set it to 2, click apply settings, restart webUI and see if the issue is still there please!

@TomTomGit86

TomTomGit86 commented Aug 18, 2023

Set "checkpoints to cache in RAM"? I don't have the option "maximum number of checkpoints loaded".

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

IIRC, "Maximum number of checkpoints loaded at the same time" is a new feature in the dev branch.

@djdookie
Author

djdookie commented Aug 18, 2023

Oh yeah, I just realized that. ^^
But yeah try "checkpoints to cache in RAM" then.

@TomTomGit86

Ah, I was going to try out the new dev branch to see if that might help the situation. Do you know the command for the dev branch?

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

#12227

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

> ah i was going to try out the new dev branch to see if that might the the situation, do you know the cmd for dev branch?

git switch dev
(git switch master to go back)

@djdookie
Author

Try your current version with "checkpoints to cache in RAM" = 2 first, please.

@TomTomGit86

TomTomGit86 commented Aug 18, 2023

okay, erm cd/?

@djdookie
Author

djdookie commented Aug 18, 2023

Ok I tested your linked Art3mis LoRa in both versions.

In short:
If I have maximum number of checkpoints loaded at the same time = 1 I have the same issue on both lora versions.
If I set it to 2, the issue is gone.

Even the image I get from the model I swap to (dynaVision XL in my tests) looks different.

Conclusion: It seems that if I load another model from disk rather than from cache, the LoRa weights get corrupted somehow. If I swap back to the first model, again loading from disk and not from cache, the LoRa weights are still corrupted.
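
To make the reported symptoms concrete: a LoRA that is merged into a layer adds `strength * (up @ down)` to that layer's weights, so undoing or re-scaling it requires the untouched original weights. The toy sketch below is not the webUI's code and does not claim to be the actual cause of this bug. It only shows, under the assumption of a merge-in-place scheme with a backup copy, how losing that backup would leave a LoRA applied even at strength 0.0. All names (`ToyLayer`, `apply_lora`, `backup`) are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merged(base, up, down, strength):
    """Return base + strength * (up @ down) without modifying base."""
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


class ToyLayer:
    """Hypothetical layer that merges a LoRA into its weights in place."""

    def __init__(self, weight):
        self.weight = [row[:] for row in weight]
        self.backup = None  # pristine copy, taken before the first merge

    def apply_lora(self, up, down, strength):
        if self.backup is None:
            self.backup = [row[:] for row in self.weight]
        # Always rebuild from the backup so repeated strengths never stack.
        self.weight = merged(self.backup, up, down, strength)


base = [[1.0, 0.0], [0.0, 1.0]]
up, down = [[1.0], [0.0]], [[0.5, 0.5]]  # rank-1 toy LoRA

healthy = ToyLayer(base)
healthy.apply_lora(up, down, 0.8)
healthy.apply_lora(up, down, 0.0)  # strength 0.0 rebuilds from the pristine backup

broken = ToyLayer(base)
broken.apply_lora(up, down, 0.8)
broken.backup = None               # simulate the backup being lost on a model swap
broken.apply_lora(up, down, 0.0)   # the new "backup" is taken from merged weights
```

Here `healthy.weight` ends up equal to `base`, while `broken.weight` still carries the 0.8 delta, which resembles the "always applied" symptom from the original report. Whether the real bug worked this way is not established in this thread.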

@TomTomGit86

TomTomGit86 commented Aug 18, 2023

Yeah, setting "checkpoints to cache in RAM" = 2
seems to have fixed the issue:

| image 1: checkpoint 1 | image 2: checkpoint 2 | image 3: checkpoint 1 ("checkpoints to cache in RAM" = 0) | image 4: checkpoint 1 ("checkpoints to cache in RAM" = 2) |
| - | - | - | - |
| [image 02343-136470668] | [image 02345-136470668] | [image 02346-136470668] | [image 02347-136470668] |

Thank you so much for helping with this issue; it was going to drive me round the bend. You can spend a fair bit of time generating the image you want, and then finding that you can no longer recreate or tweak it can be really frustrating! Kudos @w-e-w

@djdookie
Author

djdookie commented Aug 18, 2023

Here are the pictures from my latest tests with the Art3mis LoRa v1.0 and v2.0:
(you can see the differences best by overlaying them, e.g. by opening them in multiple tabs and switching between them)

image 1 = SDXL base
image 2 = dynaVision XL
image 3 = swap back to SDXL base

Art3mis LoRa v1.0:
Maximum number of checkpoints loaded at the same time = 1 (has issues):

| image 1 | image 2 | image 3 |
| - | - | - |
| [image 09855-994095053] | [image 09856-994095053] | [image 09857-994095053] |

Maximum number of checkpoints loaded at the same time = 2 (issues fixed):

| image 1 | image 2 | image 3 |
| - | - | - |
| [image 09861-994095053] | [image 09862-994095053] | [image 09863-994095053] |

Art3mis LoRa v2.0:
Maximum number of checkpoints loaded at the same time = 1 (has issues):

| image 1 | image 2 | image 3 |
| - | - | - |
| [image 09858-994095053] | [image 09859-994095053] | [image 09860-994095053] |

Maximum number of checkpoints loaded at the same time = 2 (issues fixed):

| image 1 | image 2 | image 3 |
| - | - | - |
| [image 09864-994095053] | [image 09865-994095053] | [image 09866-994095053] |

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

Please put the images in a table to save space.

| a | b | c |
| - | - | - |
| aaa | bbb | ccc |
a b c
aaa bbb ccc

or

| a | b | c |
| - | - | - |
a b c

Substitute the images for a, b, c.

@TomTomGit86

Yeah, would you advise staying on the master branch or switching to the dev branch?
P.S. Someone needs to tell her she needs to point the gun the other way too! LOL

@TomTomGit86

TomTomGit86 commented Aug 18, 2023

> pls put img in table to save space

| a | b | c |
| - | - | - |
| aaa | bbb | ccc |

a b c
aaa bbb ccc
or

| a | b | c |
| - | - | - |

a b c
substitute ABC for images

Yeah, sorry, can we do that natively here, or do you do it before posting?
Oh, I think I get it now: if we use the text you posted and paste the images in place of a, b, c, the post will display them in a grid.

@w-e-w
Collaborator

w-e-w commented Aug 18, 2023

You can preview before you post; you can also edit the post afterwards.

aaaa I'll do it

@TomTomGit86

TomTomGit86 commented Aug 18, 2023

lol i did it mate :-) or you did and I thought I'd done it. Agree tho looks better truncated

@TomTomGit86

Would you recommend setting vae to cache = 2 also?

@djdookie
Author

> Would you recommend setting vae to cache = 2 also?

I think this can save some time if you switch VAEs a lot and if you can spare the RAM.
I have set it to 1 now; 0 is the default, so nothing gets cached at all.
It's your own preference. But this doesn't seem to influence the discussed issue at all.

@djdookie
Author

I admit you can see the issue way clearer on my LoRa. Art3mis LoRa is good enough to demonstrate the issue and it is publicly available.

@TomTomGit86

> I admit you can see the issue way clearer on my LoRa. Art3mis LoRa is good enough to demonstrate the issue and it is publicly available.

If you look at my examples, they were wildly different.
But thankfully we now have a solution.

@djdookie
Author

djdookie commented Aug 18, 2023

I'd call it a workaround I discovered after identifying the issue and narrowing it down as much as possible.

But for a real fix it's the coders' turn now. ;)

catboxanon removed the bug-report label (Report of a bug, yet to be confirmed) on Aug 19, 2023
@djdookie
Author

Any progress on this?

@w-e-w
Collaborator

w-e-w commented Aug 25, 2023

Test again in RC or dev; IIRC the issue was fixed the day after the post.
[image: 20230825-213824-378023-85899033-Art3mis 1 5 1girl full length photo facing camera dynamic pose lora Art3mis sdxlr resized 256 bf16 0 5 Adam - 6bdf5cbb6e640a0f4b1ffd81ca812c5ef18c1eedbac82effb3241fb165819292]

@catboxanon
Collaborator

For reference, this was resolved by 9d2299e (it was also done in #12665, but the aforementioned commit supersedes it). I'll go ahead and close this, but feel free to re-open if it still seems to be a problem.

@df2df

df2df commented Aug 31, 2023

Is there a way to reproduce what this bug was doing via model merges? Because I actually like many of the results I was getting after triggering this issue...

@yoyoinneverland

Please see #13516


6 participants