
fix llama lm_head load wrong weight error #4259

Closed
wants to merge 1 commit

Conversation

@baodii (Contributor) commented Sep 5, 2023

Hi,
I found that during AutoTP shard loading, if there is more than one checkpoint file, the llama model ends up with a wrong lm_head weight.

This happens because module replacement assigns embedding_weight to lm_head.weight in advance. In other words, when there is more than one checkpoint file, embedding_weight is used as lm_head.weight before the real lm_head.weight has been loaded.

So I added an is_last_cp flag to avoid this situation.

[Two screenshots attached in the original comment]
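For illustration, here is a minimal sketch of the loading order that such a flag enforces. This is not the actual DeepSpeed AutoTP code: the function name load_sharded_checkpoints, the parameter lookup, and the llama attribute path model.model.embed_tokens are assumptions used only to show the idea, namely that the embedding weight is tied to lm_head only after the last checkpoint shard has been processed, and only if no shard supplied lm_head.weight.

```python
# Minimal sketch of the idea described above, NOT the actual DeepSpeed code.
# load_sharded_checkpoints and the attribute paths are hypothetical stand-ins.
import torch


def load_sharded_checkpoints(model, checkpoint_files):
    """Load checkpoint shards, deferring the embedding->lm_head fallback
    until the last shard has been processed."""
    params = dict(model.named_parameters())
    lm_head_loaded = False

    for i, ckpt_path in enumerate(checkpoint_files):
        is_last_cp = (i == len(checkpoint_files) - 1)  # the flag the PR adds
        state_dict = torch.load(ckpt_path, map_location="cpu")

        # Copy whatever parameters this shard actually contains.
        for name, tensor in state_dict.items():
            if name == "lm_head.weight":
                lm_head_loaded = True
            param = params.get(name)
            if param is not None:
                with torch.no_grad():
                    param.copy_(tensor)

        # Only fall back to the tied embedding weight once all shards have
        # been seen and none of them provided lm_head.weight. Doing this on
        # an earlier shard would pre-empt the real weight, which is the
        # reported bug.
        if is_last_cp and not lm_head_loaded:
            with torch.no_grad():
                model.lm_head.weight.copy_(model.model.embed_tokens.weight)
```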

@baodii (Contributor, Author) commented Sep 5, 2023

@delock @Yejing-Lai

@dc3671 (Contributor) commented Sep 5, 2023

I think this somewhat duplicates #4206, but I'm not sure which approach is better.

@delock (Collaborator) commented Sep 20, 2023

Hi @baodii @sywangyi, can you converge these two PRs, #4259 and #4206, into one? They fix similar issues, and we need a single PR that covers all the different cases.

@baodii closed this Sep 20, 2023
@baodii (Contributor, Author) commented Sep 20, 2023

same as #4206

3 participants