Mimic adamw_torch_4bit and have adamw_torch_8bit #34893
Labels: Feature request (request for a new feature)
Comments
cc @muellerzr for deepspeed/accelerate!
A PR for this would be great 🤗 cc @SunMarc
Thanks! I will do that later.
Feel free to add it! Let me know if you need any help.
Thanks! I will first mimic the 4bit one and see whether it works.
PR created: #34993
Feature request
Hi, thanks for the lib! Currently there is adamw_torch_4bit, but I hope to mimic it to have an adamw_torch_8bit that uses the 8-bit torchao AdamW. The reason is that I would like to use DeepSpeed CPU offload for the optimizer together with 8-bit AdamW. However, the 8-bit optimizer in current HF Transformers does not support CPU, so I need to use the torchao one.
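For background, the core idea behind 8-bit AdamW variants is storing the optimizer's moment tensors in 8-bit codes with per-block scales, dequantizing on the fly at each step. The sketch below is a minimal, stdlib-only illustration of block-wise absmax int8 quantization of optimizer state; it is not torchao's or bitsandbytes' actual implementation, and the function names are made up for illustration.

```python
# Illustrative sketch of block-wise 8-bit quantization of optimizer state
# (the mechanism that lets 8-bit AdamW cut state memory ~4x vs fp32).
# Hypothetical helpers, not the torchao API.

def quantize_blockwise(values, block_size=4):
    """Quantize floats to int8 codes, one absmax scale per block."""
    blocks = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        scale = max(abs(v) for v in block) or 1.0  # avoid div-by-zero
        codes = [round(v / scale * 127) for v in block]  # int8 range
        blocks.append((scale, codes))
    return blocks

def dequantize_blockwise(blocks):
    """Recover approximate floats from (scale, codes) blocks."""
    out = []
    for scale, codes in blocks:
        out.extend(c / 127 * scale for c in codes)
    return out

# Example: a fake slice of an Adam moment buffer.
state = [0.01, -0.5, 0.25, 0.003, 1.5, -0.75, 0.0, 0.125]
restored = dequantize_blockwise(quantize_blockwise(state))
# Per-element error is bounded by that block's scale / 254.
assert all(abs(a - b) < 0.01 for a, b in zip(state, restored))
```

The per-block scale is what keeps small-magnitude entries accurate even when another part of the tensor has large values, which matters for Adam's second-moment estimates.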
Motivation
Your contribution
Yes, I am willing to open a PR.