
Don't duplicate frozen parameters during predict() #20851

Merged (1 commit) on Feb 4, 2025

Conversation

mattdangerw (Member):

On the JAX backend we were not using donate_argnums during predict(). This is fine when a model is mostly trainable, but when a model is mostly or entirely frozen it results in a 2x memory spike (which is why we already use donate_argnums for fit() and evaluate()).

This change adds donate_argnums to the predict function to avoid the memory spike. Because JAX will then delete all incoming state (including the trainable variables), we also need to sync the trainable variables back out, much as in fit() and evaluate(). An alternative would be to change the predict_step signature so that we donate only the non-trainable variables, but that would be a breaking and confusing change.
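The donate-then-rebind pattern described above can be sketched outside of Keras like this. This is a minimal illustration, not the actual Keras trainer code: predict_step, the state layout, and the shapes are all hypothetical.

```python
import jax
import jax.numpy as jnp

def predict_step(state, x):
    # state is a (trainable, non_trainable) pair of pytrees.
    trainable, non_trainable = state
    y = x @ trainable["w"] + non_trainable["bias"]
    # Return the state alongside the outputs: once argument 0 is donated,
    # the caller's old buffers are invalid, so it must rebind these.
    return y, state

# donate_argnums=(0,) lets XLA reuse the state buffers for the outputs,
# avoiding a second in-memory copy of (possibly large, frozen) parameters.
jit_predict_step = jax.jit(predict_step, donate_argnums=(0,))

state = ({"w": jnp.eye(2)}, {"bias": jnp.zeros(2)})
x = jnp.ones((1, 2))
y, state = jit_predict_step(state, x)  # rebind the returned state
```

On platforms where donation is not supported, JAX falls back to copying and emits a warning, so the sketch stays correct either way; only the memory saving is lost.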

codecov-commenter commented Feb 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.25%. Comparing base (fc1b26d) to head (84a4897).
Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #20851      +/-   ##
==========================================
+ Coverage   82.04%   82.25%   +0.20%     
==========================================
  Files         559      559              
  Lines       52367    52374       +7     
  Branches     8096     8096              
==========================================
+ Hits        42964    43078     +114     
+ Misses       7427     7305     -122     
- Partials     1976     1991      +15     
Flag Coverage Δ
keras 82.06% <100.00%> (+0.20%) ⬆️
keras-jax 64.18% <100.00%> (-0.08%) ⬇️
keras-numpy 58.99% <0.00%> (+<0.01%) ⬆️
keras-openvino 32.55% <0.00%> (+2.73%) ⬆️
keras-tensorflow 64.82% <0.00%> (+<0.01%) ⬆️
keras-torch 64.15% <0.00%> (+<0.01%) ⬆️

Flags with carried-forward coverage won't be shown.


mattdangerw force-pushed the predict-memory-use-fix branch from e4eefb4 to 84a4897 on Feb 3, 2025, 19:36
mattdangerw (Member, Author):

Notably, this will come up for LoRA + transformers.

gemma = keras_hub.CausalLM.from_preset("gemma...")
gemma.backbone.enable_lora(4)
gemma.fit(...)
gemma.predict(example)  # double the optimal memory usage due to the duplicated frozen params
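Some back-of-the-envelope arithmetic shows why this bites LoRA models in particular. The sizes below are hypothetical (not real Gemma numbers): with a low-rank adapter, the frozen base kernel dwarfs the trainable parameters, so duplicating the frozen state roughly doubles peak memory.

```python
# Hypothetical dense layer under LoRA rank 4 (illustrative sizes only).
d_in, d_out, rank = 4096, 4096, 4

frozen = d_in * d_out              # base kernel, frozen after enable_lora
trainable = rank * (d_in + d_out)  # lora_a (d_in x rank) + lora_b (rank x d_out)

# Nearly all parameters are frozen, so a second copy of the frozen
# state costs almost as much as the whole model again.
frozen_fraction = frozen / (frozen + trainable)
print(frozen, trainable, round(frozen_fraction, 4))
```

Here frozen_fraction comes out above 0.99, which is why predict() without donation roughly doubled memory for LoRA-tuned models while barely affecting fully trainable ones.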

fchollet (Collaborator) left a comment:

LGTM, thanks for the fix!

fchollet merged commit 3b0d4de into keras-team:master on Feb 4, 2025
9 of 10 checks passed