
Fix: Disable Sparse Decompression for Dense Compressors #237

Merged: 3 commits merged into main on Jan 10, 2025

Conversation

@rahul-tuli (Member) commented Jan 9, 2025

Problem

When the sparse compressor is set to "dense", sparse decompression is incorrectly triggered, causing uninitialized weights and downstream errors.
Example CI failure: [GitHub Actions Log](https://github.com/vllm-project/llm-compressor/actions/runs/12659596814/job/35326229412).

Solution

Added a condition to skip sparse decompression when the sparsity configuration format is "dense".
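The guard described above can be sketched as follows. Note that `SparsityConfig`, `ModelCompressor`, and the method names here are illustrative placeholders for this sketch, not the actual compressed-tensors API:

```python
from dataclasses import dataclass


@dataclass
class SparsityConfig:
    # Hypothetical stand-in for the sparsity configuration; "dense"
    # and "sparse_bitmask" are example format strings.
    format: str


class ModelCompressor:
    # Simplified stand-in for the real compressor class.
    def __init__(self, sparsity_config=None):
        self.sparsity_config = sparsity_config

    def decompress(self, model_state):
        # The fix: skip sparse decompression when the stored format is
        # "dense". Dense weights are already fully materialized, and
        # running the sparse path would leave them uninitialized.
        if (
            self.sparsity_config is not None
            and self.sparsity_config.format != "dense"
        ):
            model_state = self._sparse_decompress(model_state)
        return model_state

    def _sparse_decompress(self, model_state):
        # Placeholder for the real sparse decompression routine.
        return {k: ("decompressed", v) for k, v in model_state.items()}
```

With this guard, a compressor configured with `format="dense"` returns the state dict untouched, while any other format still goes through the sparse path.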

Testing

  • Verified against llm-compressor main commit: 03e21770.
  • Confirmed weights load correctly for dense compressors.
  • All CI workflows pass without regressions.

@rahul-tuli rahul-tuli changed the title Turn off sparse decompression when sparse compressor is dense Fix: Disable Sparse Decompression for Dense Compressors Jan 9, 2025
@rahul-tuli rahul-tuli marked this pull request as ready for review January 10, 2025 00:22
@mgoin (Member) previously approved these changes Jan 10, 2025:

Makes sense, thanks!

@kylesayrs (Contributor) left a comment:

LGTM

@dsikka (Contributor) left a comment:

LGTM

@rahul-tuli rahul-tuli requested a review from kylesayrs January 10, 2025 14:20
@rahul-tuli rahul-tuli force-pushed the turn-off-sparse-decompression-when-dense branch from 51cee4e to cc4f78e Compare January 10, 2025 14:28
@dsikka dsikka merged commit 6fffbd7 into main Jan 10, 2025
1 check failed
@dsikka dsikka deleted the turn-off-sparse-decompression-when-dense branch January 10, 2025 15:51
dsikka added a commit to vllm-project/llm-compressor that referenced this pull request Jan 23, 2025
~~Contingent on merge of huggingface/transformers#34719~~ (since merged and released)


Blocked on 
neuralmagic/compressed-tensors#237

SUMMARY:
* In several optimization tests, automatically decompress the model when an optimized model is provided
* Fix recipe stage length
* Revive old code
* When running multiple optimizations (e.g. oneshot then finetune, or oneshot then oneshot), the recipes need to be added to the session using `initialize_recipe`. Example here:
https://github.com/vllm-project/llm-compressor/pull/971/files#diff-c9ae8b3ad24d13abeea5b649a5fd6d0b0925f5c9cc40220cbfbe21ae81242f8dR63-R65
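The consecutive-run bookkeeping described in the last bullet can be sketched as below. `Session` and `initialize_recipe` here are simplified stand-ins for the llm-compressor session API, not its real classes or signatures:

```python
class Session:
    # Hypothetical minimal session that tracks recipe history.
    def __init__(self):
        self.recipes = []

    def initialize_recipe(self, recipe):
        # Register this stage's recipe so that later stages (e.g. a
        # second oneshot or a finetune pass) see the full history.
        self.recipes.append(recipe)


def run_stages(session, stages):
    """Apply optimization stages in order, registering each recipe first.

    `stages` is a list of (recipe, apply_fn) pairs; apply_fn is a
    placeholder for the actual oneshot/finetune call.
    """
    for recipe, apply_fn in stages:
        session.initialize_recipe(recipe)
        apply_fn()
    return session.recipes
```

The point of the sketch is only the ordering: each stage's recipe is added to the session before the stage runs, so a second optimization pass sees the first pass's recipe.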


TEST PLAN:
* Ran the tests using transformers main
* Must pass tests/llmcompressor/transformers/obcq/test_consecutive_runs.py

---------

Co-authored-by: Dipika Sikka <[email protected]>
Co-authored-by: Rahul Tuli <[email protected]>
rahul-tuli added a commit to vllm-project/llm-compressor that referenced this pull request Jan 28, 2025
5 participants