Revert quantizer_gptq.py #2

Merged
merged 2 commits into from
Dec 4, 2024

Conversation

LRL-ModelCloud

No description provided.

@LRL-ModelCloud LRL-ModelCloud changed the base branch from main to gptq December 3, 2024 08:26
@jiqing-feng jiqing-feng merged commit ef0fb56 into jiqing-feng:gptq Dec 4, 2024
@LRL-ModelCloud LRL-ModelCloud deleted the revert-quantizer-gptq branch December 4, 2024 07:30
jiqing-feng added a commit that referenced this pull request Jan 16, 2025
* gptqmodel

Signed-off-by: jiqing-feng <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* update readme

Signed-off-by: jiqing-feng <[email protected]>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* fix warning

Signed-off-by: jiqing-feng <[email protected]>

* fix version check

Signed-off-by: jiqing-feng <[email protected]>

* revert unrelated changes

Signed-off-by: jiqing-feng <[email protected]>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <[email protected]>

* fix requires gptq

Signed-off-by: jiqing-feng <[email protected]>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* fix format again

Signed-off-by: jiqing-feng <[email protected]>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attributes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <[email protected]>

* fix memory check

Signed-off-by: jiqing-feng <[email protected]>

* fix device mismatch

Signed-off-by: jiqing-feng <[email protected]>

* fix result check

Signed-off-by: jiqing-feng <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* update tests

Signed-off-by: jiqing-feng <[email protected]>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <[email protected]>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <[email protected]>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <[email protected]>

---------

Signed-off-by: jiqing-feng <[email protected]>
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: LRL <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
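
One commit title above states that "not self.use_exllama" is not equivalent to "self.use_exllama==False". A minimal sketch of why, assuming the attribute may be left unset as `None`: truthiness treats `None` like `False`, while an equality check does not. The `Config` class and both helper functions are hypothetical stand-ins for illustration, not the actual transformers code.

```python
class Config:
    """Hypothetical stand-in for a quantization config (not the real class)."""

    def __init__(self, use_exllama=None):
        # None models "the user did not set this option".
        self.use_exllama = use_exllama


def disabled_by_truthiness(cfg):
    # True for both False and None (and any other falsy value).
    return not cfg.use_exllama


def disabled_by_equality(cfg):
    # True only when the value compares equal to False; None does not.
    return cfg.use_exllama == False  # noqa: E712


if __name__ == "__main__":
    for value in (None, False, True):
        cfg = Config(value)
        print(value, disabled_by_truthiness(cfg), disabled_by_equality(cfg))
```

The two checks disagree only for the unset case, which is why swapping one for the other can change behavior when the option has no explicit value.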
jiqing-feng added a commit that referenced this pull request Feb 14, 2025
* Resolve vptq conflict

* Rename spqr package to spqr_quant

* Get rid of aqlm mention

* Start working on tests

* Resolve ruff code checks

* Ruff format

* Isort

* Test updates

* Add gpu tag

* Rename to modules_to_not_convert

* Config update

* Docs and config update

* Docs and config update

* Update to update_torch_dtype

* spqr config parameter validation

* Ruff update

* Apply ruff fixes

* Test fixes

* Ruff update

* Mark tests as @slow again; Ruff; Docstring update

* Ruff

* Remove absolute path

* Resolve typo

* Remove redundant log

* Check accelerate/spqr availability

* Ruff fix

* Check if the config contains proper shapes

* Ruff test

* Documentation update

* overview update

* Ruff checks

* Ruff code quality

* Make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Steven Liu <[email protected]>

* Update spqr.md

* Enable gptqmodel (huggingface#35012)

* gptqmodel

Signed-off-by: jiqing-feng <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* update readme

Signed-off-by: jiqing-feng <[email protected]>

* gptqmodel need use checkpoint_format (#1)

* gptqmodel need use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* fix warning

Signed-off-by: jiqing-feng <[email protected]>

* fix version check

Signed-off-by: jiqing-feng <[email protected]>

* revert unrelated changes

Signed-off-by: jiqing-feng <[email protected]>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <[email protected]>

* fix requires gptq

Signed-off-by: jiqing-feng <[email protected]>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* fix format again

Signed-off-by: jiqing-feng <[email protected]>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attributes (#7)

* fix format and tests

Signed-off-by: jiqing-feng <[email protected]>

* fix memory check

Signed-off-by: jiqing-feng <[email protected]>

* fix device mismatch

Signed-off-by: jiqing-feng <[email protected]>

* fix result check

Signed-off-by: jiqing-feng <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* update tests

Signed-off-by: jiqing-feng <[email protected]>

* review: update docs (#10)

* review: update docs (#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <[email protected]>

* update document (#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <[email protected]>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <[email protected]>

---------

Signed-off-by: jiqing-feng <[email protected]>
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: LRL <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

* Fix : Nemotron Processor in GGUF conversion (huggingface#35708)

* fixing nemotron processor

* make style

* Update docs/source/en/quantization/spqr.md

Co-authored-by: Arthur <[email protected]>

* Add missing TOC to doc

---------

Signed-off-by: jiqing-feng <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: jiqing-feng <[email protected]>
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: LRL <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Arthur <[email protected]>
2 participants