Make int8 dynamic quant in autoquant serializable #1484

jerryzh168 · 2025-01-02T23:17:23Z

Summary:
Lambda functions cannot be serialized, so this PR reuses the existing non-lambda functions that already support serialization: https://github.com/pytorch/ao/blob/00a8d290aab354985fce8c880e1fded22bc48e30/torchao/quantization/quant_api.py#L1263C5-L1268

Note: this PR only covers int8 dynamic quant. float8 will need to be tested and supported separately (on H100 machines).
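
For context, here is a minimal sketch of why a lambda blocks serialization. It assumes the serialization path relies on Python's pickle (as torch.save does by default). The names negate_lambda, negate_named, and is_picklable are hypothetical stand-ins for illustration and are not torchao code.

```python
import operator
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Hypothetical stand-ins for a quantization callback (not torchao code).
# pickle stores functions by reference (module + qualified name), so an
# anonymous lambda, whose name is "<lambda>", cannot be looked up again.
negate_lambda = lambda x: -x

# A function that is importable by name can be stored as a reference.
negate_named = operator.neg

lambda_ok = is_picklable(negate_lambda)
named_ok = is_picklable(negate_named)

# The named function survives a save/load round trip.
restored = pickle.loads(pickle.dumps(negate_named))
```

Swapping a lambda for a named, importable function keeps the behavior the same while giving pickle a reference it can resolve at load time.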

Test Plan:
Tested locally with transformers push_to_hub: https://huggingface.co/jerryzh168/llama3-8b-autoquant/tree/main



pytorch-bot · 2025-01-02T23:17:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1484

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 3 Pending

As of commit 9025626 with merge base 00a8d29:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


facebook-github-bot added the CLA Signed label Jan 2, 2025

jerryzh168 requested review from cpuhrsch, drisspg and HDCharles January 2, 2025 23:17

fix

ca3c540

drisspg approved these changes Jan 2, 2025


jerryzh168 added the topic: improvement label Jan 2, 2025

jerryzh168 added 2 commits January 2, 2025 16:20

fixes

a463397

fix

9025626

jerryzh168 merged commit 3f36c78 into pytorch:main Jan 3, 2025
18 checks passed
