[FSDP] Enable loading prequantized weights with bf16/fp16/fp32 quant_storage #1295
This is a companion PR for huggingface/transformers#32276 that allows us to load prequantized weights with an alternate storage dtype. We keep track of the metadata we need the same way `Params4bit.__new__` does after PR #970. This works with models exported with a non-default `quant_storage`, such as this one in NF4 with BF16 storage.

cc @Titus-von-Koeller @winglian
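
A minimal sketch of what loading such an export might look like from the `transformers` side, assuming both this PR and huggingface/transformers#32276 are in place. The model id is a hypothetical placeholder, not one referenced in this PR:

```python
import torch
from transformers import AutoModelForCausalLM

# The checkpoint already contains 4-bit NF4 weights packed into bf16
# tensors, and its quantization_config is stored in config.json, so
# from_pretrained picks it up automatically. With this PR, the
# quant_storage dtype is inferred from the dtype of the loaded weight
# tensor rather than assumed to be the default uint8.
model = AutoModelForCausalLM.from_pretrained(
    "some-org/llama-nf4-bf16-storage",  # hypothetical prequantized export
    torch_dtype=torch.bfloat16,
)
```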