
torch.compile() on quantized model: No attribute "meta" #148072

Closed
Whadup opened this issue Feb 27, 2025 · 3 comments
Assignees
Labels
needs reproduction Someone else needs to try reproducing the issue given the instructions. No action needed from user oncall: cpu inductor CPU Inductor issues for Intel team to triage oncall: pt2 triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@Whadup
Contributor

Whadup commented Feb 27, 2025

During evaluation of a compiled and quantized model, I get the error "'float' object has no attribute 'meta'" on the following line:

x = match.kwargs["x"].meta["val"]

I propose the following change:

-       x = match.kwargs["x"].meta["val"]
-       weight = match.kwargs["weight"].meta["val"]
-       scales = match.kwargs["scales"].meta["val"]
+       x = match.kwargs["x"]
+       if hasattr(x, 'meta'):
+           x = x.meta["val"]
+       weight = match.kwargs["weight"]
+       if hasattr(weight, 'meta'):
+           weight = weight.meta["val"]
+       scales = match.kwargs["scales"]
+       if hasattr(scales, 'meta'):
+           scales = scales.meta["val"]
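The idea behind the guard can be shown in isolation. A minimal sketch (names here are illustrative, not the actual Inductor pattern-matcher code): kwargs captured by a pattern match may be torch.fx Nodes carrying a recorded value in `.meta["val"]`, or plain Python scalars such as a constant float scale, which have no `.meta` attribute at all.

```python
class FakeNode:
    """Stand-in for a torch.fx.Node that carries example metadata."""
    def __init__(self, val):
        self.meta = {"val": val}

def unwrap(arg):
    # Use the node's recorded value when available, otherwise fall
    # back to the raw argument (e.g. a constant float scale).
    return arg.meta["val"] if hasattr(arg, "meta") else arg

kwargs = {"x": FakeNode("fake_tensor_value"), "scales": 0.5}

print(unwrap(kwargs["x"]))       # "fake_tensor_value", taken from meta["val"]
print(unwrap(kwargs["scales"]))  # 0.5, a plain float passes through unchanged
```

Without the `hasattr` check, `unwrap(0.5)` would raise exactly the `AttributeError: 'float' object has no attribute 'meta'` reported above.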

cc @chauhang @penguinwu

Here is an example to reproduce the behavior on a machine with an A100 GPU.
Requirements: torch, transformers, peft

from transformers import AutoModelForCausalLM
import peft
import torch

model = AutoModelForCausalLM.from_pretrained(
    "casperhansen/llama-3-8b-instruct-awq",
    device_map="auto",
)
model = peft.get_peft_model(
    model,
    peft.LoraConfig(
        task_type="CAUSAL_LM"
    )
)

torch._dynamo.config.cache_size_limit = 1024
for i, layer in enumerate(model.base_model.model.model.layers):
    model.base_model.model.model.layers[i] = torch.compile(layer)

with torch.amp.autocast("cuda"):
    model(
        input_ids = torch.tensor([[0, 1, 2]]).cuda(),
        attention_mask = torch.tensor([[1, 1, 1]]).cuda()
    )

Output:

torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
AttributeError: 'float' object has no attribute 'meta'

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
@desertfire desertfire added oncall: cpu inductor CPU Inductor issues for Intel team to triage triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module needs reproduction Someone else needs to try reproducing the issue given the instructions. No action needed from user labels Feb 28, 2025
@desertfire
Contributor

Can you provide instructions for a reproduction?

@Whadup
Contributor Author

Whadup commented Mar 4, 2025

I provided an example in the edit.
I tried to extract the essence of my original training script, but it is still rather specific: without LoRA (peft) it works; without mixed precision (amp.autocast) it works; without the quantized base model (AWQ) it works. Only the combination triggers the error.

Using the fix I proposed in the original message, it works.

@leslie-fang-intel
Collaborator

Hi @Whadup, it seems you already have a PR. Assigning this issue to you. If not, please let me know. Thanks
