
Fused mlp causes assertion error #179

Open · sgsdxzy opened this issue Apr 15, 2023 · 5 comments

sgsdxzy (Contributor) commented Apr 15, 2023

After commit c90adef, with fused_mlp enabled, I get the following error:

python: /opt/conda/conda-bld/torchtriton_1677881345124/work/lib/Analysis/Allocation.cpp:42: std::pair<llvm::SmallVector<unsigned int>, llvm::SmallVector<unsigned int> > mlir::triton::getCvtOrder(const mlir::Attribute&, const mlir::Attribute&): Assertion `!(srcMmaLayout && dstMmaLayout) && "Unexpected mma -> mma layout conversion"' failed.
Aborted (core dumped)

My GPU is a 2080 Ti, which is Turing, so I think this is not the same issue as #174.

TitanSneaker commented Apr 25, 2023

Same problem:

CUDA_VISIBLE_DEVICES=0 python llama_inference.py ./llama-hf/llama-7b --load llama7b-4bit-128g.pt --text "this is llama" --wbits 4 --groupsize 128
Loading model ...
Found 3 unique KN Linear values.
Warming up autotune cache ...
100%|██████████| 12/12 [00:30<00:00,  2.52s/it]
Found 1 unique fused mlp KN values.
Warming up autotune cache ...
  0%|          | 0/12 [00:00<?, ?it/s]
python: /project/lib/Analysis/Allocation.cpp:42: std::pair<llvm::SmallVector<unsigned int>, llvm::SmallVector<unsigned int> > mlir::triton::getCvtOrder(const mlir::Attribute&, const mlir::Attribute&): Assertion `!(srcMmaLayout && dstMmaLayout) && "Unexpected mma -> mma layout conversion"' failed.
Aborted (core dumped)

penlu commented Apr 27, 2023

I experience the same problem (identical error message) running commit 5168950 on a 2080 Ti. Disabling fused_mlp works as a workaround for me.

929359291 commented

> I experience the same problem (identical error message) running commit 5168950 on a 2080 Ti. Disabling fused_mlp works as a workaround for me.

Hi, how do I disable fused_mlp? My system is CentOS.

ereish64 commented Jun 1, 2023

> > I experience the same problem (identical error message) running commit 5168950 on a 2080 Ti. Disabling fused_mlp works as a workaround for me.
>
> Hi, how do I disable fused_mlp? My system is CentOS.

At line 279 in llama.py, change fused_mlp=True in the load_quant call to fused_mlp=False.
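
For reference, a minimal sketch of that change. Only the fused_mlp keyword is confirmed by this thread; the other arguments shown are assumptions for illustration and may not match your checkout of llama.py exactly:

```python
# llama.py, around line 279 -- sketch only.
# Argument names other than fused_mlp are illustrative assumptions.
model = load_quant(
    args.model,       # path to the HF LLaMA model directory
    args.load,        # quantized checkpoint, e.g. llama7b-4bit-128g.pt
    args.wbits,       # quantization bit width, e.g. 4
    args.groupsize,   # quantization group size, e.g. 128
    fused_mlp=False,  # was fused_mlp=True; skips the fused Triton MLP kernel
)                     # that triggers the mma -> mma layout assertion on Turing
```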

shirley-wu commented

Same problem. Disabling fused_mlp works for me. Note: use the .pt file, not .safetensors; for some reason .safetensors still triggers the error.
