comment on todo
Your Name committed May 31, 2024
1 parent ca40d60 commit 22a9f82
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions vllm/model_executor/layers/fused_moe/ampere_fp8_fused_moe.py
@@ -138,6 +138,8 @@ def fused_moe_kernel(
).to(tl.float16)
b = tl.load(b_ptrs, mask=offs_k[:, None] < K - k * BLOCK_SIZE_K, other=0.0)

# todo(wenxh): Triton 2.2/2.3 has a bug where only the "=l" constraint works;
# "=r" fails an LLVM check (low-level bug).
b = tl.inline_asm_elementwise(
asm = "{ \n"
".reg .b32 a<2>, b<2>; \n" # if input = 0xf1f2f3f4