Running the kernel on inputs that live on a GPU with a non-zero ID (e.g. 1, 2, 3, 4, 5, 6, 7) throws the following error.
Memory access fault by GPU node-2 (Agent handle: 0x9b15d70) on address 0x7ee42d200000. Reason: Unknown.
tensor(False, device='cuda:1')
GPU core dump created: gpucore.10171
Aborted
root@tw024:/app# python ex.py
Memory access fault by GPU node-2 (Agent handle: 0xa5f71a0) on address 0x7f532b800000. Reason: Unknown.
GPU core dump created: gpucore.10255
Aborted
Operating System: Ubuntu 22.04.4 LTS (Jammy Jellyfish)
CPU: AMD EPYC 9654 96-Core Processor
GPU: AMD Instinct MI300X
ROCm Version: ROCm 6.3.1
ROCm Component: composable_kernel
from aiter.ops.gemm_op_a8w8 import gemm_a8w8_CK
import torch

SIZE_LIST = [
    (3840, 16384, 16384),
    (56, 8192, 7392)
]

def main():
    for size in SIZE_LIST:
        M, N, K = size
        A = torch.rand(size=(M, K), device="cuda:1").to(torch.int8)
        B = torch.rand(size=(K, N), device="cuda:1").to(torch.int8)
        scale_a = torch.ones((M, 1), device="cuda:1").to(torch.int32)
        scale_b = torch.ones((N, 1), device="cuda:1").to(torch.int32)
        result = gemm_a8w8_CK(A, B.t(), scale_a, scale_b, dtype=torch.bfloat16)

if __name__ == "__main__":
    main()
A workaround for now is to call torch.cuda.set_device("cuda:1") before calling gemm_a8w8_CK.
It seems a proper fix would be to add device guards, as is done here.
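To make the device-guard idea concrete, here is a minimal Python-side sketch. It is not the actual fix in the library: `call_on_input_device` and its `device_ctx` parameter are hypothetical names made up for illustration, and the sketch assumes the first argument is a CUDA/ROCm tensor. It relies only on `torch.cuda.device`, which makes the given device current inside the `with` block and restores the previous one afterwards.

```python
def call_on_input_device(kernel, first_tensor, *args, device_ctx=None, **kwargs):
    """Run kernel(first_tensor, *args, **kwargs) with first_tensor's GPU made current.

    Hypothetical helper, not part of aiter. `device_ctx` is injectable only so
    the helper can be exercised without a GPU; it defaults to torch.cuda.device.
    """
    if device_ctx is None:
        import torch  # imported lazily so this file loads without PyTorch

        device_ctx = torch.cuda.device
    # Switch the current device on entry and restore the previous one on exit,
    # so the caller's own device selection is left unchanged.
    with device_ctx(first_tensor.device):
        return kernel(first_tensor, *args, **kwargs)
```

In the reproduction script this would be called as `call_on_input_device(gemm_a8w8_CK, A, B.t(), scale_a, scale_b, dtype=torch.bfloat16)`. Unlike a global `torch.cuda.set_device("cuda:1")`, it scopes the device switch to the single call, which should have the same effect if the fault comes from the kernel launching on the current device instead of the inputs' device.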
Yes, this is the way I planned to fix it... thanks, you did it.
Thank you. Let us test these fixes on our end as well.
It works seamlessly now.
junhaha666