CUDA: fix scratch buffers being allocated on non-main device #3220

JohannesGaessler
Collaborator

Fixes #3163.

As far as I can tell, the issue is that ggml_cuda_assign_scratch_offset does not switch to the main device before allocating memory. CUDA allocates on whichever device is current at the time of the call, so the scratch buffers can end up on a non-main device, which later causes an error. This PR adds a call to ggml_cuda_set_device before any of the other CUDA calls.
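
To make the failure mode concrete, here is a minimal host-only sketch of the "current device" behaviour described above. It is a toy model, not the real CUDA runtime and not the code in ggml-cuda.cu: fake_set_device and fake_malloc are hypothetical stand-ins for cudaSetDevice and cudaMalloc, and FakeBuffer, alloc_scratch_unguarded and alloc_scratch_on_main are names invented for illustration.

```cpp
// Toy model of CUDA's "current device" semantics. NOT the real runtime:
// every name in this block is hypothetical and exists only for illustration.
#include <cassert>
#include <cstddef>

static int g_current_device = 0;     // models the current device of the calling thread
static const int g_main_device = 0;  // device that is supposed to own the scratch buffer

struct FakeBuffer {
    int device;        // device the buffer was allocated on
    std::size_t size;  // requested size in bytes
};

// Stand-in for cudaSetDevice: selects the device that later calls will target.
void fake_set_device(int device) {
    assert(device >= 0);
    g_current_device = device;
}

// Stand-in for cudaMalloc: the allocation lands on whichever device is current.
FakeBuffer fake_malloc(std::size_t size) {
    return FakeBuffer{g_current_device, size};
}

// Problematic pattern: allocate without selecting the main device first.
FakeBuffer alloc_scratch_unguarded(std::size_t size) {
    return fake_malloc(size);
}

// Pattern this PR moves to: select the main device, then allocate.
FakeBuffer alloc_scratch_on_main(std::size_t size) {
    fake_set_device(g_main_device);
    return fake_malloc(size);
}
```

In this model, if earlier multi-GPU work leaves device 1 selected, alloc_scratch_unguarded returns a buffer owned by device 1, while alloc_scratch_on_main returns one owned by device 0 regardless of what was selected before.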

ggml-cuda.cu Outdated
Comment on lines 6972 to 6974

ggml_cuda_set_device(g_main_device);

Member

Could this be moved inside the if block below? It seems to be the only CUDA op relevant here.
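
A rough sketch of what this suggestion amounts to, under the assumption that the buffer is allocated lazily on first use. This is a host-only toy model rather than the actual function body: FakeRuntime, ScratchState and assign_scratch_offset_sketch are hypothetical names, and the set_device_calls counter exists only to show when the device switch happens.

```cpp
// Toy model, not the real CUDA runtime or the real ggml code.
// Every name in this block is hypothetical.
#include <cassert>
#include <cstddef>

struct FakeRuntime {
    int current_device;    // models the current device
    int set_device_calls;  // counts device switches, for illustration only

    // Stand-in for a set-device call.
    void set_device(int device) {
        current_device = device;
        ++set_device_calls;
    }

    // Device a new allocation would land on: always the current one.
    int malloc_device() const { return current_device; }
};

struct ScratchState {
    int buffer_device;  // -1 until the scratch buffer has been allocated
};

// Lazily allocate the scratch buffer. The device switch sits inside the
// branch because the allocation is the only device-dependent step here.
void assign_scratch_offset_sketch(FakeRuntime & rt, ScratchState & st, int main_device) {
    assert(main_device >= 0);
    if (st.buffer_device == -1) {
        rt.set_device(main_device);
        st.buffer_device = rt.malloc_device();
    }
    // Offset bookkeeping that touches no device state would follow here.
}
```

With the switch scoped to the branch, the first call selects the main device and allocates there; later calls skip the branch and leave the current device untouched.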

JohannesGaessler force-pushed the cuda-fix-malloc-wrong-device branch from 70ffed6 to 391dab7 on September 17, 2023 10:23
JohannesGaessler merged commit 578d8c8 into ggml-org:master on Sep 17, 2023
pkrmf pushed a commit to morlockstudios-com/llama.cpp that referenced this pull request Sep 26, 2023
Successfully merging this pull request may close these issues.

CUDA illegal memory access | AWS
2 participants