
rpc: remove backend handle from global map when freed #7517

Closed
chraac wants to merge 1 commit

Conversation

@chraac chraac commented May 24, 2024

Found an app crash that occurs when trying to load a new LLM model after another model has been unloaded.

When built with ASan, it shows a heap-buffer-overflow error in ggml_backend_rpc_get_device_memory. The relevant log is as follows:

READ of size 8 at 0x60e00039c6f8 thread T78
    #0 0x7666861ff0b3 in ggml_backend_rpc_get_device_memory (libllama.so+0x1ff0b3)
    #1 0x76668633644e  (libllama.so+0x33644e)
    #2 0x7666863af017 in llama_load_model_from_file (libllama.so+0x3af017)
    ...

Upon investigating the source code, I discovered that in ggml-rpc.cpp the ggml_backend_t instances are stored in a map called instances. However, the ggml_backend_rpc_free function never removes the ggml_backend_t entry from that map when the backend is freed:

ggml-rpc.cpp#L531

GGML_CALL static void ggml_backend_rpc_free(ggml_backend_t backend) {
    ggml_backend_rpc_context * rpc_ctx = (ggml_backend_rpc_context *)backend->context;
    ggml_backend_rpc_buffer_type_context * buft_ctx = (ggml_backend_rpc_buffer_type_context *)rpc_ctx->buft->context;
    delete buft_ctx;
    delete rpc_ctx->buft;
    delete rpc_ctx;
    delete backend;   // delete the backend object without removing it from map.
}

Here is where the ggml_backend_t is inserted into the map:
ggml-rpc.cpp#L679

GGML_CALL ggml_backend_t ggml_backend_rpc_init(const char * endpoint) {
    ...

    instances[endpoint] = new ggml_backend {
        /* .guid      = */ ggml_backend_rpc_guid(),
        /* .interface = */ ggml_backend_rpc_interface,
        /* .context   = */ ctx
    };

    return instances[endpoint];
}

The proposed change adds the missing erase:

@@ -531,6 +532,7 @@ GGML_CALL static const char * ggml_backend_rpc_name(ggml_backend_t backend) {
 GGML_CALL static void ggml_backend_rpc_free(ggml_backend_t backend) {
     ggml_backend_rpc_context * rpc_ctx = (ggml_backend_rpc_context *)backend->context;
     ggml_backend_rpc_buffer_type_context * buft_ctx = (ggml_backend_rpc_buffer_type_context *)rpc_ctx->buft->context;
+    instances.erase(rpc_ctx->endpoint);
@chraac (Author):
Should we lock the map when changing its contents? Will it be accessed from multiple threads?

@mofosyne added the "Review Complexity : Low" (trivial changes to code that most beginner devs, or those who want a break, can tackle, e.g. UI fixes) and "server" labels on May 24, 2024
@chraac chraac changed the title rpc: remove backend handle from global map when backend got freed rpc: remove backend handle from global map when freed May 24, 2024
@rgerganov (Collaborator):

Unfortunately this doesn't solve the problem in all cases. For example in llama-bench the backend is still used after calling ggml_backend_rpc_free. The proper behavior is to deallocate the backend when all buffers are freed and ggml_backend_rpc_free is called (see slaren's comment in PR #7435). However, it is not guaranteed that ggml_backend_rpc_free will be called last.

I am working on a fix and will submit a PR shortly.

@chraac (Author) commented May 27, 2024:

> Unfortunately this doesn't solve the problem in all cases. For example in llama-bench the backend is still used after calling ggml_backend_rpc_free. The proper behavior is to deallocate the backend when all buffers are freed and ggml_backend_rpc_free is called (see slaren's comment in PR #7435). However, it is not guaranteed that ggml_backend_rpc_free will be called last.
>
> I am working on a fix and will submit a PR shortly.

So there's another place that keeps the ggml_backend_t after it was freed? For such a case, it would be better to use shared_ptr and weak_ptr.

@rgerganov (Collaborator):

Please take a look at PR #7562.

@chraac chraac closed this May 27, 2024
@chraac chraac deleted the dev-add-asan branch May 27, 2024 11:48