[CANN] Adapt to dynamically loadable backends mechanism #9970
Conversation
src/llama.cpp
Outdated
All the instances of GGML_USE_CANN should be removed from this file; there are still a few remaining.
Thanks for your review; I have removed all instances of GGML_USE_CANN from llama.cpp.
ggml/src/ggml-backend.cpp
Outdated
#ifdef GGML_USE_AMX
    register_backend(ggml_backend_amx_reg());
#endif

// TODO: kompute, cann
#ifdef GGML_USE_CANN
    register_backend(ggml_backend_cann_reg());
#endif
Do not add extra lines; use the same format as the rest of the backends.
Thanks for your review; I have removed the blank lines, and the formatting is now consistent with the rest of the backends. Functionality remains normal after the change.
ggml/include/ggml-cann.h
Outdated
@@ -33,6 +33,9 @@ extern "C" {
 * @brief Maximum number of CANN devices supported.
 */
#define GGML_CANN_MAX_DEVICES 16
#define GGML_CANN_NAME "CANN"
I suspect this macro was taken from the CUDA backend implementation, but it exists there because the same code is used to build the ROCm and MUSA backends, so the backend's name can change depending on the build flags. I don't think that is necessary in the CANN backend.
Thanks for your review; this macro is no longer exposed to llama.cpp. I have moved it into the CANN implementation file.
Force-pushed from 6d36f48 to abf5be4.
The review comments have been addressed. Looks good to me. Approved.
* [CANN] Adapt to dynamically loadable backends mechanism
* Fix bug: inference results are garbled when running LM models whose type is the Q4_0 class in debug mode
* Handle the review comments of this pull request

Build passed.
The corresponding Feature Request PR is Feature Request: [CANN] backend adapts to llama.cpp dynamic backend loading mechanism #9862.
Functionality is normal (screenshots attached).
Performance is the same as before (screenshots attached).