
ggml-backend : add device and backend reg interfaces #9707

Merged: slaren merged 14 commits into master on Oct 2, 2024

Conversation

slaren (Member) commented on Oct 1, 2024:

Adds the backend device and backend registry interfaces. These interfaces represent an entry point to the backend, and aim to replace commonly used custom backend functions and pave the way to support dynamically loadable backends.

The backend registry interface provides a way to enumerate the devices exposed by the backend and to obtain function pointers to custom backend functions, along with other functionality that is common to the entire backend.

The backend device interface has functions to create backend instances and query information about the devices. Some of the functions of the backend interface have been moved to the device interface.

Currently, only the CUDA and CPU backends implement these interfaces; support in the other backends will be added progressively. During the transition period, backends that do not implement these interfaces can still be used, but eventually llama.cpp will be refactored to use the backend registry API only. Most backends already implement the functions in these interfaces, so this should only require shuffling some code around. test-backend-ops will stop working for backends that do not implement these interfaces.
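For illustration, a minimal sketch of how an application might enumerate devices and create backend instances through these interfaces (the snippet itself is not from the PR; the names follow the declarations it adds to ggml-backend.h):

#include "ggml-backend.h"
#include <stdio.h>

int main(void) {
    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        printf("device %zu: %s (%s)\n", i,
               ggml_backend_dev_name(dev), ggml_backend_dev_description(dev));

        // create a backend (stream) instance on this device, then free it
        ggml_backend_t backend = ggml_backend_dev_init(dev, /*params =*/ NULL);
        ggml_backend_free(backend);
    }
    return 0;
}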

Other changes:

  • Removes the GGML_CALL macro: this was added to support llamafile, but it is never used within ggml. As a result, it is very hard to maintain, because we don't know which functions need it, and it keeps creeping into new functions in a very inconsistent manner. Once support for loading backends dynamically is added to ggml, other projects will be able to use this implementation rather than rolling their own.

GGML_API void ggml_backend_event_wait (ggml_backend_t backend, ggml_backend_event_t event);
GGML_API ggml_backend_event_t ggml_backend_event_new (ggml_backend_dev_t device);
GGML_API void ggml_backend_event_free (ggml_backend_event_t event);
GGML_API void ggml_backend_event_record (ggml_backend_event_t event, ggml_backend_t backend);
Collaborator:

Why is it necessary to pass a backend? Is ggml_backend_dev_t not backend-specific?

slaren (Member, Author) replied on Oct 1, 2024:

ggml_backend_t represents a stream or async queue. Events are associated with a device, but not with a stream. ggml_backend_event_record records the event on the stream represented by backend, which should be a backend (stream) of the same device as the event. I know that this is a bit confusing at the moment; ggml_backend_t should be renamed to something like ggml_backend_stream, but I am afraid that would break a lot of code.
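A minimal usage sketch based only on the declarations quoted above (dev, stream_a, and stream_b are assumed to come from earlier initialization):

ggml_backend_event_t ev = ggml_backend_event_new(dev); // an event belongs to a device
ggml_backend_event_record(ev, stream_a);               // record it on one stream of that device
ggml_backend_event_wait(stream_b, ev);                 // another stream of the same device waits on it
ggml_backend_event_free(ev);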


GGML_API ggml_backend_t ggml_backend_cpu_init(void);

GGML_API bool ggml_backend_is_cpu (ggml_backend_t backend);
Collaborator:

Is the long-term plan to make this check against a device instead of a backend?

slaren (Member, Author) replied:

I don't intend to change these functions at the moment. Most of the functions that need these checks, like ggml_backend_cpu_set_n_threads, operate on a ggml_backend_t object, so it is still convenient to have a function to check if a ggml_backend_t belongs to a specific backend. After all the backends have been adapted to the new interface this could be re-evaluated.
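For instance, a sketch of the common pattern (backend is assumed to be any previously created ggml_backend_t):

if (ggml_backend_is_cpu(backend)) {
    // ggml_backend_cpu_set_n_threads is only valid on a CPU backend instance
    ggml_backend_cpu_set_n_threads(backend, 8);
}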

// (optional) tensor copy: dst is in the buffer, src may be in any buffer, including buffers from a different backend (return false if not supported)
bool (*cpy_tensor) (ggml_backend_buffer_t buffer, const struct ggml_tensor * src, struct ggml_tensor * dst);
// clear the entire buffer
void (*clear) (ggml_backend_buffer_t buffer, uint8_t value);
Collaborator:

This is in essence the same functionality as memset_tensor, just at a different scope; should we be using the same name?

slaren (Member, Author) replied:

The reason I didn't want to call this function memset when it was added is that it does not allow specifying either the offset or the amount of memory to clear; it always applies to the entire buffer. I believe the name clear makes it a bit more intuitive that the function applies to the entire buffer and is not as flexible as memset. memset_tensor is fine since it effectively provides the full functionality of a memset function, although limited to tensors. Anyway, I may be overthinking this; it is a rather minor distinction.
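A sketch of the distinction using the public wrappers (ggml_backend_tensor_memset is assumed to be available in this revision):

ggml_backend_buffer_clear(buffer, 0);                // always the entire buffer, no offset or size
ggml_backend_tensor_memset(tensor, 0, offset, size); // an arbitrary range within a single tensor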

Comment on lines +523 to +531
ggml_backend_registry() {
#ifdef GGML_USE_CUDA
    register_backend(ggml_backend_cuda_reg());
#endif

    register_backend(ggml_backend_cpu_reg());

    // TODO: sycl, metal, vulkan, kompute, cann
}
Collaborator:

Is there a meaning behind the order of backends, e.g. the priority with which they are used?

slaren (Member, Author) replied on Oct 1, 2024:

Functions like ggml_backend_dev_by_type choose the first device of the given type, so the order can make a difference.
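For example (a sketch; the exact enum value name is an assumption based on this revision of ggml-backend.h):

// returns the first registered GPU device, so CUDA wins whenever it is compiled in
ggml_backend_dev_t gpu = ggml_backend_dev_by_type(GGML_BACKEND_DEVICE_TYPE_GPU);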

slaren marked this pull request as ready for review on Oct 2, 2024.
ggerganov (Member) left a comment:

I've started adapting the Metal backend to the new interfaces and everything is working out smoothly. Feel free to merge this PR at any point and in the meantime I will continue the Metal implementation in #9713.

slaren merged commit c83ad6d into master on Oct 2, 2024
53 checks passed
LostRuins (Collaborator) commented:
Hi,
buffer type %s is not the default buffer type for device %s for async uploads

I'm a little confused by this error message, what exactly does it mean?

slaren (Member, Author) commented on Oct 6, 2024:

It's mostly there to prevent host buffers from being used with the incorrect backend. It is not an error; it is a debug message meant to help developers understand why async uploads are not being used. In llama.cpp it shouldn't be printed unless run with --verbose.
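Conceptually (a sketch of the check, not the actual implementation), the backend only takes the async upload path when the source buffer was allocated from the device's own host buffer type:

if (ggml_backend_buffer_get_type(buffer) == ggml_backend_dev_host_buffer_type(dev)) {
    // pinned host memory belonging to this device: async copy is safe
}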

slaren deleted the sl/backend-registry-2 branch on Oct 29, 2024.