Server: add support for "tool_calls" (MeetKai/functionary model) #5695
base: master
Conversation
Demo
GGUF model is downloaded from this link: https://huggingface.co/meetkai/functionary-small-v2.2-GGUF/tree/main
I'm using functionary-small-v2.2.q4_0.gguf in this demo.
Turn 1: User asks and the assistant wants to call a tool
Response:
Turn 2: Function is called and returns data to the assistant
Response:
The final conversation should look like this:
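The full request/response payloads live in the collapsed demo sections above. As a rough illustration of the shape of this exchange, here is a hedged sketch built with nlohmann::json (as in the server code); the weather tool, its arguments, and the model name are invented for this sketch, and field names follow the standard OpenAI chat-completions schema:

```cpp
// Illustrative sketch only: the tool, its arguments, and the replies are
// made up; the real payloads are in the collapsed demo sections of this PR.
#include <iostream>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

int main() {
    // Turn 1: the client sends the user question plus the available tools.
    json turn1_request = {
        {"model", "functionary-small-v2.2"},
        {"messages", json::array({
            {{"role", "user"}, {"content", "What is the weather in Hanoi?"}}
        })},
        {"tools", json::array({
            {{"type", "function"}, {"function", {
                {"name", "get_weather"},
                {"parameters", {
                    {"type", "object"},
                    {"properties", {{"location", {{"type", "string"}}}}}
                }}
            }}}
        })}
    };

    // The assistant replies with a "tool_calls" entry instead of plain content.
    // Turn 2: the client runs the function and sends the result back with
    // role "tool"; the assistant then produces the final natural-language answer.
    json turn2_tool_message = {
        {"role", "tool"},
        {"name", "get_weather"},
        {"content", R"({"temperature": "30C"})"}
    };

    std::cout << turn1_request.dump(2) << "\n"
              << turn2_tool_message.dump(2) << std::endl;
    return 0;
}
```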
@ggerganov @phymbert Could you take a bit of time to give me some input regarding the testing part? Thanks in advance!
This is an interesting application, but keep in mind I consider it low priority to merge in the short term as it adds even more functionality to
Thanks for the info. As I'm not expecting it to be merged very soon, my work has already been done in a self-contained manner to prevent conflicts with future reworks on the server side. I'll keep this PR in draft state though, as some parts are still missing. Will come back to this when the server code becomes more stable.
This is a super wonderful feature, exactly what I'm looking for! Without tool use, many other useful features would not be possible. Hoping the feature can be merged soon!
if (enable_tool_calls) {
    choices = llama_functionary::convert_response_to_oai_choices(content);
} else {
    choices = streaming
Is streaming mode not supported for tool_calls?
No, it is not, because convert_response_to_oai_choices can only parse a fully constructed response.
The code to throw an error in streaming mode is not implemented yet, but I left a // TODO: "enable_tool_calls" cannot be used with "stream" mode below it.
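For concreteness, a minimal sketch (not the PR's actual code) of what that guard could look like, assuming the request body is the nlohmann::json object the server already parses and reusing the enable_tool_calls flag visible in the diff; the function name is hypothetical:

```cpp
// Hedged sketch of the missing check: reject "stream": true whenever tool
// calls are enabled, since convert_response_to_oai_choices can only parse a
// fully constructed (non-streamed) response. Names other than
// enable_tool_calls are assumptions.
#include <stdexcept>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

static void validate_tool_call_request(const json & body, bool enable_tool_calls) {
    const bool streaming = body.value("stream", false);
    if (enable_tool_calls && streaming) {
        throw std::invalid_argument(
            "\"enable_tool_calls\" cannot be used with \"stream\" mode");
    }
}
```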
using json = nlohmann::json;
/**
 * A simple test program that allows testing functionary.hpp without using the server.
As it is a simple unit test, better to go with the ctest approach as in the root repo.
Yeah, you're right, it should be a ctest. The problem is that the ctest in the root CMakeLists is for the core library, not for the examples.
I believe I'll need to convert this file to a ctest anyway; maybe the ctest will run along with behave. I'll see what the best approach is when I have time to continue working on this PR.
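A possible shape for such a ctest-friendly functionary-test.cpp, sketched under assumptions: the functionary.hpp include and the convert_response_to_oai_choices signature are inferred from the diff hunk above, and the sample model output is a placeholder rather than the real functionary prompt format:

```cpp
// Sketch only: ctest treats a non-zero exit code as a failed test, so the
// existing test program mainly needs to return EXIT_FAILURE on mismatches.
// The input below is a placeholder, not real functionary-formatted output.
#include <cstdlib>
#include <iostream>
#include <string>
#include <nlohmann/json.hpp>

#include "functionary.hpp" // header added by this PR (assumed include path)

using json = nlohmann::json;

int main() {
    const std::string fake_model_output = "<placeholder functionary response>";

    // Assumed signature: raw model output in, OAI-style "choices" array out.
    json choices = llama_functionary::convert_response_to_oai_choices(fake_model_output);

    if (!choices.is_array()) {
        std::cerr << "expected a JSON array of choices" << std::endl;
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```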
In the CI server workflow, we can plug a ctest target only for the server. I will push you an example on a side branch.
More and more function-calling-capable models are becoming available:
Since Hermes 2 Pro has chosen OpenAI-compatible schemas for their training, an implementation on top of llama.cpp was pretty easy:
https://github.com/adrianliechti/llama/blob/main/pkg/adapter/hermesfn/adapter.go
I understand this might not be the most appropriate forum for my observation, but while researching llama.cpp, I noticed that the server component of this repository seems to receive a lot of attention. It appears to me that this might be breaking the principle of isolation of concerns. Would @ggerganov consider extracting the server component to a separate repository? If this issue has already been raised, could you please direct me to it? Thank you.
Hi, as you can see in the reference above, function calling has been merged via another pull request.
Support "tool_calls" OAI-compatible via MeetKai/functionary model
Motivation
Following my research in #5588, I tried to implement the ability to use https://github.com/MeetKai/functionary
The idea is that the user can use the same OAI "tool_calls" included in /v1/chat/completions to interact with the model. There will be a translation layer to convert OAI schema <==> prompt.
Implementation
My implementation is self-contained inside functionary.hpp, with a simple functionary-test.cpp which allows me to test it without make server.
The current call stack looks like this (without tool_calls):
With tool_calls enabled:
Upon loading the model, the template stored inside the model is read, and if it is functionary's template, tool_calls will be enabled automatically. No additional config is required.
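To make the translation layer concrete, here is a hedged sketch of the interface shape functionary.hpp could expose; only the llama_functionary namespace and convert_response_to_oai_choices appear in this PR's diff, while the other declaration and all exact signatures are illustrative assumptions:

```cpp
// Sketch of a possible functionary.hpp-style interface. Only the namespace
// and convert_response_to_oai_choices are taken from this PR's diff; the
// other name and the exact signatures are assumptions.
#pragma once

#include <string>
#include <nlohmann/json.hpp>

namespace llama_functionary {

// OAI request (messages + tools) ==> functionary prompt string
// (hypothetical helper for the request direction).
std::string convert_oai_to_prompt(const nlohmann::json & messages,
                                  const nlohmann::json & tools);

// Raw model output ==> OAI "choices" array, containing "tool_calls"
// entries when the model decided to call a function.
nlohmann::json convert_response_to_oai_choices(const std::string & content);

} // namespace llama_functionary
```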
For the demo, see the comment section.
Testing
For now, I have no idea how to test it in CI. These changes are needed:
- functionary-test.cpp in CI