
Reasoning models #143

Open · ultronozm opened this issue Jan 21, 2025 · 4 comments

Comments

@ultronozm (Contributor)

Have you seen https://api-docs.deepseek.com/guides/reasoning_model? It streams two types of responses: text and reasoning. I'd be interested in adding support, but it requires a bit of API revamping, so I figured I'd raise it here first.
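For reference, each streamed chunk from deepseek-reasoner carries the chain-of-thought in a reasoning_content field alongside the usual content field. A minimal sketch of dispatching one parsed delta on the Elisp side (only the two field names come from the DeepSeek docs; the plist shape and function names are hypothetical):

;; Dispatch one parsed streaming delta to separate handlers.
;; `reasoning_content' and `content' are the DeepSeek field names; the
;; plist representation and handler names are hypothetical.
(defun my-dispatch-delta (delta on-reasoning on-text)
  "Call ON-REASONING or ON-TEXT with the payload found in DELTA.
DELTA is a plist parsed from one streamed chunk."
  (let ((reasoning (plist-get delta :reasoning_content))
        (text (plist-get delta :content)))
    (when reasoning (funcall on-reasoning reasoning))
    (when text (funcall on-text text))))

;; During the thinking phase, only :reasoning_content is non-nil:
(my-dispatch-delta '(:reasoning_content "Checking the docs...")
                   (lambda (r) (message "[thinking] %s" r))
                   (lambda (c) (message "[text] %s" c)))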

@ahyatt (Owner)

ahyatt commented Jan 22, 2025

I haven't seen that particular document, but Gemini's reasoning model does the same thing. I've mentioned elsewhere the need to return a variety of things at the same time; that ability would be useful for all sorts of cases, from alternative outputs to image outputs. Tool use can also produce both text and tool data, as you mentioned in #139. I can imagine outputting a plist with meaningful keys instead of plain text. However, that is an incompatible change, so I'd like to find an approach that stays easy for clients while still allowing this extra data. If you have any ideas, I'd love to hear them!
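To make the compatibility question concrete, here is a sketch of the two callback shapes side by side; the plist keys are hypothetical, not part of the current llm API:

;; Today's partial callback receives the accumulated text as a string:
(defvar my-string-partial-callback
  (lambda (text) (message "%s" text)))

;; A plist-valued callback would instead receive keyed data, which
;; breaks existing callbacks; hence the search for a backward-compatible
;; route.  The keys here are hypothetical.
(defvar my-plist-partial-callback
  (lambda (update)
    (let ((text (plist-get update :text))
          (reasoning (plist-get update :reasoning)))
      (when reasoning (message "[thinking] %s" reasoning))
      (when text (message "%s" text)))))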

@ultronozm (Contributor · Author)

ultronozm commented Feb 3, 2025

How do you feel about something like:

(cl-defgeneric llm-chat-streaming (provider prompt partial-callback response-callback
                                            error-callback &key aux-callback)
  "Stream a response to PROMPT from PROVIDER.

(...)

  AUX-CALLBACK: an optional function that is called with a plist of any
    extra partial data.  For example, if the provider sees a
    \"reasoning_content\" field, it might call:
      (funcall aux-callback '(:reasoning \"some chain-of-thought\"))

(...)"
  (ignore provider prompt partial-callback response-callback error-callback aux-callback)
  (signal 'not-implemented nil))

This would be backwards-compatible, simple for typical uses, and extensible.
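For concreteness, a client call under this proposal might look like the following; my-provider stands for an already-configured llm provider, and :aux-callback is the proposed extension, not part of the current API:

(require 'llm)

;; Sketch of a client under the proposed signature.  Assumes
;; `my-provider' is an already-configured llm provider; :aux-callback
;; is the proposed extension, not part of the current API.
(llm-chat-streaming
 my-provider
 (llm-make-chat-prompt "Why is the sky blue?")
 (lambda (partial) (message "so far: %s" partial))
 (lambda (response) (message "done: %s" response))
 (lambda (err msg) (message "error %s: %s" err msg))
 :aux-callback
 (lambda (extra)
   (let ((reasoning (plist-get extra :reasoning)))
     (when reasoning (message "[thinking] %s" reasoning)))))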

@ahyatt (Owner)

ahyatt commented Feb 4, 2025

I'd like to understand the user experience people want beforehand. Is it important to stream the thinking, or is having it at the end enough? Having it at the end would give parity with other non-streamed output such as images. I do agree that an extra callback is a nice backward-compatible way to do this in general, though.

@ultronozm (Contributor · Author)

I would like to stream the thinking. In my personal package https://github.com/ultronozm/ai-org-chat.el, I would put the thinking in a :THINKING: drawer at the top of the LLM's response, where it would be filtered out of subsequent messages.

The point is to give immediate feedback about what's happening while waiting for the main text to start streaming.
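A minimal sketch of that drawer insertion, assuming the reasoning text arrives through something like the aux callback above (function and variable names hypothetical, marker bookkeeping simplified):

(defvar my-thinking-marker nil
  "Marker kept just before the :END: line of the open :THINKING: drawer.")

(defun my-begin-thinking-drawer ()
  "Insert an empty :THINKING: drawer at point and remember where to append."
  (insert ":THINKING:\n")
  (setq my-thinking-marker (point-marker))
  (insert "\n:END:\n"))

(defun my-append-thinking (text)
  "Append streamed reasoning TEXT inside the open drawer."
  (save-excursion
    (goto-char my-thinking-marker)
    (insert text)
    ;; A plain marker stays before text inserted at its position, so
    ;; move it past what was just written.
    (set-marker my-thinking-marker (point))))

Each reasoning chunk then shows up at the top of the response immediately, while the main text is still pending.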
