
Reasoning models #143

Open · ultronozm opened this issue Jan 21, 2025 · 4 comments

Comments

@ultronozm (Contributor)

Have you seen https://api-docs.deepseek.com/guides/reasoning_model? It streams two types of responses: text and reasoning. I'd be interested in adding support, but it requires a bit of API revamping, so I figured I'd raise it here first.
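For reference, each streamed chunk from deepseek-reasoner carries the chain-of-thought in a reasoning_content field alongside the usual content field. A minimal sketch of dispatching one parsed delta on the Elisp side (only the two field names come from the DeepSeek docs; the plist shape and function names are hypothetical):

;; Dispatch one parsed streaming delta to separate handlers.
;; `reasoning_content' and `content' are the DeepSeek field names; the
;; plist representation and handler names are hypothetical.
(defun my-dispatch-delta (delta on-reasoning on-text)
  "Call ON-REASONING or ON-TEXT with the payload found in DELTA.
DELTA is a plist parsed from one streamed chunk."
  (let ((reasoning (plist-get delta :reasoning_content))
        (text (plist-get delta :content)))
    (when reasoning (funcall on-reasoning reasoning))
    (when text (funcall on-text text))))

;; During the thinking phase, only :reasoning_content is non-nil:
(my-dispatch-delta '(:reasoning_content "Checking the docs...")
                   (lambda (r) (message "[thinking] %s" r))
                   (lambda (c) (message "[text] %s" c)))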

@ahyatt (Owner)

ahyatt commented Jan 22, 2025

I haven't seen that particular document, but Gemini's reasoning model does the same thing. I've mentioned elsewhere the need to return a variety of things at the same time; that ability would be useful for all sorts of cases, from alternative outputs to image outputs. Tool use can also produce both text and tool data, as you mentioned in #139. I can imagine outputting a plist with meaningful keys instead of plain text. However, that is an incompatible change, so I'd like to find an approach that stays easy for clients while still allowing this extra data. If you have any ideas, I'd love to hear them!
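To make the compatibility question concrete, here is a sketch of the two callback shapes side by side; the plist keys are hypothetical, not part of the current llm API:

;; Today's partial callback receives the accumulated text as a string:
(defvar my-string-partial-callback
  (lambda (text) (message "%s" text)))

;; A plist-valued callback would instead receive keyed data, which
;; breaks existing callbacks; hence the search for a backward-compatible
;; route.  The keys here are hypothetical.
(defvar my-plist-partial-callback
  (lambda (update)
    (let ((text (plist-get update :text))
          (reasoning (plist-get update :reasoning)))
      (when reasoning (message "[thinking] %s" reasoning))
      (when text (message "%s" text)))))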

@ultronozm (Contributor · Author)

ultronozm commented Feb 3, 2025

How do you feel about something like:

(cl-defgeneric llm-chat-streaming (provider prompt partial-callback response-callback
                                            error-callback &key aux-callback)
  "Stream a response to PROMPT from PROVIDER.

(...)

  AUX-CALLBACK: an optional function that is called with a plist of any
    extra partial data.  For example, if the provider sees a
    \"reasoning_content\" field, it might call:
      (funcall aux-callback '(:reasoning \"some chain-of-thought\"))

(...)"
  (ignore provider prompt partial-callback response-callback error-callback aux-callback)
  (signal 'not-implemented nil))

This would be backwards-compatible, simple for typical uses, and extensible.
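For concreteness, a client call under this proposal might look like the following; my-provider stands for an already-configured llm provider, and :aux-callback is the proposed extension, not part of the current API:

(require 'llm)

;; Sketch of a client under the proposed signature.  Assumes
;; `my-provider' is an already-configured llm provider; :aux-callback
;; is the proposed extension, not part of the current API.
(llm-chat-streaming
 my-provider
 (llm-make-chat-prompt "Why is the sky blue?")
 (lambda (partial) (message "so far: %s" partial))
 (lambda (response) (message "done: %s" response))
 (lambda (err msg) (message "error %s: %s" err msg))
 :aux-callback
 (lambda (extra)
   (let ((reasoning (plist-get extra :reasoning)))
     (when reasoning (message "[thinking] %s" reasoning)))))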

@ahyatt (Owner)

ahyatt commented Feb 4, 2025

I'd like to understand the user experience people want beforehand. Is it important to stream the thinking, or is having it at the end enough? Having it at the end would give parity with other non-streamed output such as images. I do agree that an extra callback is a nice backward-compatible way to do this in general, though.

@ultronozm (Contributor · Author)

I would like to stream the thinking. In my personal package https://github.com/ultronozm/ai-org-chat.el, I would put the thinking in a :THINKING: drawer at the top of the LLM's response, where it would be filtered out of subsequent messages.

The point is to give immediate feedback about what's happening while waiting for the main text to start streaming.
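A minimal sketch of that drawer insertion, assuming the reasoning text arrives through something like the aux callback above (function and variable names hypothetical, marker bookkeeping simplified):

(defvar my-thinking-marker nil
  "Marker kept just before the :END: line of the open :THINKING: drawer.")

(defun my-begin-thinking-drawer ()
  "Insert an empty :THINKING: drawer at point and remember where to append."
  (insert ":THINKING:\n")
  (setq my-thinking-marker (point-marker))
  (insert "\n:END:\n"))

(defun my-append-thinking (text)
  "Append streamed reasoning TEXT inside the open drawer."
  (save-excursion
    (goto-char my-thinking-marker)
    (insert text)
    ;; A plain marker stays before text inserted at its position, so
    ;; move it past what was just written.
    (set-marker my-thinking-marker (point))))

Each reasoning chunk then shows up at the top of the response immediately, while the main text is still pending.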
