
issue with tools + streaming + OpenAI on #133 #139

Closed
ultronozm opened this issue Jan 13, 2025 · 8 comments
Comments

@ultronozm (Contributor)

Evaluating

(let* ((provider (make-llm-openai
                  :key (exec-path-from-shell-getenv "OPENAI_KEY")
                  :chat-model "gpt-4")))
  (llm-tester-tool-use-streaming provider))

yields Debugger entered--Lisp error: (wrong-type-argument sequencep :null)

ahyatt added a commit that referenced this issue Jan 14, 2025
@ahyatt (Owner)

ahyatt commented Jan 14, 2025

Thanks for noticing this! It was indeed broken after the change to how we deserialize JSON.

@ahyatt ahyatt closed this as completed Jan 14, 2025
@ultronozm (Contributor, Author)

Thanks, looks good.

Here's a further issue in the same vein (which I'd be happy to file as a new issue on the same topic; let me know): evaluating

(let* ((provider (make-llm-openai
                  :key (exec-path-from-shell-getenv "OPENAI_KEY")
                  :chat-model "gpt-4"))
       (results nil)
       (add-fn (llm-make-tool-function
                :function (lambda (callback a b)
                            (let ((result (format "%s" (+ a b))))
                              (push (list :tool-call (cons (list a b) result)) results)
                              (funcall callback result)))
                :name "add"
                :description "Sums two numbers."
                :args '((:name "a" :description "A number." :type "integer" :required t)
                        (:name "b" :description "A number." :type "integer" :required t))
                :async t))
       (prompt (llm-make-chat-prompt
                (concat
                 "Tell a joke in ten words or less. "
                 "Then compute 2+3 and tell me what you got.")
                :tools (list add-fn))))
  (llm-chat-streaming provider prompt
                      (lambda (_partial)) (lambda (_final)) (lambda (_err _msg))))

yields a backtrace starting with Error running timer ‘plz--respond’: (wrong-type-argument sequencep t)

@ahyatt (Owner)

ahyatt commented Jan 15, 2025

The spec changed in the past week: instead of :required, we now have :optional. I believe that is the cause of your issue.
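For reference, the add tool from above rewritten for the new spec might look like the following sketch. This assumes (based on the comment above, not verified against the current source) that a required argument now simply omits the keyword, rather than setting :required t:

```elisp
;; Sketch only: assumes the new spec drops :required in favor of :optional,
;; so required arguments just omit the keyword entirely.
(llm-make-tool-function
 :function (lambda (callback a b)
             ;; Call back with the sum as a string.
             (funcall callback (format "%s" (+ a b))))
 :name "add"
 :description "Sums two numbers."
 :args '((:name "a" :description "A number." :type "integer")
         (:name "b" :description "A number." :type "integer"))
 :async t)
```

An explicitly optional argument would presumably add :optional t to its plist instead.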

@ultronozm (Contributor, Author)

Thanks. I tried what you suggested, but the issue seems to be more fundamental:

(let ((provider (make-llm-openai
                 :key (exec-path-from-shell-getenv "OPENAI_KEY")
                 :chat-model "gpt-4")))
  (llm-tester-chat-streaming provider))

=> Error running timer ‘plz--respond’: (wrong-type-argument arrayp t)

@ahyatt (Owner)

ahyatt commented Jan 16, 2025

Thanks, I can reproduce that. I'm reopening.

@ahyatt ahyatt reopened this Jan 16, 2025
ahyatt added a commit that referenced this issue Jan 16, 2025
@ahyatt ahyatt closed this as completed Jan 16, 2025
@ultronozm (Contributor, Author)

ultronozm commented Jan 16, 2025

Thanks. I'll record here (if you don't mind) a further issue on the same topic. The issue occurs for "gpt-4" but apparently not for newer models such as "gpt-4o". Evaluating

(let* ((provider (make-llm-openai
                  :key (exec-path-from-shell-getenv "OPENAI_KEY")
                  :chat-model "gpt-4"))
       (results nil)
       (add-fn (llm-make-tool-function
                :function (lambda (callback a b)
                            (let ((result (format "%s" (+ a b))))
                              (push (list :tool-call (cons (list a b) result)) results)
                              (funcall callback result)))
                :name "add"
                :description "Sums two numbers."
                :args '((:name "a" :description "A number." :type "integer" :required t)
                        (:name "b" :description "A number." :type "integer" :required t))
                :async t))
       (prompt (llm-make-chat-prompt
                (concat
                 "Tell a joke in ten words or less. "
                 "Then, compute 2+3 using the provided tool.")
                :tools
                (list add-fn)))
       responses done)
  (push (list :provider (symbol-name (type-of provider))) results)
  (push (list :chat-model (llm-openai-chat-model provider)) results)
  (push (list :prompt (copy-sequence prompt)) results)
  (llm-chat-streaming
   provider prompt
   (lambda (partial)
     (push (list :partial partial) results))
   (lambda (final)
     (push (list :final final) results)
     (push (list :prompt-after (copy-sequence prompt)) results)
     (setq done t))
   (lambda (err msg)
     (push (list :error err msg) results)
     (setq done t)))
  (while (not done)
    (sleep-for 0.1))
  (setq results (nreverse results))
  (pp-display-expression results "*test*"))

yields, e.g.,

((:provider "llm-openai") (:chat-model "gpt-4")
 (:prompt
  #s(llm-chat-prompt nil nil
                     (#s(llm-chat-prompt-interaction user
                                                     "Tell a joke in ten words or less. Then, compute 2+3 using the provided tool."
                                                     nil))
                     (#s(llm-tool-function
                         #[(callback a b)
                           ((let ((result (format "%s" (+ a b))))
                              (setq results
                                    (cons
                                     (list :tool-call
                                           (cons (list a b) result))
                                     results))
                              (funcall callback result)))
                           nil]
                         "add" "Sums two numbers."
                         ((:name "a" :description "A number." :type
                                 "integer" :required t)
                          (:name "b" :description "A number." :type
                                 "integer" :required t))
                         t))
                     nil nil nil nil))
 (:partial "Why") (:partial "Why don") (:partial "Why don't")
 (:partial "Why don't scientists")
 (:partial "Why don't scientists trust")
 (:partial "Why don't scientists trust atoms")
 (:partial "Why don't scientists trust atoms?")
 (:partial "Why don't scientists trust atoms? They")
 (:partial "Why don't scientists trust atoms? They make")
 (:partial "Why don't scientists trust atoms? They make up")
 (:partial "Why don't scientists trust atoms? They make up everything")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\n")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow,")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's add")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's add ")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's add 2")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's add 2 and")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's add 2 and ")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's add 2 and 3")
 (:partial
  "Why don't scientists trust atoms? They make up everything.\n\nNow, let's add 2 and 3.\n")
 (:tool-call ((2 3) . "5")) (:final (("add" . "5")))
 (:prompt-after
  #s(llm-chat-prompt nil nil
                     (#s(llm-chat-prompt-interaction user
                                                     "Tell a joke in ten words or less. Then, compute 2+3 using the provided tool."
                                                     nil)
                        #s(llm-chat-prompt-interaction assistant
                                                       (#s(llm-provider-utils-tool-use
                                                           "call_TcvKlZyGP1NndBE03XUwZAsM"
                                                           "add"
                                                           ((a . 2)
                                                            (b . 3))))
                                                       nil)
                        #s(llm-chat-prompt-interaction tool-results
                                                       nil
                                                       (#s(llm-chat-prompt-tool-result
                                                           "call_TcvKlZyGP1NndBE03XUwZAsM"
                                                           "add" "5"))))
                     (#s(llm-tool-function
                         #[(callback a b)
                           ((let ((result (format "%s" (+ a b))))
                              (setq results
                                    (cons
                                     (list :tool-call
                                           (cons (list a b) result))
                                     results))
                              (funcall callback result)))
                           nil]
                         "add" "Sums two numbers."
                         ((:name "a" :description "A number." :type
                                 "integer" :required t)
                          (:name "b" :description "A number." :type
                                 "integer" :required t))
                         t))
                     nil nil nil nil)))

The main issue is that the AI's partial text responses preceding the tool call are not logged in the conversation history (see :prompt-after). On a related note, there is no "final" text callback generated by llm.

@ahyatt (Owner)

ahyatt commented Jan 17, 2025

I don't think this last issue is going to work as things stand: we either do a tool call or return text, not both. I'm thinking of adding more flexibility in the future to change this, because there are a few situations in which there are essentially multiple different kinds of output. But that will probably mean a breaking change, or at least a significant addition to the API.

@ultronozm (Contributor, Author)

Gotcha, thanks for the explanation. I think it's something worth pursuing, even if it requires a breaking change. Also, it's not just about older models like gpt-4: for instance, the latest Sonnet also streams text before tool calls.
