
Make use of the new stream feature from Wllama to simplify the code #980

Merged
merged 1 commit into main from update-text-generation-with-wllama on Mar 14, 2025

Conversation

felladrin (Owner)

Changes Made

Removed the ChatCompletionOptions import since we're no longer using the onNewToken callback approach.

Added a standard JavaScript AbortController to handle interruption of the text generation process.

Implemented the new streaming API (see the sketch after this list) by:

  • Adding stream: true to the createChatCompletion options
  • Using abortSignal: abortController.signal for interruption control
  • Processing the returned AsyncIterator with a for await...of loop

Simplified the state management by setting the generating state once before starting the stream processing.
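
For reference, here is a minimal sketch of the resulting call shape in TypeScript. Only the stream and abortSignal options and the chunk.currentText field are taken from this PR's diff; the wllama object below is a hypothetical stand-in typed to match that usage, and the exact createChatCompletion signature should be checked against the Wllama docs.

// Hypothetical stand-in, typed to match only what this PR's diff shows.
declare const wllama: {
  createChatCompletion(
    messages: { role: "system" | "user" | "assistant"; content: string }[],
    options: { stream: true; abortSignal?: AbortSignal },
  ): Promise<AsyncIterable<{ currentText: string }>>;
};

const abortController = new AbortController();
const onUpdate = (text: string) => console.log(text); // stand-in UI callback

const stream = await wllama.createChatCompletion(
  [{ role: "user", content: "Hello!" }],
  {
    stream: true, // return an async iterator instead of a final string
    abortSignal: abortController.signal, // standard interruption hook
  },
);

let streamedMessage = "";

for await (const chunk of stream) {
  // Per the diff, chunk.currentText carries the full text generated so far.
  streamedMessage = chunk.currentText;
  onUpdate(streamedMessage);
}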

Benefits of This Approach

More efficient streaming: The pull-based AsyncIterator approach lets the consumer control pacing and follows modern JavaScript patterns.

Cleaner interruption handling: Using the standard AbortController pattern makes the code more maintainable and consistent with other JavaScript APIs (see the sketch after this list).

Better resource management: The streaming approach may reduce memory usage, since chunks are processed as they arrive.

Simplified code: Removed the callback-based approach in favor of a more straightforward async/await pattern.
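
To make the interruption point concrete, here is a hypothetical stop-button wiring; only AbortController and AbortSignal are standard APIs here, and none of this UI code appears in the PR.

// Hypothetical wiring: one abort() call interrupts every consumer of the signal.
const abortController = new AbortController();

// Any code holding the signal can observe cancellation:
abortController.signal.addEventListener("abort", () => {
  console.log("Text generation interrupted");
});

// A hypothetical stop button triggers it:
const stopButton = document.querySelector("#stop-button") as HTMLButtonElement;
stopButton.addEventListener("click", () => abortController.abort());

This mirrors how the generation loop in this PR reacts: the abortSignal passed to createChatCompletion ends the stream once abort() is called.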

Contributor

Review

The changes in this PR are well-implemented and follow modern JavaScript patterns, making the code more maintainable and efficient. Here are a few minor points to consider:

Potential Improvements

  1. Error Handling in Async Iterator:
    • Line: for await (const chunk of stream) {
    • Issue: If an error occurs during the streaming process, it might not be caught by the existing try-catch block.
    • Suggestion: Ensure that the for await...of loop is wrapped in a try-catch block to handle any errors that might occur during streaming.

Example Fix

diff --git a/client/modules/textGenerationWithWllama.ts b/client/modules/textGenerationWithWllama.ts
index 4418f056..7f2b3e4a 100644
--- a/client/modules/textGenerationWithWllama.ts
+++ b/client/modules/textGenerationWithWllama.ts
@@ -105,29 +105,35 @@ async function generateWithWllama({
       },
     );
 
-    if (shouldCheckCanRespond && getTextGenerationState() !== "generating") {
-      updateTextGenerationState("generating");
-    }
+    try {
+      if (shouldCheckCanRespond && getTextGenerationState() !== "generating") {
+        updateTextGenerationState("generating");
+      }
 
-    for await (const chunk of stream) {
-      if (shouldCheckCanRespond && getTextGenerationState() === "interrupted") {
-        abortController.abort();
-        throw new ChatGenerationError("Chat generation interrupted");
-      }
+      for await (const chunk of stream) {
+        if (shouldCheckCanRespond && getTextGenerationState() === "interrupted") {
+          abortController.abort();
+          throw new ChatGenerationError("Chat generation interrupted");
+        }
 
-      streamedMessage = handleWllamaCompletion(
-        model,
-        chunk.currentText,
-        () => abortController.abort(),
-        onUpdate,
-      );
-    }
+        streamedMessage = handleWllamaCompletion(
+          model,
+          chunk.currentText,
+          () => abortController.abort(),
+          onUpdate,
+        );
+      }
+    } catch (error) {
+      addLogEntry(
+        `Error during text generation: ${error.message}`,
+        "error",
+      );
+      throw error;
+    }
 
     return streamedMessage;
   } catch (error) {
     addLogEntry(
       `Error during text generation: ${error.message}`,
       "error",
     );
     throw error;
   }
 }

Conclusion

Overall, the PR simplifies the code and improves its maintainability by adopting modern JavaScript patterns. The suggested improvement ensures that errors during the streaming process are properly handled.

felladrin marked this pull request as ready for review on March 14, 2025 at 09:16
felladrin merged commit c21fbdd into main on Mar 14, 2025 (3 checks passed)
felladrin deleted the update-text-generation-with-wllama branch on March 14, 2025 at 09:16