Make use of the new stream feature from Wllama to simplify the code #980
Changes Made
Removed the ChatCompletionOptions import since we're no longer using the onNewToken callback approach.
Added a standard JavaScript AbortController to handle interruption of the text generation process.
Implemented the new streaming API, iterating over the returned AsyncIterator and processing chunks as they arrive (see the sketch after this list).
Simplified the state management by setting the generating state once before starting the stream processing.
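The following is a minimal sketch of the pattern described above, assuming the stream feature is enabled through an option on createCompletion and yields chunks exposing the text generated so far. The option names (stream, abortSignal), the chunk field (currentText), and the helper names (generate, stopGeneration, onUpdate) are illustrative assumptions, not confirmed Wllama signatures.

```ts
// Hedged sketch of the streaming + AbortController approach.
import { Wllama } from '@wllama/wllama';

let abortController = new AbortController();

async function generate(
  wllama: Wllama,
  prompt: string,
  onUpdate: (text: string) => void,
) {
  // Fresh controller per run so a previous abort does not leak into this one.
  abortController = new AbortController();

  // Assumed: `stream: true` makes createCompletion return an AsyncIterator,
  // and `abortSignal` wires interruption into the generation loop.
  const stream = await wllama.createCompletion(prompt, {
    nPredict: 512,
    stream: true,
    abortSignal: abortController.signal,
  });

  // Consume chunks as they arrive instead of registering an onNewToken callback.
  for await (const chunk of stream) {
    onUpdate(chunk.currentText); // assumed chunk shape
  }
}

// Interrupt an in-flight generation from anywhere in the UI.
function stopGeneration() {
  abortController.abort();
}
```

Because the generating state is set once before the for await loop starts, the UI no longer has to be updated from inside a callback on every token.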
Benefits of This Approach
More efficient streaming: consuming an AsyncIterator avoids the per-token callback indirection and follows modern JavaScript patterns.
Cleaner interruption handling: Using the standard AbortController pattern makes the code more maintainable and consistent with other JavaScript APIs.
Better resource management: The streaming approach can potentially reduce memory usage since we're processing chunks as they arrive.
Simplified code: Removed the callback-based approach in favor of a more straightforward async/await pattern.
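For contrast, a hedged sketch of the callback-based pattern this change removes. The onNewToken signature shown here, including the optionals object with an abort hook, is an assumption for illustration and may differ from the exact ChatCompletionOptions shape.

```ts
// Previous style: per-token callback plus a manual interruption hook.
import { Wllama } from '@wllama/wllama';

async function generateWithCallback(
  wllama: Wllama,
  prompt: string,
  onUpdate: (text: string) => void,
  shouldStop: () => boolean,
) {
  await wllama.createCompletion(prompt, {
    nPredict: 512,
    onNewToken: (_token, _piece, currentText, optionals) => {
      onUpdate(currentText);               // UI update driven from the callback
      if (shouldStop()) optionals.abortSignal(); // assumed manual abort hook
    },
  });
}
```

The streaming version replaces this inversion of control with a plain for await loop and a standard AbortController, which is what keeps the new code shorter and easier to follow.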