Support text-to-speech (TTS) streaming functionality #177
Comments
OpenAI doesn't seem to have native streaming APIs for Audio (see https://platform.openai.com/docs/api-reference/audio). Can you elaborate on what you mean by streaming functionality for TTS and STT? |
From what I can tell, this might be a documentation oversight? https://help.openai.com/en/articles/8555505-tts-api mentions:
And people report that it is indeed working for them: openai/openai-python#864. It also matches the conversational behavior of the iOS app, if you have tried that. I did a couple of Postman requests, which returned a "Transfer-Encoding: chunked" header in the response, so streaming might work out of the box without any specific "stream" key set to true. If that is the case, the Rust library would need to expose it. I have not tried the transcriptions endpoint, so I cannot comment on that yet. I can do some further research and share back, unless you are already tackling it. |
Thank you for sharing additional information. The help article suggests that they have officially made it public, so I assume it's safe to treat it as a public feature rather than an internal feature flag in their API. I'm not tackling it, since I only learned about it from your comment. Given that, you're welcome to send a PR! In addition, a working example for this would be very helpful for me to test with and for other folks to use. |
The upstream spec was updated after I released v0.18.0, so it may or may not include this, but it is worth a look. |
Ok great, I should be able to push a PR tomorrow I think (for the speech_stream endpoint initially and can do a second PR for the transcribe endpoint after that?) |
Sounds like a good plan to me, thank you for offering to contribute! |
Just added a PR - I had a quick look for the STT use case - I don't think that streaming is actually supported for the transcription endpoint, looking at the OpenAI documentation and openai-python code. I will drop it from the scope for now and reduce the scope to TTS streaming. |
Thanks for adding this, looking forward to using this :) Do you know by any chance how to set |
@Boscop I am not familiar with There is actually an example of that behavior in |
Unclear on how to move forward at this point as feedback on pull request cannot be actioned. Marking this as won't fix for now - happy to restart the thread if the conditions change. |
Thank you for your contributions. I'll update the contribution guidelines with minimum expectations, including testing, documentation, etc., for basic hygiene. That should close the communication gap in the project. It's easy to adopt an "if it compiles, it works" philosophy in Rust, but as we found in the PR for this, that's not always the case. I'm sorry that you had a poor experience here. I agree that my last comment on the PR was not actionable, and I'm sorry about that. If you wish to, you're very welcome to continue. To help get your work shipped, I gave it another review and left a comment. Of the options you listed, I think (3) is the most appropriate. I hope you continue, and I'd be happy to see your work get shipped. Thank you again for your contributions! |
Updated guidelines: https://github.com/64bit/async-openai#contributing This issue falls outside the official docs API Reference and OpenAPI spec, and since you already worked on it before guidelines were in place you're welcome to get it shipped. Please feel free to reach out if you have any concerns. |
Problem
The current speech(...) and transcribe(...) functions, part of the Audio implementation, do not support a streaming mode. A streaming mode is particularly useful for real-time applications that need to feel interactive.
Proposal
Implement speech_stream and transcribe_stream, mimicking the create_stream functionality.
Is someone already working on this? If not, I can give it a go.
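To make the proposal concrete, here is a rough sketch of the difference between buffering a whole audio response and consuming it piece by piece. It is a std-only, synchronous stand-in: `SpeechChunks` is a hypothetical name, not part of async-openai, and a real `speech_stream` would presumably return an async stream of `Result` items the way `create_stream` does, rather than a blocking iterator.

```rust
use std::io::{self, Read};

/// Hypothetical helper (not an async-openai type): yields audio bytes in
/// pieces as they become available from `reader`, instead of waiting for
/// the full response body.
pub struct SpeechChunks<R: Read> {
    reader: R,
    buf: Vec<u8>,
    done: bool,
}

impl<R: Read> SpeechChunks<R> {
    /// `chunk_size` is the maximum number of bytes returned per item.
    pub fn new(reader: R, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        Self { reader, buf: vec![0; chunk_size], done: false }
    }
}

impl<R: Read> Iterator for SpeechChunks<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            if self.done {
                return None;
            }
            match self.reader.read(&mut self.buf) {
                // A zero-length read means the body is finished.
                Ok(0) => {
                    self.done = true;
                    return None;
                }
                // Hand the caller only the bytes that were actually read.
                Ok(n) => return Some(Ok(self.buf[..n].to_vec())),
                // Retry reads that were interrupted by a signal.
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                // Surface any other error once, then stop iterating.
                Err(e) => {
                    self.done = true;
                    return Some(Err(e));
                }
            }
        }
    }
}
```

A caller could wrap any `Read` source (a response body, or `std::io::Cursor` in a test) and forward each chunk to an audio sink as soon as it arrives, which is the interactivity benefit described in the Problem section.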