[Feature Request] OpenAI Realtime API #5672

lloydzhou · 2024-10-15T06:22:17Z

🥰 Feature description

https://openai.com/index/introducing-the-realtime-api/

https://platform.openai.com/docs/api-reference/realtime

https://github.com/openai/openai-realtime-console/blob/main/readme/realtime-console-demo.png

🧐 Proposed solution

Logic

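The Realtime API is accessed over WebSocket.
The API has built-in concepts such as sessions and conversations. A session can be configured with modalities, instructions, voice, input_audio_format, output_audio_format, turn_detection, input_audio_transcription, tools, and so on, and function calling is supported.
Audio is uploaded with input_audio_buffer.append and input_audio_buffer.commit, and response.create then starts generating the result (if turn_detection is enabled, these do not need to be called manually).
The client can send conversation.item.create to add context directly to the current conversation; for history records, status=completed must be set.
conversation.item.truncate supports interrupting input.
Listen for response.audio.delta to receive base64 audio data, and for response.text.delta to receive the text in sync.
Listen for response.output_item.added to find out whether the item is a function call, and for response.function_call_arguments.delta to receive the function call arguments. Or should the function call information be read directly from response.done?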
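The client-side events listed above could be built roughly as follows. This is only a sketch of the JSON payloads, not project code: the helper names (`audio_append_event`, `history_item_event`, `truncate_event`, `manual_turn`) are hypothetical, the field layout follows my reading of the Realtime API reference, and actually sending the messages over a WebSocket is left out.

```python
import base64
import json


def audio_append_event(pcm_chunk: bytes) -> dict:
    """input_audio_buffer.append carries the audio chunk as base64 text."""
    return {
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    }


def history_item_event(role: str, text: str) -> dict:
    """conversation.item.create for a past message, marked status=completed."""
    # Assumption: user text uses "input_text", assistant text uses "text".
    content_type = "input_text" if role == "user" else "text"
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "status": "completed",
            "role": role,
            "content": [{"type": content_type, "text": text}],
        },
    }


def truncate_event(item_id: str, audio_end_ms: int) -> dict:
    """conversation.item.truncate, cutting an item's audio at audio_end_ms."""
    return {
        "type": "conversation.item.truncate",
        "item_id": item_id,
        "content_index": 0,
        "audio_end_ms": audio_end_ms,
    }


def manual_turn(pcm_chunks: list[bytes]) -> list[str]:
    """Messages for one manual turn: append chunks, commit, then response.create."""
    events = [audio_append_event(chunk) for chunk in pcm_chunks]
    events.append({"type": "input_audio_buffer.commit"})
    events.append({"type": "response.create"})
    return [json.dumps(event) for event in events]
```

With turn_detection enabled, the commit and response.create messages at the end of `manual_turn` would be unnecessary; only the append events would be sent.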

Interaction

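We may add a voice interaction page, like the one in the OpenAI client, that calls the Realtime API directly.
The voice interaction view is full-screen by default and can be shrunk to the size of the input box (taking the input box's place). The voice input view and the chat history page are both kept (keeping the history lets us show intermediate results produced by plugins; for example, an image generated by a plugin mid-conversation cannot be described directly by voice).
The results generated during a voice call (audio buffer), together with the text received alongside them, need to be persisted into sessions.
Voice calls support choosing the voice, format, detection mode, tools, and so on (these buttons need to be kept, or laid out again in the voice view).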
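One way to collect the audio buffer and text of a voice turn before persisting them is to fold the server events into a single record. The sketch below assumes the event names from the Logic section and the `delta`, `item`, and `name` fields as I understand them from the API reference; `VoiceTurn` and `apply_server_event` are hypothetical names, and how the record is written into a session is not shown.

```python
import base64
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VoiceTurn:
    """Hypothetical record for one assistant turn, to be stored in a session."""
    text: str = ""
    audio: bytearray = field(default_factory=bytearray)
    function_name: Optional[str] = None
    function_arguments: str = ""


def apply_server_event(turn: VoiceTurn, event: dict) -> VoiceTurn:
    """Fold one server event into the turn; unknown event types are ignored."""
    kind = event.get("type")
    if kind == "response.audio.delta":
        # Audio arrives as base64 text; store the decoded bytes.
        turn.audio.extend(base64.b64decode(event["delta"]))
    elif kind == "response.text.delta":
        turn.text += event["delta"]
    elif kind == "response.output_item.added":
        item = event.get("item", {})
        if item.get("type") == "function_call":
            turn.function_name = item.get("name")
    elif kind == "response.function_call_arguments.delta":
        # Arguments stream as fragments of a JSON string.
        turn.function_arguments += event["delta"]
    return turn
```

Once the response is finished, the accumulated `text`, `audio`, and any function call data can be saved together as one chat history entry.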

Discussion

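Realtime is a new model, but it is clearly not on the same footing as the earlier models. Where should it be placed?
The Realtime API also supports setting modalities to text only, which disables audio (only the audio is disabled; the model is still called through the full WebSocket flow).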
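As a rough illustration of the text-only case, a session.update message could be built like this. The function name `session_update` and its parameters are hypothetical, and dropping `voice` when audio is not requested is a choice made for this sketch, not something the API is known to require.

```python
import json

ALLOWED_MODALITIES = {"text", "audio"}


def session_update(modalities, instructions="", voice=None) -> str:
    """Build a session.update message for the given modalities."""
    unknown = set(modalities) - ALLOWED_MODALITIES
    if unknown:
        raise ValueError(f"unsupported modalities: {sorted(unknown)}")
    session = {"modalities": list(modalities), "instructions": instructions}
    # Sketch choice: only include a voice when audio output is requested.
    if "audio" in modalities and voice:
        session["voice"] = voice
    return json.dumps({"type": "session.update", "session": session})


# Text-only session: the same WebSocket flow, with no audio output requested.
text_only = session_update(["text"], instructions="Answer briefly.")
```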

📝 Additional information

Pricing

Dogtiti · 2024-11-07T13:34:54Z

#5786

Dogtiti · 2024-11-11T03:44:26Z

Configure the parameters in the settings panel.

Dogtiti · 2024-11-11T06:21:44Z

Adding context content and chat history is not supported for now.
https://community.openai.com/t/realtime-api-did-anybody-managed-to-provide-previous-conversation-transcript-history-while-keeping-audio-answers/968293

kitaev-chen · 2024-11-11T18:53:31Z

Is there a free model available for this? I had been chatting for less than a minute and it had already cost $0.1.

dustookk · 2024-11-29T09:17:06Z

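I configured my own parameters using the configuration method above.

Realtime cannot be started: the microphone stays disabled and cannot be enabled.

update:

  It turned out that the Azure deployment name had an extra leading space. After fixing that, it worked when tested on a computer.
  
  On a phone it still does not work, though; a packet capture shows no wss:// requests to either Azure or OpenAI.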
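
A stray space like this is easy to guard against by trimming user-entered settings before they are used. The following is a minimal sketch only; `clean_setting` and `deployment_query` are hypothetical helpers, and the `deployment` query parameter is used purely for illustration.

```python
from urllib.parse import urlencode


def clean_setting(value: str) -> str:
    """Strip surrounding whitespace from a user-entered setting; reject empty values."""
    cleaned = value.strip()
    if not cleaned:
        raise ValueError("setting must not be empty")
    return cleaned


def deployment_query(deployment: str) -> str:
    """Build a hypothetical `deployment=...` query string from the cleaned value."""
    return urlencode({"deployment": clean_setting(deployment)})
```

Without the trimming step, `urlencode` would encode the leading space as `+`, so the request would name a deployment that does not exist.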

#5825

qq1456680570 · 2024-12-27T07:53:36Z

I'd like to be able to customize the endpoint URL for realtime chat.

jayjayhust · 2025-01-27T14:45:19Z

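I'd like to be able to customize the endpoint URL for realtime chat.

Yes, MiniMax has also opened a realtime interface. I'd like to be able to customize the endpoint URL and choose between different realtime API services: https://platform.minimaxi.com/document/Realtime?key=640e0c9c5f918b4f6c4e2d58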

lloydzhou added the enhancement New feature or request label Oct 15, 2024

lloydzhou assigned ElricLiu, Dogtiti and Leizhenpeng Oct 15, 2024

lloydzhou mentioned this issue Oct 16, 2024

[Feature Request] Realtime Voice API support #5593

Closed

Dogtiti mentioned this issue Oct 20, 2024

[Feature Request] #5688

Closed

coderabbitai bot mentioned this issue Nov 11, 2024

Feature/realtime chat #5786

Merged

10 tasks
