[Azure] Support Realtime API #1287

Open
wants to merge 2 commits into
base: next
22 changes: 21 additions & 1 deletion README.md
@@ -499,7 +499,7 @@ const credential = new DefaultAzureCredential();
const scope = 'https://cognitiveservices.azure.com/.default';
const azureADTokenProvider = getBearerTokenProvider(credential, scope);

-const openai = new AzureOpenAI({ azureADTokenProvider });
+const openai = new AzureOpenAI({ azureADTokenProvider, apiVersion: "<The API version, e.g. 2024-10-01-preview>" });

const result = await openai.chat.completions.create({
model: 'gpt-4o',
@@ -509,6 +509,26 @@ const result = await openai.chat.completions.create({
console.log(result.choices[0]!.message?.content);
```

### Realtime API
This SDK provides real-time streaming capabilities for Azure OpenAI through the `OpenAIRealtimeWS` and `OpenAIRealtimeWebSocket` clients described previously.

To use the Realtime API, create a fully configured `AzureOpenAI` client and pass it to either `OpenAIRealtimeWS.azure` or `OpenAIRealtimeWebSocket.azure`. For example:

```ts
const cred = new DefaultAzureCredential();
const scope = 'https://cognitiveservices.azure.com/.default';
const deploymentName = 'gpt-4o-realtime-preview-1001';
const azureADTokenProvider = getBearerTokenProvider(cred, scope);
const client = new AzureOpenAI({
  azureADTokenProvider,
  apiVersion: '2024-10-01-preview',
  deployment: deploymentName,
});
const rt = await OpenAIRealtimeWS.azure(client);
```

Once the instance has been created, you can send client events with `rt.send(...)` and handle streamed server events with `rt.on(...)`.
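As a minimal sketch of what sending requests can look like, the helper below builds the three client events that the Azure realtime examples in this PR send once the socket opens. `buildTextTurn` and the `RealtimeTextEvent` type are hypothetical names introduced here for illustration; they are not part of the SDK, and the event shapes are simplified copies of those used in the examples.

```ts
// Hypothetical helper (not part of the SDK): simplified shapes of the
// client events used in the realtime examples.
type RealtimeTextEvent =
  | { type: 'session.update'; session: { modalities: Array<'text' | 'audio'>; model: string } }
  | {
      type: 'conversation.item.create';
      item: {
        type: 'message';
        role: 'user';
        content: Array<{ type: 'input_text'; text: string }>;
      };
    }
  | { type: 'response.create' };

// Builds the events for one text-only user turn, in the order the examples
// send them: configure the session, add the user message, request a response.
function buildTextTurn(model: string, text: string): RealtimeTextEvent[] {
  return [
    { type: 'session.update', session: { modalities: ['text'], model } },
    {
      type: 'conversation.item.create',
      item: { type: 'message', role: 'user', content: [{ type: 'input_text', text }] },
    },
    { type: 'response.create' },
  ];
}
```

Assuming `rt` is the instance created above, each event could then be passed to `rt.send(...)` from the socket's `open` handler, with `rt.on('response.text.delta', ...)` receiving the streamed text.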

### Retries

Certain errors will be automatically retried 2 times by default, with a short exponential backoff.
3 changes: 2 additions & 1 deletion examples/azure.ts → examples/azure/chat.ts
@@ -2,6 +2,7 @@

import { AzureOpenAI } from 'openai';
import { getBearerTokenProvider, DefaultAzureCredential } from '@azure/identity';
import 'dotenv/config';

// Corresponds to your Model deployment within your OpenAI resource, e.g. gpt-4-1106-preview
// Navigate to the Azure OpenAI Studio to deploy a model.
@@ -13,7 +14,7 @@ const azureADTokenProvider = getBearerTokenProvider(credential, scope);

// Make sure to set AZURE_OPENAI_ENDPOINT with the endpoint of your Azure resource.
// You can find it in the Azure Portal.
-const openai = new AzureOpenAI({ azureADTokenProvider });
+const openai = new AzureOpenAI({ azureADTokenProvider, apiVersion: '2024-10-01-preview' });

async function main() {
console.log('Non-streaming:');
60 changes: 60 additions & 0 deletions examples/azure/realtime/websocket.ts
@@ -0,0 +1,60 @@
import { OpenAIRealtimeWebSocket } from 'openai/beta/realtime/websocket';
import { AzureOpenAI } from 'openai';
import { DefaultAzureCredential, getBearerTokenProvider } from '@azure/identity';
import 'dotenv/config';

async function main() {
const cred = new DefaultAzureCredential();
const scope = 'https://cognitiveservices.azure.com/.default';
const deploymentName = 'gpt-4o-realtime-preview-1001';
const azureADTokenProvider = getBearerTokenProvider(cred, scope);
const client = new AzureOpenAI({
azureADTokenProvider,
apiVersion: '2024-10-01-preview',
deployment: deploymentName,
});
const rt = await OpenAIRealtimeWebSocket.azure(client);

// access the underlying `WebSocket` instance
rt.socket.addEventListener('open', () => {
console.log('Connection opened!');
rt.send({
type: 'session.update',
session: {
modalities: ['text'],
model: 'gpt-4o-realtime-preview',
},
});

rt.send({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{ type: 'input_text', text: 'Say a couple paragraphs!' }],
},
});

rt.send({ type: 'response.create' });
});

rt.on('error', (err) => {
// in a real-world scenario this should be logged somewhere, as you
// likely want to continue processing events regardless of any errors
throw err;
});

rt.on('session.created', (event) => {
console.log('session created!', event.session);
console.log();
});

rt.on('response.text.delta', (event) => process.stdout.write(event.delta));
rt.on('response.text.done', () => console.log());

rt.on('response.done', () => rt.close());

rt.socket.addEventListener('close', () => console.log('\nConnection closed!'));
}

main();
67 changes: 67 additions & 0 deletions examples/azure/realtime/ws.ts
@@ -0,0 +1,67 @@
import { DefaultAzureCredential, getBearerTokenProvider } from '@azure/identity';
import { OpenAIRealtimeWS } from 'openai/beta/realtime/ws';
import { AzureOpenAI } from 'openai';
import 'dotenv/config';

async function main() {
const cred = new DefaultAzureCredential();
const scope = 'https://cognitiveservices.azure.com/.default';
const deploymentName = 'gpt-4o-realtime-preview-1001';
const azureADTokenProvider = getBearerTokenProvider(cred, scope);
const client = new AzureOpenAI({
azureADTokenProvider,
apiVersion: '2024-10-01-preview',
deployment: deploymentName,
});
const rt = await OpenAIRealtimeWS.azure(client);

// access the underlying `ws.WebSocket` instance
rt.socket.on('open', () => {
console.log('Connection opened!');
rt.send({
type: 'session.update',
session: {
modalities: ['text'],
model: 'gpt-4o-realtime-preview',
},
});
rt.send({
type: 'session.update',
session: {
modalities: ['text'],
model: 'gpt-4o-realtime-preview',
},
});

rt.send({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{ type: 'input_text', text: 'Say a couple paragraphs!' }],
},
});

rt.send({ type: 'response.create' });
});

rt.on('error', (err) => {
// in a real-world scenario this should be logged somewhere, as you
// likely want to continue processing events regardless of any errors
throw err;
});

rt.on('session.created', (event) => {
console.log('session created!', event.session);
console.log();
});

rt.on('response.text.delta', (event) => process.stdout.write(event.delta));
rt.on('response.text.done', () => console.log());

rt.on('response.done', () => rt.close());

rt.socket.on('close', () => console.log('\nConnection closed!'));
}

main();
1 change: 1 addition & 0 deletions examples/package.json
@@ -7,6 +7,7 @@
"private": true,
"dependencies": {
"@azure/identity": "^4.2.0",
"dotenv": "^16.4.7",
"express": "^4.18.2",
"next": "^14.1.1",
"openai": "file:..",
2 changes: 1 addition & 1 deletion examples/realtime/ws.ts
@@ -9,7 +9,7 @@ async function main() {
rt.send({
type: 'session.update',
session: {
-modalities: ['foo'] as any,
+modalities: ['text'],
model: 'gpt-4o-realtime-preview',
},
});
18 changes: 14 additions & 4 deletions src/beta/realtime/internal-base.ts
@@ -1,6 +1,7 @@
import { RealtimeClientEvent, RealtimeServerEvent, ErrorEvent } from '../../resources/beta/realtime/realtime';
import { EventEmitter } from '../../lib/EventEmitter';
import { OpenAIError } from '../../error';
import OpenAI, { AzureOpenAI } from '../../index';

export class OpenAIRealtimeError extends OpenAIError {
/**
@@ -73,11 +74,20 @@ export abstract class OpenAIRealtimeEmitter extends EventEmitter<RealtimeEvents>
}
}

-export function buildRealtimeURL(props: { baseURL: string; model: string }): URL {
-const path = '/realtime';
+export function isAzure(client: Pick<OpenAI, 'apiKey' | 'baseURL'>): client is AzureOpenAI {
+return client instanceof AzureOpenAI;
+}

-const url = new URL(props.baseURL + (props.baseURL.endsWith('/') ? path.slice(1) : path));
+export function buildRealtimeURL(client: Pick<OpenAI, 'apiKey' | 'baseURL'>, model: string): URL {
+const path = '/realtime';
+const baseURL = client.baseURL;
+const url = new URL(baseURL + (baseURL.endsWith('/') ? path.slice(1) : path));
url.protocol = 'wss';
-url.searchParams.set('model', props.model);
+if (isAzure(client)) {
+url.searchParams.set('api-version', client.apiVersion);
+url.searchParams.set('deployment', model);
+} else {
+url.searchParams.set('model', model);
+}
return url;
}
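
For illustration, the URL logic in this hunk can be sketched without the SDK. This is a simplified stand-in, not the function from the diff: `ClientLike` and `sketchRealtimeURL` are hypothetical names, and the Azure branch is chosen by the presence of `apiVersion` rather than by the `instanceof AzureOpenAI` check the diff uses. The base URLs in the usage note are assumed example values.

```ts
// Hypothetical stand-in for the client: only the fields that affect the URL.
interface ClientLike {
  baseURL: string;
  apiVersion?: string; // assumption: set only for Azure-style clients
}

function sketchRealtimeURL(client: ClientLike, model: string): URL {
  const path = '/realtime';
  const baseURL = client.baseURL;
  // Avoid a double slash when the base URL already ends with '/'.
  const url = new URL(baseURL + (baseURL.endsWith('/') ? path.slice(1) : path));
  url.protocol = 'wss';
  if (client.apiVersion !== undefined) {
    // Azure: the deployment name replaces the `model` query parameter.
    url.searchParams.set('api-version', client.apiVersion);
    url.searchParams.set('deployment', model);
  } else {
    url.searchParams.set('model', model);
  }
  return url;
}
```

With an assumed Azure base URL of `https://example.openai.azure.com/openai`, API version `2024-10-01-preview`, and deployment `my-deployment`, the sketch yields a `wss://` URL carrying `api-version` and `deployment` query parameters; without `apiVersion` it carries only `model`.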
48 changes: 44 additions & 4 deletions src/beta/realtime/websocket.ts
@@ -1,8 +1,8 @@
-import { OpenAI } from '../../index';
+import { AzureOpenAI, OpenAI } from '../../index';
import { OpenAIError } from '../../error';
import * as Core from '../../core';
import type { RealtimeClientEvent, RealtimeServerEvent } from '../../resources/beta/realtime/realtime';
-import { OpenAIRealtimeEmitter, buildRealtimeURL } from './internal-base';
+import { OpenAIRealtimeEmitter, buildRealtimeURL, isAzure } from './internal-base';

interface MessageEvent {
data: string;
@@ -26,6 +26,7 @@ export class OpenAIRealtimeWebSocket extends OpenAIRealtimeEmitter {
props: {
model: string;
dangerouslyAllowBrowser?: boolean;
onUrl?: (url: URL) => void;
},
client?: Pick<OpenAI, 'apiKey' | 'baseURL'>,
) {
@@ -44,11 +45,15 @@ export class OpenAIRealtimeWebSocket extends OpenAIRealtimeEmitter {

client ??= new OpenAI({ dangerouslyAllowBrowser });

-this.url = buildRealtimeURL({ baseURL: client.baseURL, model: props.model });
+this.url = buildRealtimeURL(client, props.model);
props.onUrl?.(this.url);

const azureCheck = isAzure(client);

// @ts-ignore
this.socket = new WebSocket(this.url, [
'realtime',
-`openai-insecure-api-key.${client.apiKey}`,
+...(azureCheck ? [] : [`openai-insecure-api-key.${client.apiKey}`]),
'openai-beta.realtime-v1',
]);

@@ -77,6 +82,41 @@ export class OpenAIRealtimeWebSocket extends OpenAIRealtimeEmitter {
this.socket.addEventListener('error', (event: any) => {
this._onError(null, event.message, null);
});

if (azureCheck) {
if (this.url.searchParams.get('Authorization') !== null) {
this.url.searchParams.set('Authorization', '<REDACTED>');
} else {
this.url.searchParams.set('api-key', '<REDACTED>');
}
}
}

static async azure(
client: AzureOpenAI,
options: { deploymentName?: string; dangerouslyAllowBrowser?: boolean } = {},
): Promise<OpenAIRealtimeWebSocket> {
const token = await client._getAzureADToken();
function onUrl(url: URL) {
if (client.apiKey !== '<Missing Key>') {
url.searchParams.set('api-key', client.apiKey);
} else {
if (token) {
url.searchParams.set('Authorization', `Bearer ${token}`);
} else {
throw new Error('AzureOpenAI is not instantiated correctly. No API key or token provided.');
}
}
}
const deploymentName = options.deploymentName ?? client.deploymentName;
if (!deploymentName) {
throw new Error('No deployment name provided');
}
const { dangerouslyAllowBrowser } = options;
return new OpenAIRealtimeWebSocket(
{ model: deploymentName, onUrl, ...(dangerouslyAllowBrowser ? { dangerouslyAllowBrowser } : {}) },
client,
);
}

send(event: RealtimeClientEvent) {
35 changes: 31 additions & 4 deletions src/beta/realtime/ws.ts
@@ -1,7 +1,7 @@
import * as WS from 'ws';
-import { OpenAI } from '../../index';
+import { AzureOpenAI, OpenAI } from '../../index';
import type { RealtimeClientEvent, RealtimeServerEvent } from '../../resources/beta/realtime/realtime';
-import { OpenAIRealtimeEmitter, buildRealtimeURL } from './internal-base';
+import { OpenAIRealtimeEmitter, buildRealtimeURL, isAzure } from './internal-base';

export class OpenAIRealtimeWS extends OpenAIRealtimeEmitter {
url: URL;
@@ -14,12 +14,12 @@ export class OpenAIRealtimeWS extends OpenAIRealtimeEmitter {
super();
client ??= new OpenAI();

-this.url = buildRealtimeURL({ baseURL: client.baseURL, model: props.model });
+this.url = buildRealtimeURL(client, props.model);
this.socket = new WS.WebSocket(this.url, {
...props.options,
headers: {
...props.options?.headers,
-Authorization: `Bearer ${client.apiKey}`,
+...(isAzure(client) ? {} : { Authorization: `Bearer ${client.apiKey}` }),
'OpenAI-Beta': 'realtime=v1',
},
});
@@ -51,6 +51,20 @@ export class OpenAIRealtimeWS extends OpenAIRealtimeEmitter {
});
}

static async azure(
client: AzureOpenAI,
options: { deploymentName?: string; options?: WS.ClientOptions | undefined } = {},
): Promise<OpenAIRealtimeWS> {
const deploymentName = options.deploymentName ?? client.deploymentName;
if (!deploymentName) {
throw new Error('No deployment name provided');
}
return new OpenAIRealtimeWS(
{ model: deploymentName, options: { headers: await getAzureHeaders(client) } },
client,
);
}

send(event: RealtimeClientEvent) {
try {
this.socket.send(JSON.stringify(event));
@@ -67,3 +81,16 @@ export class OpenAIRealtimeWS extends OpenAIRealtimeEmitter {
}
}
}

async function getAzureHeaders(client: AzureOpenAI) {
if (client.apiKey !== '<Missing Key>') {
return { 'api-key': client.apiKey };
} else {
const token = await client._getAzureADToken();
if (token) {
return { Authorization: `Bearer ${token}` };
} else {
throw new Error('AzureOpenAI is not instantiated correctly. No API key or token provided.');
}
}
}