Introduction
The Realtime API provides low-latency text and voice conversation over WebSocket, using event-based streaming.
WSS wss://llm.ai-nebula.com/v1/realtime?model={model}
Authentication
Pass a Bearer token in the Authorization header, e.g. Bearer sk-xxxxxxxxxx
Connection Parameters
Model name: gpt-realtime or gpt-realtime-mini
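As a minimal sketch of how the endpoint, model parameter, and Bearer token fit together, the helpers below assemble the connection URL and the Authorization header. The names build_realtime_url and build_auth_header are illustrative and not part of the API; the client-side model check only mirrors the two model names listed above.

```python
from urllib.parse import urlencode

API_BASE = "wss://llm.ai-nebula.com"
SUPPORTED_MODELS = ("gpt-realtime", "gpt-realtime-mini")


def build_realtime_url(model: str) -> str:
    """Return the WebSocket URL for the given model (hypothetical helper)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    return f"{API_BASE}/v1/realtime?{urlencode({'model': model})}"


def build_auth_header(api_key: str) -> dict:
    """Return the Authorization header carrying the Bearer token."""
    return {"Authorization": f"Bearer {api_key}"}
```

Rejecting an unknown model name before connecting is a design choice in this sketch; the server remains the authority on which models are accepted.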
Basic Info
| Item | Content |
|---|---|
| Base URL | wss://llm.ai-nebula.com |
| Endpoint | /v1/realtime?model={model} |
| Protocol | WebSocket (JSON event stream) |
| Audio Format | PCM16 mono, 24000Hz sample rate |
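The audio format in the table implies a fixed data rate: 24000 samples per second, 2 bytes per sample, 1 channel, so 48000 bytes per second. The sketch below works through that arithmetic and generates a test tone in the same format. It assumes little-endian byte order, which this page does not state, and the function names are illustrative.

```python
import math
import struct

SAMPLE_RATE = 24000      # Hz, from the table above
BYTES_PER_SAMPLE = 2     # 16-bit PCM
CHANNELS = 1             # mono


def pcm16_byte_length(seconds: float) -> int:
    """Bytes of PCM16 mono audio needed for the given duration."""
    return int(seconds * SAMPLE_RATE) * BYTES_PER_SAMPLE * CHANNELS


def sine_pcm16(freq_hz: float, seconds: float, amplitude: float = 0.5) -> bytes:
    """Generate a PCM16 mono sine tone (test signal only).

    ASSUMPTION: samples are packed little-endian ("<h").
    """
    n = int(seconds * SAMPLE_RATE)
    samples = (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)
```

A tone produced this way can serve as input when checking that audio reaches the server without decode errors.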
Event Types
Client Events
| Event | Description |
|---|---|
| session.update | Set/update session config |
| conversation.item.create | Send message |
| input_audio_buffer.append | Stream audio |
| input_audio_buffer.commit | Commit audio buffer |
| response.create | Request response |
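For audio input, the client events above suggest a sequence of append events, then a commit, then a response request. The sketch below builds that sequence from raw PCM16 bytes. It rests on two assumptions not confirmed by this page: that the append payload field is named audio, and that response.create follows the commit. The 4800-byte chunk size (100 ms of audio) is an arbitrary choice.

```python
import base64
import json

CHUNK_BYTES = 4800  # 100 ms of PCM16 mono at 24000 Hz; arbitrary chunk size


def audio_events(pcm: bytes, chunk_bytes: int = CHUNK_BYTES) -> list:
    """Build the JSON messages that stream `pcm` and then request a response."""
    if len(pcm) % 2:
        raise ValueError("PCM16 data must contain an even number of bytes")
    events = []
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        events.append({
            "type": "input_audio_buffer.append",
            # ASSUMPTION: the payload field is named "audio".
            "audio": base64.b64encode(chunk).decode("ascii"),
        })
    events.append({"type": "input_audio_buffer.commit"})
    events.append({"type": "response.create"})
    return [json.dumps(event) for event in events]
```

Each returned string can be passed to ws.send in order once the connection is open.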
Server Events
| Event | Description |
|---|---|
| session.created / session.updated | Session ready or updated |
| response.text.delta / response.text.done | Text increments and completion |
| response.audio.delta / response.audio.done | Audio increments and completion |
| response.done | Turn complete with usage |
| error | Error event |
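Because text arrives as increments, a client usually concatenates the deltas until response.done. The sketch below shows one way to do that. TurnCollector is a hypothetical helper; the delta and usage fields follow the sample events on this page, and the shape of the error payload is not documented here, so the whole event is stored.

```python
import json


class TurnCollector:
    """Accumulates text deltas for one response turn (illustrative helper)."""

    def __init__(self):
        self.text_parts = []
        self.usage = None
        self.done = False
        self.errors = []

    def handle(self, raw: str) -> None:
        """Process one raw JSON message received from the server."""
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "response.text.delta":
            self.text_parts.append(event.get("delta", ""))
        elif kind == "response.done":
            self.usage = event.get("response", {}).get("usage")
            self.done = True
        elif kind == "error":
            # Error payload shape is undocumented here, so keep the whole event.
            self.errors.append(event)
        # All other event types are ignored in this sketch.

    @property
    def text(self) -> str:
        return "".join(self.text_parts)
```

An instance's handle method can be called from the on_message callback of the WebSocket client, for example on_message=lambda ws, msg: collector.handle(msg).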
Session Config Example
{
  "event_id": "evt_001",
  "type": "session.update",
  "session": {
    "modalities": ["text", "audio"],
    "instructions": "You are a friendly assistant",
    "voice": "alloy",
    "temperature": 0.8,
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm16"
  }
}
Text Message Example
{
  "event_id": "evt_002",
  "type": "conversation.item.create",
  "item": {
    "type": "message",
    "role": "user",
    "content": [
      { "type": "input_text", "text": "Hello, please introduce yourself." }
    ]
  }
}
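The message above can be produced programmatically. The following sketch builds a conversation.item.create event for a user text message and numbers event ids in the evt_001 style used in these examples. Both function names are hypothetical, and the id scheme is only a convention borrowed from the examples, not a documented requirement.

```python
import itertools
import json

_event_counter = itertools.count(1)


def next_event_id() -> str:
    """Return ids in the evt_001, evt_002, ... style used by the examples."""
    return f"evt_{next(_event_counter):03d}"


def text_message_event(text: str) -> str:
    """Serialize a conversation.item.create event carrying one user text part."""
    return json.dumps({
        "event_id": next_event_id(),
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": text}],
        },
    })
```

Sending the returned string and then a response.create event triggers generation for that message.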
Python Example
import json
import websocket  # provided by the websocket-client package

API_BASE = "wss://llm.ai-nebula.com"
API_KEY = "sk-XyLy**************************mIqSt"  # replace with your own key
MODEL = "gpt-realtime"


def on_open(ws):
    # 1. Configure the session for text-only output.
    ws.send(json.dumps({"type": "session.update", "session": {
        "modalities": ["text"],
        "instructions": "You are a concise assistant"
    }}))
    # 2. Add the user message to the conversation.
    ws.send(json.dumps({"type": "conversation.item.create", "item": {
        "type": "message", "role": "user",
        "content": [{"type": "input_text", "text": "Introduce Nebula in one sentence."}]
    }}))
    # 3. Ask the server to generate a response.
    ws.send(json.dumps({"type": "response.create"}))


ws = websocket.WebSocketApp(
    f"{API_BASE}/v1/realtime?model={MODEL}",
    header={"Authorization": f"Bearer {API_KEY}"},
    on_open=on_open,
    on_message=lambda ws, msg: print("[recv]", msg),
)
ws.run_forever()
Response Example
{ "type": "session.created", "session": { "id": "sess_xxx" } }
{ "type": "response.created", "response": { "id": "resp_xxx" } }
{ "type": "response.text.delta", "delta": "Hello! I am" }
{ "type": "response.text.delta", "delta": " Nebula's realtime assistant." }
{
  "type": "response.done",
  "response": {
    "usage": {
      "total_tokens": 123,
      "input_tokens": 45,
      "output_tokens": 78
    }
  }
}
Error Handling
| Error Type | Trigger | Message |
|---|---|---|
| Auth Failed | Invalid API Key | Check Authorization header |
| Model Not Found | Wrong model name | Only gpt-realtime / gpt-realtime-mini |
| Audio Decode Error | Wrong audio format | Ensure PCM16 mono 24000Hz |
| Connection Lost | WebSocket disconnected | Check network or reconnect |
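Of the errors above, only a lost connection is likely to be resolved by retrying; an invalid key or model name will probably fail again until the request is corrected. As a sketch of the reconnect advice, the generator below yields capped exponential backoff delays. The function name, defaults, and jitter option are illustrative choices, not behavior specified by the API.

```python
import random


def backoff_delays(max_attempts: int = 5, base: float = 1.0, cap: float = 30.0,
                   jitter: float = 0.0, rng=random.random):
    """Yield wait times in seconds: base, 2*base, 4*base, ... capped at `cap`.

    `jitter` adds up to that fraction of each delay as random extra wait.
    """
    for attempt in range(max_attempts):
        delay = min(cap, base * (2 ** attempt))
        yield delay + jitter * rng() * delay
```

A reconnect loop could sleep for each yielded delay with time.sleep and then call ws.run_forever() again, stopping once the generator is exhausted.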
Notes
- Send session.update to configure the session after connecting
- Call response.create to trigger generation after sending a message
- Audio must be PCM16 mono at 24000Hz, base64 encoded
- The Python example requires websocket-client: pip install websocket-client