Introduction

The Realtime API provides low-latency, real-time text and voice conversation over WebSocket, using an event-based JSON stream.
WSS wss://llm.ai-nebula.com/v1/realtime?model={model}

Authentication

Authorization (string, required): Bearer token, e.g. Bearer sk-xxxxxxxxxx

Connection Parameters

model (string, required): Model name, either gpt-realtime or gpt-realtime-mini
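
A minimal sketch of how the endpoint URL and the Authorization header fit together. The helper name build_connection is hypothetical and is not part of any Nebula SDK; the model check simply mirrors the two model names listed above.

```python
from urllib.parse import urlencode

BASE_URL = "wss://llm.ai-nebula.com"
SUPPORTED_MODELS = ("gpt-realtime", "gpt-realtime-mini")


def build_connection(model, api_key):
    """Return (url, headers) for the WebSocket handshake.

    Hypothetical helper: it only assembles the values documented above.
    """
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    # The model is passed as a query parameter on the endpoint.
    url = f"{BASE_URL}/v1/realtime?{urlencode({'model': model})}"
    # The API key is sent as a Bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers
```

Rejecting an unknown model name locally is a design choice for faster feedback; the server remains the authority on which models are accepted.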

Basic Info

Item          Content
Base URL      wss://llm.ai-nebula.com
Endpoint      /v1/realtime?model={model}
Protocol      WebSocket (JSON event stream)
Audio Format  PCM16 mono, 24000Hz sample rate
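
To make the audio format concrete, here is a small sketch of the arithmetic behind PCM16 mono at 24000Hz: each sample is 2 bytes, so one second of audio is 48000 bytes. The function names are illustrative, and little-endian byte order is an assumption, since the source does not state it.

```python
import math
import struct

SAMPLE_RATE = 24000    # Hz, as documented
BYTES_PER_SAMPLE = 2   # PCM16 means 16-bit signed samples
CHANNELS = 1           # mono


def pcm16_byte_count(duration_ms):
    """Bytes needed for duration_ms of PCM16 mono audio at 24000Hz."""
    samples = SAMPLE_RATE * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


def sine_pcm16(freq_hz, duration_ms, amplitude=0.5):
    """Generate a PCM16 mono test tone (little-endian is assumed)."""
    n = SAMPLE_RATE * duration_ms // 1000
    samples = (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)
```

A tone generated this way can serve as test input when no microphone capture is available.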

Event Types

Client Events

Event                      Description
session.update             Set/update session config
conversation.item.create   Send message
input_audio_buffer.append  Stream audio
input_audio_buffer.commit  Commit audio buffer
response.create            Request response
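
The two audio events can be sketched as follows: raw PCM16 bytes are split into chunks, each chunk is base64-encoded into an input_audio_buffer.append event, and a final input_audio_buffer.commit closes the buffer. The "audio" field name and the 100 ms chunk size are assumptions, because the source does not show the payload of these events.

```python
import base64
import json

CHUNK_BYTES = 4800  # 100 ms of PCM16 mono at 24000Hz (hypothetical chunk size)


def audio_events(pcm, chunk_bytes=CHUNK_BYTES):
    """Return JSON strings: one append event per chunk, then one commit."""
    events = []
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        events.append(json.dumps({
            "type": "input_audio_buffer.append",
            # Assumed field name; not confirmed by the source.
            "audio": base64.b64encode(chunk).decode("ascii"),
        }))
    # Commit once after all chunks have been appended.
    events.append(json.dumps({"type": "input_audio_buffer.commit"}))
    return events
```

Each returned string can be passed to ws.send in order. Keeping chunk_bytes even avoids splitting a 16-bit sample across two events.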

Server Events

Event                                        Description
session.created / session.updated            Session ready or updated
response.text.delta / response.text.done     Text increments and completion
response.audio.delta / response.audio.done   Audio increments and completion
response.done                                Turn complete with usage
error                                        Error event
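
As an illustration of consuming the audio events, the sketch below collects response.audio.delta payloads and exports them as a WAV file using only the standard library. It assumes that each delta event carries base64-encoded PCM16 bytes in a "delta" field, which the source does not confirm; AudioCollector is a hypothetical name.

```python
import base64
import io
import json
import wave


class AudioCollector:
    """Collect response.audio.delta payloads and export them as WAV."""

    def __init__(self):
        self.pcm = bytearray()
        self.finished = False

    def handle(self, raw):
        """Process one raw JSON message received from the server."""
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "response.audio.delta":
            # Assumption: "delta" holds base64-encoded PCM16 bytes.
            self.pcm.extend(base64.b64decode(event["delta"]))
        elif kind == "response.audio.done":
            self.finished = True

    def to_wav(self):
        """Wrap the collected PCM in a WAV container (mono, 16-bit, 24000Hz)."""
        buf = io.BytesIO()
        with wave.open(buf, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)
            wav.setframerate(24000)
            wav.writeframes(bytes(self.pcm))
        return buf.getvalue()
```

An instance's handle method can be used as the body of an on_message callback; once finished is True, to_wav() returns bytes that can be written to disk.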

Session Config Example

{
  "event_id": "evt_001",
  "type": "session.update",
  "session": {
    "modalities": ["text", "audio"],
    "instructions": "You are a friendly assistant",
    "voice": "alloy",
    "temperature": 0.8,
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm16"
  }
}
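
The same payload can be built programmatically. The sketch below is one possible helper, not an official client: the function name session_update, the evt_NNN counter, and the rule that only "text" and "audio" are valid modalities are all assumptions drawn from the example above.

```python
import itertools
import json

_event_counter = itertools.count(1)  # hypothetical local event_id sequence


def session_update(modalities=("text", "audio"), **session_fields):
    """Build a session.update event as a JSON string."""
    unknown = set(modalities) - {"text", "audio"}
    if unknown:
        raise ValueError(f"unknown modalities: {sorted(unknown)}")
    session = {"modalities": list(modalities), **session_fields}
    if "audio" in modalities:
        # Default both directions to the documented audio format.
        session.setdefault("input_audio_format", "pcm16")
        session.setdefault("output_audio_format", "pcm16")
    return json.dumps({
        "event_id": f"evt_{next(_event_counter):03d}",
        "type": "session.update",
        "session": session,
    })
```

For example, session_update(instructions="You are a friendly assistant", voice="alloy", temperature=0.8) produces an event with the same fields as the JSON above.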

Text Message Example

{
  "event_id": "evt_002",
  "type": "conversation.item.create",
  "item": {
    "type": "message",
    "role": "user",
    "content": [
      { "type": "input_text", "text": "Hello, please introduce yourself." }
    ]
  }
}

Python Example

import json

import websocket  # provided by the websocket-client package

API_BASE = "wss://llm.ai-nebula.com"
API_KEY = "sk-XyLy**************************mIqSt"  # replace with your own key
MODEL = "gpt-realtime"


def on_open(ws):
    # Configure a text-only session before sending anything else.
    ws.send(json.dumps({"type": "session.update", "session": {
        "modalities": ["text"],
        "instructions": "You are a concise assistant"
    }}))
    # Add the user message to the conversation.
    ws.send(json.dumps({"type": "conversation.item.create", "item": {
        "type": "message", "role": "user",
        "content": [{"type": "input_text", "text": "Introduce Nebula in one sentence."}]
    }}))
    # Ask the server to generate a response to the message.
    ws.send(json.dumps({"type": "response.create"}))


ws = websocket.WebSocketApp(
    f"{API_BASE}/v1/realtime?model={MODEL}",
    header={"Authorization": f"Bearer {API_KEY}"},
    on_open=on_open,
    on_message=lambda ws, msg: print("[recv]", msg),
    on_error=lambda ws, err: print("[error]", err),
)

# Blocks until the connection is closed.
ws.run_forever()

Example server events:

{ "type": "session.created", "session": { "id": "sess_xxx" } }
{ "type": "response.created", "response": { "id": "resp_xxx" } }
{ "type": "response.text.delta", "delta": "Hello! I am" }
{ "type": "response.text.delta", "delta": " Nebula's realtime assistant." }
{
  "type": "response.done",
  "response": {
    "usage": {
      "total_tokens": 123,
      "input_tokens": 45,
      "output_tokens": 78
    }
  }
}
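
A sketch of how a client might fold these events into a final result: text deltas are concatenated in arrival order, and the usage block is read from response.done. The function name summarize is illustrative, and the sample list reuses the events shown above.

```python
import json


def summarize(raw_events):
    """Fold raw JSON events into (full_text, usage)."""
    parts, usage = [], None
    for raw in raw_events:
        event = json.loads(raw)
        kind = event.get("type")
        if kind == "response.text.delta":
            parts.append(event["delta"])
        elif kind == "response.done":
            usage = event.get("response", {}).get("usage")
        elif kind == "error":
            raise RuntimeError(f"server error: {event}")
        # Other event types (session.created, response.created) are ignored.
    return "".join(parts), usage


sample = [
    '{ "type": "session.created", "session": { "id": "sess_xxx" } }',
    '{ "type": "response.created", "response": { "id": "resp_xxx" } }',
    '{ "type": "response.text.delta", "delta": "Hello! I am" }',
    '{ "type": "response.text.delta", "delta": " Nebula\'s realtime assistant." }',
    '{ "type": "response.done", "response": { "usage": '
    '{ "total_tokens": 123, "input_tokens": 45, "output_tokens": 78 } } }',
]

text, usage = summarize(sample)
print(text)                   # Hello! I am Nebula's realtime assistant.
print(usage["total_tokens"])  # 123
```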

Error Handling

Error Type          Trigger                 Message
Auth Failed         Invalid API Key         Check Authorization header
Model Not Found     Wrong model name        Only gpt-realtime / gpt-realtime-mini
Audio Decode Error  Wrong audio format      Ensure PCM16 mono 24000Hz
Connection Lost     WebSocket disconnected  Check network or reconnect
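
For the Connection Lost case, one common approach is to reconnect with capped exponential backoff. The sketch below is an assumption about client-side handling rather than documented Nebula behavior: connect is a hypothetical callable that blocks while the socket is open and raises ConnectionError when the connection drops.

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=30.0):
    """Yield an unbounded sequence of capped exponential delays in seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def run_with_reconnect(connect, max_attempts=5, sleep=time.sleep):
    """Call connect() until it returns normally or attempts run out."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the last failure
            sleep(next(delays))
```

Only transient network failures are worth retrying this way; an Auth Failed or Model Not Found error will not resolve itself and should stop the loop instead.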

Notes

  • After connecting, send session.update to configure the session
  • After sending a message, send response.create to trigger generation
  • Audio must be PCM16 mono 24000Hz, base64-encoded
  • The Python example requires websocket-client: pip install websocket-client