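Create a Responses Request

Introduction

The Responses API is OpenAI's newer conversation interface, designed for reasoning models (the o series and the GPT-5 series) and for advanced features. Compared with the older Chat Completions API, it offers finer-grained reasoning control, built-in tools, and multimodal input.

Use cases

Reasoning-heavy tasks: use reasoning models such as o1, o3-mini, o4-mini, and GPT-5
Web access: the built-in Web Search Preview tool
Advanced tool calling: supports Function Call and Custom Tool Call
Multi-turn conversations: manage conversation history through previous_response_id

Authentication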

Authorization

string

Required

Bearer token, for example Bearer sk-xxxxxxxxxx
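
As a minimal sketch of how the Authorization header fits into a request, the snippet below builds (but does not send) a POST request with Python's standard library. The endpoint URL is the one used in the curl examples on this page; the helper name build_request is hypothetical and the key is a placeholder.

```python
import json
import urllib.request

# Endpoint taken from the curl examples on this page.
API_URL = "https://llm.ai-nebula.com/v1/responses"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (without sending) a POST request that carries the Bearer token."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Placeholder key and a minimal payload; nothing is sent over the network here.
req = build_request(
    "sk-xxxxxxxxxx",
    {"model": "gpt-5.2", "input": [{"role": "user", "content": "Hello"}]},
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen would actually send the request; that step is left out so the sketch runs offline.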

Request parameters

model

string

Required

Model identifier. Supported models include:

GPT-5 series: gpt-5.2, gpt-5, gpt-5-mini, and others
o series: o1, o3-mini, o4-mini, and others
GPT-4 series: gpt-4o, gpt-4.1, gpt-4o-mini, and others

input

array

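Required

List of input messages. Several formats are supported:

Simplified format: [{"role": "user", "content": "text"}] (similar to Chat Completions)
Standard format: [{"type": "input_text", "text": "text"}]
Multimodal: the input_image and input_file types are supported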
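
The formats above can be handled by one small client-side helper. The sketch below is only an illustration: normalize_input is a hypothetical name, and the check it performs (each item has a role or a type key) is an assumption based on the two shapes listed above, not a rule stated by the service.

```python
def normalize_input(value):
    """Return a list of input items from a plain string or an existing list."""
    if isinstance(value, str):
        # A bare prompt becomes one user message in the simplified format.
        return [{"role": "user", "content": value}]
    items = list(value)
    for item in items:
        # Simplified items carry "role"; typed items carry "type".
        if not isinstance(item, dict) or not ("role" in item or "type" in item):
            raise ValueError(f"unrecognized input item: {item!r}")
    return items


print(normalize_input("Hello"))
print(normalize_input([{"type": "input_text", "text": "Hello"}]))
```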

instructions

string

System instructions, equivalent to the system message in Chat Completions

max_output_tokens

number

Maximum number of output tokens; limits the length of the reply

stream

boolean

Default: false

Whether to enable streaming output. When enabled, the response is returned as SSE chunks

temperature

number

Default: 1.0

Controls randomness. Range 0 to 2; higher values make replies more random

top_p

number

Default: 0.98

Nucleus sampling parameter. Range 0 to 1; controls the diversity of the output

reasoning

object

Reasoning configuration, which controls the behavior of reasoning models:

effort: reasoning effort, one of "none", "low", "medium", "high"
summary: reasoning summary, one of "auto", "none", "detailed"

tools

array

List of tools. Three types are supported:

Built-in web search tool: {"type": "web_search_preview", "search_context_size": "medium"}
Built-in file search tool: {"type": "file_search"}
Custom function tool: the standard OpenAI Function Call format

tool_choice

string|object

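Default: "auto"

Tool selection strategy:

"auto": the model decides whether to call a tool
"none": tool calls are disabled
{"type": "function", "function": {"name": "function_name"}}: forces a call to the specified function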

parallel_tool_calls

boolean

Default: true

Whether multiple tools may be called in parallel

max_tool_calls

number

Maximum number of tool calls

previous_response_id

string

ID of a previous response, used to continue the conversation history

truncation

string

Default: "disabled"

Truncation strategy: "auto" or "disabled"

metadata

object

Request metadata, used for tracking and debugging

user

string

User identifier
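
To show how these parameters combine into one request body, here is a minimal sketch that checks the ranges documented above (temperature 0 to 2, top_p 0 to 1, truncation "auto" or "disabled") before assembling the payload. The function name build_payload is hypothetical, and client-side validation is a design choice of this sketch, not something the service requires.

```python
def build_payload(model, input_items, **options):
    """Assemble a request body, checking documented ranges and dropping unset options."""
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    top_p = options.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    truncation = options.get("truncation")
    if truncation is not None and truncation not in ("auto", "disabled"):
        raise ValueError('truncation must be "auto" or "disabled"')

    payload = {"model": model, "input": input_items}
    # Options left as None are omitted so the server-side defaults apply.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload


payload = build_payload(
    "gpt-5.2",
    [{"role": "user", "content": "Hello"}],
    temperature=0.7,
    max_output_tokens=2048,
    user=None,  # dropped from the payload
)
print(payload)
```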

Basic examples

Simple chat (non-streaming)

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "max_output_tokens": 2048,
    "input": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "Briefly introduce artificial intelligence"}
    ]
  }'

Simple chat (streaming)

curl -N -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "stream": true,
    "max_output_tokens": 2048,
    "input": [
      {"role": "user", "content": "Briefly introduce artificial intelligence"}
    ]
  }'

Python SDK

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxxxxxxxxx",
    base_url="https://llm.ai-nebula.com/v1"
)

# Use the Responses API
response = client.responses.create(
    model="gpt-5.2",
    max_output_tokens=2048,
    input=[
        {"role": "user", "content": "Briefly introduce artificial intelligence"}
    ]
)

# output_text joins all output_text blocks; output[0] may be a reasoning item with no content
print(response.output_text)

Response format

Non-streaming response

{
  "id": "resp_xxx",
  "object": "response",
  "created_at": 1768271369,
  "model": "gpt-5.2",
  "status": "completed",
  "output": [
    {
      "id": "msg_xxx",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Artificial intelligence (AI) is a branch of computer science...",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150,
    "total_tokens": 175,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 50
    }
  }
}
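
Given a response shaped like the one above, the text can be pulled out by walking the output list instead of assuming the message is the first item. The sketch below works on a plain dict (for example the parsed JSON body); extract_text is a hypothetical helper, and the reasoning item in the sample is an assumed illustration of a non-message output item.

```python
def extract_text(response: dict) -> str:
    """Concatenate every output_text block found in message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as reasoning or function calls
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


# Trimmed sample; the leading reasoning item is an assumed illustration.
sample = {
    "output": [
        {"id": "rs_xxx", "type": "reasoning", "summary": []},
        {
            "id": "msg_xxx",
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hello!", "annotations": []}],
        },
    ],
    "usage": {
        "input_tokens": 25,
        "output_tokens": 150,
        "total_tokens": 175,
        "output_tokens_details": {"reasoning_tokens": 50},
    },
}

print(extract_text(sample))
usage = sample["usage"]
print(usage["output_tokens_details"]["reasoning_tokens"], "of", usage["output_tokens"], "output tokens were reasoning tokens")
```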

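Streaming response (SSE events)

A streaming response uses the Server-Sent Events format and includes the following event types:

Event type	Description
`response.created`	Response created
`response.in_progress`	Response in progress
`response.output_item.added`	Output item added (tool call started)
`response.output_text.delta`	Text delta
`response.output_text.done`	Text complete
`response.output_item.done`	Output item complete
`response.completed`	Response complete

Example SSE output: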

event: response.created
data: {"type":"response.created","response":{"id":"resp_xxx","status":"in_progress"}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Artificial","sequence_number":1}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":" intelligence","sequence_number":2}

event: response.completed
data: {"type":"response.completed","response":{"id":"resp_xxx","status":"completed","usage":{...}}}
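
A client usually rebuilds the reply by concatenating the delta fields of the response.output_text.delta events. The following is a minimal sketch of that idea over already-received SSE text. It assumes LF line endings and events separated by a blank line; iter_sse_events and collect_text are hypothetical helper names, and the sample stream is made up for the demonstration.

```python
import json


def iter_sse_events(raw: str):
    """Yield (event_name, data_dict) pairs from SSE text with blank-line separators."""
    for chunk in raw.strip().split("\n\n"):
        event_name, data_lines = None, []
        for line in chunk.splitlines():
            if line.startswith("event:"):
                event_name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            yield event_name, json.loads("\n".join(data_lines))


def collect_text(raw: str) -> str:
    """Join the delta of every response.output_text.delta event, in arrival order."""
    return "".join(
        data["delta"]
        for name, data in iter_sse_events(raw)
        if name == "response.output_text.delta"
    )


SAMPLE = (
    'event: response.created\n'
    'data: {"type":"response.created","response":{"id":"resp_xxx","status":"in_progress"}}\n'
    '\n'
    'event: response.output_text.delta\n'
    'data: {"type":"response.output_text.delta","delta":"Hello","sequence_number":1}\n'
    '\n'
    'event: response.output_text.delta\n'
    'data: {"type":"response.output_text.delta","delta":", world","sequence_number":2}\n'
    '\n'
    'event: response.completed\n'
    'data: {"type":"response.completed","response":{"id":"resp_xxx","status":"completed"}}\n'
)

print(collect_text(SAMPLE))  # prints "Hello, world"
```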

Advanced features

1. Web Search

Enable the built-in web search tool so that the model can search the internet in real time.

Basic example

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "stream": true,
    "max_output_tokens": 2048,
    "input": [
      {"role": "user", "content": "What are the top news headlines today?"}
    ],
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "medium"
      }
    ]
  }'

Advanced configuration

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "stream": true,
    "input": [
      {"role": "user", "content": "What is the weather in New York today?"}
    ],
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "high",
        "user_location": {
          "type": "approximate",
          "country": "US",
          "region": "NY",
          "city": "New York",
          "timezone": "America/New_York"
        }
      }
    ]
  }'

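Web Search parameters:

search_context_size: size of the search context
- "low": small context; faster, but with fewer results
- "medium": medium context (default)
- "high": large context; more search results, but slower
user_location (optional): user location information
- country: country code (for example "US" or "CN")
- region: state or province
- city: city
- timezone: time zone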
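
The parameters above can be assembled with a small builder. This is a sketch under stated assumptions: web_search_tool is a hypothetical helper, the allowed sizes are the three values listed above, and the "approximate" location type is copied from the advanced curl example.

```python
ALLOWED_CONTEXT_SIZES = ("low", "medium", "high")


def web_search_tool(search_context_size="medium", country=None, region=None,
                    city=None, timezone=None):
    """Build a web_search_preview tool entry, adding user_location only when given."""
    if search_context_size not in ALLOWED_CONTEXT_SIZES:
        raise ValueError(f"search_context_size must be one of {ALLOWED_CONTEXT_SIZES}")
    tool = {"type": "web_search_preview", "search_context_size": search_context_size}
    location = {"country": country, "region": region, "city": city, "timezone": timezone}
    location = {key: value for key, value in location.items() if value is not None}
    if location:
        # "approximate" mirrors the advanced example on this page.
        tool["user_location"] = {"type": "approximate", **location}
    return tool


print(web_search_tool())
print(web_search_tool("high", country="US", region="NY", city="New York",
                      timezone="America/New_York"))
```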

2. Reasoning control

Controls how deeply a reasoning model thinks and how its reasoning is reported.

Automatic reasoning summary

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "o4-mini",
    "stream": true,
    "reasoning": {
      "summary": "auto"
    },
    "max_output_tokens": 8192,
    "input": [
      {"role": "user", "content": "What is the formula for the Tower of Hanoi?"}
    ]
  }'

Detailed reasoning process

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "o4-mini",
    "stream": true,
    "reasoning": {
      "effort": "high",
      "summary": "detailed"
    },
    "max_output_tokens": 16384,
    "input": [
      {"role": "user", "content": "Prove Fermat’s Last Theorem"}
    ]
  }'

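Reasoning parameters:

effort: reasoning effort
- "none": no reasoning
- "low": light reasoning
- "medium": moderate reasoning (default)
- "high": deep reasoning
summary: reasoning summary
- "none": no reasoning summary is returned
- "auto": the model decides whether to return a summary
- "detailed": the detailed reasoning process is returned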
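
A client can guard against typos in these values before sending a request. The sketch below validates against the values listed above; reasoning_config is a hypothetical helper, and leaving a field out (rather than sending a default) is a choice made for this example.

```python
EFFORTS = ("none", "low", "medium", "high")
SUMMARIES = ("none", "auto", "detailed")


def reasoning_config(effort=None, summary=None):
    """Build the reasoning object, including only the fields that were set."""
    config = {}
    if effort is not None:
        if effort not in EFFORTS:
            raise ValueError(f"effort must be one of {EFFORTS}")
        config["effort"] = effort
    if summary is not None:
        if summary not in SUMMARIES:
            raise ValueError(f"summary must be one of {SUMMARIES}")
        config["summary"] = summary
    return config


print(reasoning_config(summary="auto"))
print(reasoning_config(effort="high", summary="detailed"))
```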

3. Custom function calling

Supports the standard OpenAI Function Calling format.

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "stream": true,
    "input": [
      {"role": "user", "content": "What is the weather like in Shanghai?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get weather information for the specified city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {
                "type": "string",
                "description": "City name"
              }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Function call response format:

{
  "output": [
    {
      "id": "call_xxx",
      "type": "function_call",
      "status": "completed",
      "name": "get_weather",
      "call_id": "call_xxx",
      "arguments": "{\"city\":\"Shanghai\"}"
    }
  ]
}
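
Once a function_call item like the one above arrives, the client runs the function itself and sends the result back in a later request. The sketch below shows the local half of that loop. get_weather here is a hypothetical stub that returns canned data, and the function_call_output item shape follows OpenAI's Responses API; whether this endpoint accepts it unchanged is an assumption.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local implementation; returns canned data for the demo."""
    return {"city": city, "forecast": "sunny", "temperature_c": 22}


HANDLERS = {"get_weather": get_weather}


def run_function_calls(output_items):
    """Execute each function_call item and build the matching result items."""
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue
        handler = HANDLERS[item["name"]]
        kwargs = json.loads(item["arguments"])  # arguments arrive as a JSON string
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],  # ties the result to the original call
            "output": json.dumps(handler(**kwargs), ensure_ascii=False),
        })
    return results


items = [{
    "id": "call_xxx",
    "type": "function_call",
    "status": "completed",
    "name": "get_weather",
    "call_id": "call_xxx",
    "arguments": "{\"city\":\"Shanghai\"}",
}]
print(run_function_calls(items))
```

The returned items would then be placed in the input list of a follow-up request so the model can compose its final answer.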

4. Multimodal input

Supports several input types, including text, images, and files.

Image input

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "input": [
      {
        "type": "input_text",
        "text": "What is in this image?"
      },
      {
        "type": "input_image",
        "image_url": "https://example.com/image.jpg",
        "detail": "high"
      }
    ]
  }'

File input

curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "input": [
      {
        "type": "input_text",
        "text": "Please analyze the contents of this PDF document"
      },
      {
        "type": "input_file",
        "file_url": "https://example.com/document.pdf"
      }
    ]
  }'

5. Continuing a conversation

Use previous_response_id to continue an earlier conversation.

# First request
curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "input": [
      {"role": "user", "content": "What is quantum computing?"}
    ]
  }'

# The response contains id: "resp_abc123"

# Second request (continuation)
curl -X POST "https://llm.ai-nebula.com/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "gpt-5.2",
    "previous_response_id": "resp_abc123",
    "input": [
      {"role": "user", "content": "What are its applications?"}
    ]
  }'
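
In client code, the two requests above are usually chained by reading the id of the first response and placing it in the next payload. A minimal sketch, assuming the first response has already been parsed into a dict; follow_up_payload is a hypothetical helper name.

```python
def follow_up_payload(previous_response: dict, question: str) -> dict:
    """Build the next request body, linked to the earlier response by its id."""
    return {
        "model": previous_response["model"],
        "previous_response_id": previous_response["id"],
        # Only the new turn is sent; the earlier turns are referenced by id.
        "input": [{"role": "user", "content": question}],
    }


first = {"id": "resp_abc123", "model": "gpt-5.2", "status": "completed"}
print(follow_up_payload(first, "What are its applications?"))
```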

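Notes

Model compatibility: not every model supports every feature of the Responses API
Web Search: supported only by GPT-4o, GPT-4.1, GPT-5, and o-series models
Reasoning: only o-series models and some GPT-5 series models support the reasoning parameter
Obfuscated deltas: in a streaming response, a delta may include an obfuscation field (content obfuscation protection); the complete plain text is in the response.output_text.done event

If you need the standard Chat Completions format, use the /v1/chat/completions endpoint with the openai/ model prefix.
The system converts the format automatically, which gives better client compatibility.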

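Comparison: Responses API vs Chat Completions API

Feature	Responses API	Chat Completions API
Reasoning model support	✅ Full support	⚠️ Limited support
Built-in Web Search	✅ Native support	❌ Not supported
Reasoning control	✅ Fine-grained control	❌ Not supported
Conversation continuation	✅ `previous_response_id`	✅ Via messages
Streaming output	✅ SSE format	✅ SSE format
Client compatibility	⚠️ Requires adaptation	✅ Standard format
Use cases	Reasoning, search, advanced features	General chat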

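Related resources

Chat Completions API: documentation for the standard chat interface
Model list: see all supported models
FAQ: frequently asked questions about the Responses API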
