Endpoint
Example
- Python
- JavaScript
- cURL
Request Parameters
Required
| Parameter | Type | Description |
|---|---|---|
model | string | Model ID (e.g., moonshotai/kimi-k2.5). |
input | string or array | User prompt as a string, or conversation history as an array of message objects. |
Sampling
| Parameter | Type | Default | Description |
|---|---|---|---|
instructions | string | null | System-level instructions for the model. |
temperature | float | 1.0 | Sampling temperature (0–2). |
top_p | float | 1.0 | Nucleus sampling threshold. |
max_output_tokens | integer | null | Maximum tokens to generate (including reasoning tokens). |
stream | boolean | false | Stream the response via SSE. |
Structured Output
| Parameter | Type | Default | Description |
|---|---|---|---|
text | object | null | Structured output format with JSON Schema. See example below. |
Tools
The responses endpoint uses a flat tool format —name, description, and parameters are top-level fields, not nested under function.
| Parameter | Type | Default | Description |
|---|---|---|---|
tools | array | null | List of tool definitions (see format below). |
Reasoning
Models with reasoning (like Kimi K2.5 and GLM 5.1) include chain-of-thought by default. The response includes areasoning output item containing the model’s thinking. Reasoning tokens count toward your usage.
Structured Output
Force the model to return JSON matching a schema:- Python
- JavaScript
- cURL
Tool Calling
The responses endpoint uses a flat tool format wherename, description, and parameters are at the top level:
- Python
- JavaScript
- cURL
With Instructions
Useinstructions to set system-level context:
- Python
- cURL
Differences from Chat Completions
| Feature | Chat Completions | Responses |
|---|---|---|
| Input format | messages array | input string or array |
| Tool format | Nested under function | Flat (name/description/parameters at top level) |
| Max tokens param | max_tokens | max_output_tokens |
| Structured output | response_format | text.format |
| Disable reasoning | chat_template_kwargs: {"thinking": false} | Not supported |
| System prompt | system role message | instructions parameter |

