Responses
Generate a unified response block from various input configurations.
Create Response
To create a response, use the following endpoint: POST /v1/responses
Request Body
"gpt-4o", "gpt-4o-mini"). Only models that advertise /v1/responses in their endpoints array are supported. See Models for the full list.
system message in Chat Completions.
"auto", "none", "required", or an object specifying a particular tool.
previous_response_id. Defaults to true.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
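As a minimal sketch of how the endpoint and the Bearer header fit together, the snippet below builds (but does not send) a POST /v1/responses request using only the Python standard library. The host `api.example.com` is a placeholder, and the body field names `model` and `input` are assumptions that this page does not confirm; check them against your provider's schema.

```python
import json
import urllib.request

# Hypothetical base URL; substitute your provider's actual host.
BASE_URL = "https://api.example.com"


def build_create_response_request(
    token: str, model: str, user_input: str
) -> urllib.request.Request:
    """Build, without sending, a POST /v1/responses request."""
    # "model" and "input" are assumed field names, not taken from this page.
    body = {"model": model, "input": user_input}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Bearer authentication header of the form "Bearer <token>".
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_response_request("YOUR_TOKEN", "gpt-4o", "Hello")
```

To actually send it, the request object can be passed to `urllib.request.urlopen(request)` once the placeholder host and token are replaced.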
Headers
Body
- ItemReferenceParam
- ReasoningItemParam
- CompactionSummaryItemParam
- UserMessageItemParam
- SystemMessageItemParam
- DeveloperMessageItemParam
- AssistantMessageItemParam
- FunctionCallItemParam
- FunctionCallOutputItemParam
reasoning.encrypted_content, message.output_text.logprobs
- FunctionToolParam
- ResponsesToolParam
Whether to stream response events as server-sent events.
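When streaming is enabled, events arrive in the server-sent events wire format: `event:` and `data:` lines, with a blank line ending each event. The following is a rough sketch of a parser for that format, not this API's client. The event names in the sample are made up for illustration, and the payloads are left as raw strings because this page does not document their shape.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event_name = "message"  # default event type in the SSE format
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name, "\n".join(data_parts)
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format strips a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)


# Hypothetical event names, used only to exercise the parser.
sample = [
    "event: example.started\n",
    'data: {"n": 1}\n',
    "\n",
    "event: example.finished\n",
    'data: {"n": 2}\n',
    "\n",
]
events = list(iter_sse_events(sample))
```

In practice the `lines` argument would be the decoded lines of the HTTP response body, and each `data` string would typically be passed to `json.loads` if the provider sends JSON payloads.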
Whether to run the request in the background and return immediately.
x >= 16
x >= 1
64
auto, disabled
Whether to store the response so it can be retrieved later.
auto, default, flex, priority
0 <= x <= 20
Response
Successful Response
The unique ID of the response that was created.
The object type, which is always response.
The Unix timestamp (in seconds) for when the response was created.
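Because the creation time is a Unix timestamp in seconds, it usually needs converting before display. A small sketch, assuming you have already read the integer value out of the response object (the field's name is not shown on this page):

```python
from datetime import datetime, timezone


def created_time_utc(unix_seconds: int) -> datetime:
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


# Illustrative value, not taken from a real response.
print(created_time_utc(1700000000).isoformat())  # 2023-11-14T22:13:20+00:00
```

Passing `tz=timezone.utc` avoids the result depending on the local time zone of the machine running the code.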
The status that was set for the response.
The model that generated this response.
The output items that were generated by the model.
- Message
- FunctionCall
- FunctionCallOutput
- ReasoningBody
- CompactionBody
The tools that were available to the model during response generation.
auto, disabled
Whether the model was allowed to call multiple tools in parallel.
The nucleus sampling parameter that was used for this response.
The presence penalty that was used to penalize new tokens based on whether they appear in the text so far.
The frequency penalty that was used to penalize new tokens based on their frequency in the text so far.
The number of most likely tokens that were returned at each position, along with their log probabilities.
The sampling temperature that was used for this response.
Whether this response was stored so it can be retrieved later.
Whether this request was run in the background.
The service tier that was used for this response.