Responses - Routeway Docs

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routeway.ai/v1",
    api_key=os.getenv("ROUTEWAY_API_KEY")
)

response = client.responses.create(
    model="gpt-4o",
    input="What is the capital of France?"
)

print(response.output_text)

{
  "id": "<string>",
  "object": "response",
  "created_at": 123,
  "completed_at": 123,
  "status": "<string>",
  "incomplete_details": {
    "reason": "<string>"
  },
  "model": "<string>",
  "previous_response_id": "<string>",
  "instructions": "<string>",
  "output": [
    {
      "type": "<string>",
      "id": "<string>",
      "content": [
        {
          "type": "<string>",
          "text": "<string>"
        }
      ]
    }
  ],
  "error": {
    "code": "<string>",
    "message": "<string>"
  },
  "tools": [
    {
      "type": "function",
      "name": "<string>",
      "description": "<string>",
      "parameters": {},
      "strict": true
    }
  ],
  "tool_choice": {
    "type": "function",
    "name": "<string>"
  },
  "parallel_tool_calls": true,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "top_p": 123,
  "presence_penalty": 123,
  "frequency_penalty": 123,
  "top_logprobs": 123,
  "temperature": 123,
  "reasoning": {},
  "usage": {
    "input_tokens": 123,
    "output_tokens": 123,
    "total_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens_details": {
      "reasoning_tokens": 123
    }
  },
  "max_output_tokens": 123,
  "max_tool_calls": 123,
  "store": true,
  "background": true,
  "service_tier": "<string>",
  "prompt_cache_key": "<string>"
}

POST

responses

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routeway.ai/v1",
    api_key=os.getenv("ROUTEWAY_API_KEY")
)

response = client.responses.create(
    model="gpt-4o",
    input="What is the capital of France?"
)

print(response.output_text)

{
  "id": "<string>",
  "object": "response",
  "created_at": 123,
  "completed_at": 123,
  "status": "<string>",
  "incomplete_details": {
    "reason": "<string>"
  },
  "model": "<string>",
  "previous_response_id": "<string>",
  "instructions": "<string>",
  "output": [
    {
      "type": "<string>",
      "id": "<string>",
      "content": [
        {
          "type": "<string>",
          "text": "<string>"
        }
      ]
    }
  ],
  "error": {
    "code": "<string>",
    "message": "<string>"
  },
  "tools": [
    {
      "type": "function",
      "name": "<string>",
      "description": "<string>",
      "parameters": {},
      "strict": true
    }
  ],
  "tool_choice": {
    "type": "function",
    "name": "<string>"
  },
  "parallel_tool_calls": true,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "top_p": 123,
  "presence_penalty": 123,
  "frequency_penalty": 123,
  "top_logprobs": 123,
  "temperature": 123,
  "reasoning": {},
  "usage": {
    "input_tokens": 123,
    "output_tokens": 123,
    "total_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens_details": {
      "reasoning_tokens": 123
    }
  },
  "max_output_tokens": 123,
  "max_tool_calls": 123,
  "store": true,
  "background": true,
  "service_tier": "<string>",
  "prompt_cache_key": "<string>"
}

Generate a unified response block from various input configurations. The Responses API is a higher-level alternative to Chat Completions with built-in tools and optional server-side conversation state.

Create Response

To create a response, use the following endpoint: POST /v1/responses

Request Body

model

string

required

The model ID to use for the response (e.g. "gpt-4o", "gpt-4o-mini"). Only models that advertise /v1/responses in their endpoints array are supported. See Models for the full list.

input

string | array

required

The input for the model. Can be a plain text string or an array of input items (text, image, file, etc.).

instructions

string

A system-level instruction that guides the model’s behavior throughout the response. Equivalent to the system message in Chat Completions.

max_output_tokens

integer

The maximum number of tokens to generate in the output.

temperature

number

Sampling temperature between 0 and 2. Higher values make output more random; lower values make it more deterministic.

stream

boolean

Whether to stream the response back as Server-Sent Events (SSE).

previous_response_id

string

The ID of a previous response to continue from. Enables multi-turn conversations without resending full message history.

tools

array

A list of custom tools the model may call.

tool_choice

string | object

Controls whether and how the model calls tools. Can be "auto", "none", "required", or an object specifying a particular tool.

store

boolean

Whether to store the response server-side so it can be referenced by future requests via previous_response_id. Defaults to true.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routeway.ai/v1",
    api_key=os.getenv("ROUTEWAY_API_KEY")
)

response = client.responses.create(
    model="gpt-4o",
    input="What is the capital of France?"
)

print(response.output_text)

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

x-api-key

string

Body

application/json

model

string | null

input

Show child attributes

previous_response_id

string | null

include

enum<string>[] | null

Available options:

reasoning.encrypted_content,

message.output_text.logprobs

tools

(FunctionToolParam · object | ResponsesToolParam · object)[] | null

FunctionToolParam
ResponsesToolParam

Show child attributes

tool_choice

text

TextParam · object

Show child attributes

temperature

number | null

top_p

number | null

presence_penalty

number | null

frequency_penalty

number | null

parallel_tool_calls

boolean | null

stream

boolean | null

Whether to stream response events as server-sent events.

background

boolean | null

Whether to run the request in the background and return immediately.

max_output_tokens

integer | null

Required range: x >= 16

max_tool_calls

integer | null

Required range: x >= 1

reasoning

ReasoningParam · object

Show child attributes

prompt_cache_key

string | null

Maximum string length: 64

truncation

enum<string> | null

Available options:

auto,

disabled

instructions

string | null

store

boolean | null

Whether to store the response so it can be retrieved later.

service_tier

enum<string> | null

Available options:

auto,

default,

flex,

priority

top_logprobs

integer | null

Required range: 0 <= x <= 20

Response

Successful Response

string

required

The unique ID of the response that was created.

object

enum<string>

required

The object type, which was always response.

Available options:

response

created_at

integer

required

The Unix timestamp (in seconds) for when the response was created.

completed_at

integer | null

required

status

string

required

The status that was set for the response.

incomplete_details

IncompleteDetails · object

required

Show child attributes

model

string

required

The model that generated this response.

previous_response_id

string | null

required

instructions

string | null

required

output

(Message · object | FunctionCall · object | FunctionCallOutput · object | ReasoningBody · object | CompactionBody · object)[]

required

The output items that were generated by the model.

Message
FunctionCall
FunctionCallOutput
ReasoningBody
CompactionBody

Show child attributes

error

Error1 · object

required

Show child attributes

tools

Tool · object[]

required

The tools that were available to the model during response generation.

Show child attributes

tool_choice

required

Show child attributes

truncation

enum<string>

required

Available options:

auto,

disabled

parallel_tool_calls

boolean

required

Whether the model was allowed to call multiple tools in parallel.

text

TextField · object

required

Show child attributes

top_p

number

required

The nucleus sampling parameter that was used for this response.

presence_penalty

number

required

The presence penalty that was used to penalize new tokens based on whether they appear in the text so far.

frequency_penalty

number

required

The frequency penalty that was used to penalize new tokens based on their frequency in the text so far.

top_logprobs

integer

required

The number of most likely tokens that were returned at each position, along with their log probabilities.

temperature

number

required

The sampling temperature that was used for this response.

reasoning

Reasoning · object

required

Show child attributes

usage

Usage · object

required

Show child attributes

max_output_tokens

integer | null

required

max_tool_calls

integer | null

required

store

boolean

required

Whether this response was stored so it can be retrieved later.

background

boolean

required

Whether this request was run in the background.

service_tier

string

required

The service tier that was used for this response.

prompt_cache_key

string | null

required

Chat Completions Anthropic Messages

​Create Response

​Request Body

Authorizations

Headers

Body

Response

Create Response

Request Body