The Chat Completions API is the primary way to interact with language models through Routeway. Send a list of messages and receive a model-generated response. The endpoint is fully compatible with OpenAI’s /v1/chat/completions format, so existing OpenAI clients and SDKs work once they are configured with a Routeway API key and base URL.
Requests and responses use the same schema as the OpenAI Chat Completions API. Set your client's base URL to https://api.routeway.ai/v1 and no other code changes are needed.

Quick Example

import os
from openai import OpenAI

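# Point the standard OpenAI client at Routeway and authenticate with
# the key stored in the ROUTEWAY_API_KEY environment variable.
client = OpenAI(
    base_url="https://api.routeway.ai/v1",
    api_key=os.getenv("ROUTEWAY_API_KEY"),
)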

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
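
Because the endpoint follows the OpenAI request format, the same call can be made without an SDK. The sketch below uses only the Python standard library. It assumes the full URL is the documented base URL followed by /chat/completions and that the key is sent as a Bearer token, as the OpenAI SDK does; build_chat_request and send_chat_request are hypothetical helper names, not part of any Routeway library.

```python
import json
import os
import urllib.request

# Assumption: documented base URL + OpenAI's /chat/completions path.
ROUTEWAY_CHAT_URL = "https://api.routeway.ai/v1/chat/completions"


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a POST request for a chat completion."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        ROUTEWAY_CHAT_URL,
        data=body,
        headers={
            # Assumption: Bearer authentication, as used by the OpenAI SDK.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


request = build_chat_request(
    api_key=os.getenv("ROUTEWAY_API_KEY", "missing-key"),
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

# To call the API for real, uncomment the two lines below:
# reply = send_chat_request(request)
# print(reply["choices"][0]["message"]["content"])
```

Building the request separately from sending it makes it easy to inspect the payload and headers before any network traffic occurs.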

Response Shape

A successful response looks like this:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1718500000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 8,
    "total_tokens": 32
  }
}
Field | Description
--- | ---
choices[0].message.content | The model’s text response.
choices[0].finish_reason | Why the model stopped: stop, length, or tool_calls.
usage | Token counts for billing and monitoring.
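
The fields above can be read from the parsed JSON directly. The following is a minimal sketch that works on a plain dict shaped like the sample response; summarize_completion is a hypothetical helper, and the handling of an empty content field is an assumption about how tool-call responses are typically shaped in the OpenAI format.

```python
import json

# The sample response from this section, parsed into a dict.
sample = json.loads("""
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1718500000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of France is Paris."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 24, "completion_tokens": 8, "total_tokens": 32}
}
""")


def summarize_completion(completion):
    """Extract the text, stop reason, and token usage from a completion dict."""
    choice = completion["choices"][0]
    finish_reason = choice["finish_reason"]
    return {
        # Assumption: content may be null when the model requests a tool call.
        "text": choice["message"].get("content") or "",
        "finish_reason": finish_reason,
        # "length" means the output was cut off before the model finished.
        "truncated": finish_reason == "length",
        # "tool_calls" means the model wants a function to be invoked.
        "wants_tools": finish_reason == "tool_calls",
        "total_tokens": completion["usage"]["total_tokens"],
    }


summary = summarize_completion(sample)
print(summary["text"])
print(summary["finish_reason"], summary["total_tokens"])
```

Checking finish_reason before using the text helps catch truncated answers and tool-call requests that would otherwise look like empty or partial replies.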

Request Model

These are the most common fields you can include in a Chat Completions request:
Field | Type | Description
--- | --- | ---
model | string | The model to use (e.g. gpt-4o-mini, claude-sonnet-4-20250514).
messages | array | The conversation history. Each message has a role and content.
tools | array | Function definitions the model can call. See Tool Calling.
response_format | object | Constrain output to JSON or a JSON Schema. See Structured Outputs.
stream | boolean | Return tokens incrementally via SSE. See Streaming.
stream_options | object | Options for streaming (e.g. include_usage).
reasoning_effort | string | Control thinking depth for reasoning models. See Reasoning.
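
To illustrate how stream and stream_options fit together, here is a hedged sketch that builds a streaming request body and reassembles text from SSE lines. It assumes Routeway emits chunks in the OpenAI streaming shape (data: lines carrying choices[].delta.content, a final usage chunk when include_usage is set, and a data: [DONE] terminator). The sample lines are hand-written for illustration, not captured output, and both function names are hypothetical.

```python
import json


def build_streaming_payload(model, messages):
    """Request body that asks for SSE streaming plus a final usage chunk."""
    return {
        "model": model,
        "messages": messages,
        "stream": True,
        "stream_options": {"include_usage": True},
    }


def collect_stream(lines):
    """Join streamed text deltas and capture usage from SSE 'data:' lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream marker, as in the OpenAI format
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
        # The usage chunk is assumed to carry an empty choices list.
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                text_parts.append(content)
    return "".join(text_parts), usage


payload = build_streaming_payload(
    "gpt-4o-mini",
    [{"role": "user", "content": "What is the capital of France?"}],
)

# Hand-written sample stream (illustrative only).
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "Par"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "is."}}]}',
    'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
    'data: {"choices": [], "usage": {"prompt_tokens": 24, "completion_tokens": 2, "total_tokens": 26}}',
    "data: [DONE]",
]

text, usage = collect_stream(sample_lines)
print(text)
print(usage)
```

In practice the lines would come from the HTTP response body or from an SDK's streaming iterator; the parsing logic stays the same either way.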

Learn More

Streaming

Receive tokens as they are generated for real-time output.

Tool Calling

Let the model invoke functions and external tools.

Structured Outputs

Constrain responses to JSON or a specific schema.

Reasoning

Use extended thinking for complex multi-step problems.

Multimodal Inputs

Send images and other media alongside text messages.

Caching

Reduce costs and latency with prompt caching.

Presets

Apply reusable configurations to your requests.