Models & Providers - Routeway Docs

Routeway gives you access to 200+ models from providers like OpenAI, Anthropic, Google, Mistral, DeepSeek, xAI, and more — all through a single API key and a single base URL.

How Routing Works

When you send a request, Routeway inspects the model field, resolves the correct upstream provider, forwards your request using that provider’s credentials, and returns a normalized response in the standard OpenAI format. You never manage provider credentials, SDKs, or per-provider response formats; Routeway handles them. The one exception is provider-specific request parameters, which are covered under Switching Models.
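The resolution step described above can be pictured with a small lookup. This is only an illustrative sketch: the prefix table and the `resolve_provider` and `route` helpers are hypothetical names, and Routeway's actual mapping is internal and may work differently.

```python
# Hypothetical prefix table for illustration only; the real mapping is internal to Routeway.
PROVIDER_PREFIXES = {
    "gpt-": "openai",
    "o3": "openai",
    "o4-": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "deepseek-": "deepseek",
}


def resolve_provider(model: str) -> str:
    """Return the provider whose prefix matches the start of the model ID."""
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unknown model ID: {model!r}")


def route(request: dict) -> dict:
    """Return a copy of the request tagged with its provider (no network call is made)."""
    return {**request, "provider": resolve_provider(request["model"])}
```

Calling `route({"model": "claude-opus-4-5", "messages": []})` would tag the request with `"anthropic"` under this toy table; an unrecognized ID raises `ValueError`.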

Model IDs

Use the provider-native model ID as the model value. Routeway maps it to the correct provider automatically.

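# Setup sketch: assumes the OpenAI Python SDK; both values are placeholders to replace with your own
from openai import OpenAI
client = OpenAI(api_key="YOUR_ROUTEWAY_API_KEY", base_url="YOUR_ROUTEWAY_BASE_URL")

# OpenAI model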
client.chat.completions.create(model="gpt-4o", ...)

# Anthropic model — same client, same endpoint
client.chat.completions.create(model="claude-opus-4-5", ...)

# Google model
client.chat.completions.create(model="gemini-2.5-pro", ...)

# DeepSeek model
client.chat.completions.create(model="deepseek-chat", ...)

The full list of supported model IDs is on the Models page.

Choosing a Model

No single model is best for every task. Use this table as a starting point:

Use case	Recommended models	Why
General chat & Q&A	`gpt-4o-mini`, `gemini-2.0-flash`	Fast, cheap, high quality
Complex reasoning	`o3`, `o4-mini`, `claude-opus-4-5`	Extended thinking, better accuracy
Code generation	`gpt-4o`, `kimi-k2.7-code`, `claude-opus-4-5`	Strong code benchmarks
Long documents	`gemini-2.5-pro`, `claude-opus-4-5`	Large context windows, up to 1M tokens depending on the model
Image understanding	`gpt-4o`, `gemini-2.5-pro`	Best vision capabilities
Low latency / high throughput	`gpt-4o-mini`, `gemini-2.0-flash`	Sub-second TTFT at scale
Cost-sensitive pipelines	`deepseek-chat`, `gpt-4o-mini`	Lowest cost per token

When evaluating models, run the same set of representative prompts through several candidates and compare quality, latency, and cost. The best model is the one that meets your quality bar at the lowest cost.
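The selection rule in the last sentence (meet the quality bar, then minimize cost) can be sketched as follows. The `EvalResult` type and `pick_model` helper are hypothetical, and the scores you feed in would come from your own evaluation runs; nothing here calls the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalResult:
    model: str
    quality: float    # mean score on your own rubric, 0.0 to 1.0
    latency_s: float  # mean end-to-end latency in seconds
    cost_usd: float   # total cost of running the prompt set


def pick_model(results: list, quality_bar: float) -> Optional[EvalResult]:
    """Return the cheapest result that meets the quality bar, or None if none does."""
    passing = [r for r in results if r.quality >= quality_bar]
    if not passing:
        return None
    # Break cost ties in favor of the lower-latency candidate.
    return min(passing, key=lambda r: (r.cost_usd, r.latency_s))
```

If no candidate clears the bar, the function returns `None` rather than falling back to the best available model, so a failed evaluation is visible to the caller.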

Model Capabilities at a Glance

Text generation

All models. The core capability — generating coherent, contextual text from a prompt.

Vision / multimodal

gpt-4o, gpt-4o-mini, claude-opus-4-5, gemini-2.5-pro. Send images and PDFs alongside text.

Tool calling

gpt-4o, gpt-4o-mini, claude-*, gemini-* and most frontier models. Let models invoke your functions.

Structured outputs

gpt-4o, gpt-4o-mini. Guarantees JSON schema conformance.

Reasoning

o3, o4-mini, claude-opus-4-5 (extended thinking), deepseek-r1. Extended deliberation before answering.

Prompt caching

gpt-4o, claude-*, gemini-*. Reuse processed prefix tokens to cut cost and latency.
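For a quick pre-flight check in your own code, the lists above can be transcribed into a lookup table. This sketch is a snapshot of this page, not an API: `CAPABILITIES` and `supports` are hypothetical names, and "most frontier models" is not captured, so the Models page remains the authoritative source.

```python
from fnmatch import fnmatchcase

# Transcribed from the capability lists on this page; patterns use shell-style wildcards.
CAPABILITIES = {
    "text": ["*"],
    "vision": ["gpt-4o", "gpt-4o-mini", "claude-opus-4-5", "gemini-2.5-pro"],
    "tool_calling": ["gpt-4o", "gpt-4o-mini", "claude-*", "gemini-*"],
    "structured_outputs": ["gpt-4o", "gpt-4o-mini"],
    "reasoning": ["o3", "o4-mini", "claude-opus-4-5", "deepseek-r1"],
    "prompt_caching": ["gpt-4o", "claude-*", "gemini-*"],
}


def supports(model: str, capability: str) -> bool:
    """Return True if the model ID matches any pattern listed for the capability."""
    patterns = CAPABILITIES[capability]  # raises KeyError for an unknown capability
    return any(fnmatchcase(model, pattern) for pattern in patterns)
```

Exact entries such as `gpt-4o` match only that ID, so a pinned snapshot ID would need its own entry or a wildcard pattern.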

Switching Models

Because Routeway uses the OpenAI-compatible format, switching models is a one-line change:

# Before
response = client.chat.completions.create(model="gpt-4o", messages=messages)

# After — same client, same code, different model
response = client.chat.completions.create(model="claude-opus-4-5", messages=messages)

Some parameters are model-specific. For example, reasoning_effort applies to OpenAI o-series models, while Anthropic’s extended thinking uses a thinking block. Routeway normalizes responses but does not translate provider-specific request parameters, so add or remove them yourself when you switch models.
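One way to keep a model switch to a single line is to build request arguments through a helper that only attaches a parameter when the target model accepts it. This is a minimal sketch: `build_params` is a hypothetical helper, and detecting o-series models by the `o3` and `o4` prefixes is an assumption based on the IDs shown on this page.

```python
from typing import Optional

# Assumption: o-series model IDs on this page start with one of these prefixes.
O_SERIES_PREFIXES = ("o3", "o4")


def build_params(model: str, messages: list, reasoning_effort: Optional[str] = None) -> dict:
    """Build keyword arguments for a chat completion, dropping parameters the model does not take."""
    params = {"model": model, "messages": messages}
    if reasoning_effort is not None and model.startswith(O_SERIES_PREFIXES):
        params["reasoning_effort"] = reasoning_effort
    return params
```

The result can be passed as `client.chat.completions.create(**build_params(...))`. The helper silently drops `reasoning_effort` for other models; raising an error instead is a reasonable alternative if you prefer to catch the mismatch early.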

Model Versioning

Frontier models are regularly updated. To avoid unexpected behavior changes in production, pin to a specific model snapshot:

# Pinned — behavior is stable
model = "gpt-4o-2024-11-20"

# Floating alias — always the latest, may change
model = "gpt-4o"

Use floating aliases during development to get the latest improvements, then pin to a snapshot before deploying to production.
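The development-versus-production rule can be enforced in configuration. The sketch below assumes snapshot IDs end in a `YYYY-MM-DD` date, as in the `gpt-4o-2024-11-20` example; other providers may format snapshot IDs differently, and `is_pinned`, `MODEL_BY_ENV`, and `model_for` are hypothetical names.

```python
import re

# Heuristic: treat an ID as pinned if it ends in a -YYYY-MM-DD date suffix.
_SNAPSHOT_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")

MODEL_BY_ENV = {
    "development": "gpt-4o",            # floating alias
    "production": "gpt-4o-2024-11-20",  # pinned snapshot
}


def is_pinned(model: str) -> bool:
    """Return True if the model ID carries a date-style snapshot suffix."""
    return bool(_SNAPSHOT_SUFFIX.search(model))


def model_for(env: str) -> str:
    """Look up the model for an environment, refusing floating aliases in production."""
    model = MODEL_BY_ENV[env]
    if env == "production" and not is_pinned(model):
        raise ValueError(f"production must use a pinned snapshot, got {model!r}")
    return model
```

Running the production check at startup turns an accidental floating alias into an immediate configuration error rather than a silent behavior change later.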

Pricing

Each model has its own input and output token pricing. Costs are deducted from your Routeway balance per request.

Input tokens — tokens in your messages array (including the system prompt)
Output tokens — tokens generated by the model
Reasoning tokens — internal thinking tokens on reasoning models (billed separately)
Cached tokens — prompt tokens served from cache (discounted on supported models)

See the Models page for per-model pricing and the Billing page for balance and usage details.
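As a rough illustration of how the four token categories combine into a per-request cost, the sketch below multiplies each count by a per-million-token price. It assumes the four counts are disjoint and that prices are quoted per million tokens; both are assumptions, and `estimate_cost` is a hypothetical helper, so check the Billing page for how Routeway actually reports usage.

```python
TOKEN_KINDS = ("input", "output", "reasoning", "cached")


def estimate_cost(usage: dict, prices_per_million: dict) -> float:
    """Sum count * price for each token kind; missing kinds count as zero."""
    total = 0.0
    for kind in TOKEN_KINDS:
        tokens = usage.get(kind, 0)
        price = prices_per_million.get(kind, 0.0)  # USD per 1,000,000 tokens
        total += tokens * price / 1_000_000
    return total
```

Pass token counts from a response alongside the per-model prices from the Models page; any prices used while testing this helper are placeholders, not real rates.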
