Skip to main content
Routeway gives you access to 200+ models from providers like OpenAI, Anthropic, Google, Mistral, DeepSeek, xAI, and more — all through a single API key and a single base URL.

How Routing Works

When you send a request, Routeway inspects the model field, resolves the correct upstream provider, forwards your request using that provider’s credentials, and returns a normalized response in the standard OpenAI format. You never manage provider credentials, SDKs, or per-provider quirks. Routeway handles it.

Model IDs

Use the provider-native model ID as the model value. Routeway maps it automatically.
# OpenAI model
client.chat.completions.create(model="gpt-4o", ...)

# Anthropic model — same client, same endpoint
client.chat.completions.create(model="claude-opus-4-5", ...)

# Google model
client.chat.completions.create(model="gemini-2.5-pro", ...)

# DeepSeek model
client.chat.completions.create(model="deepseek-chat", ...)
The full list of supported model IDs is on the Models page.

Choosing a Model

No single model is best for every task. Use this table as a starting point:
Use caseRecommended modelsWhy
General chat & Q&Agpt-4o-mini, gemini-2.0-flashFast, cheap, high quality
Complex reasoningo3, o4-mini, claude-opus-4-5Extended thinking, better accuracy
Code generationgpt-4o, kimi-k2.7-code, claude-opus-4-5Strong code benchmarks
Long documentsgemini-2.5-pro, claude-opus-4-51M+ token context windows
Image understandinggpt-4o, gemini-2.5-proBest vision capabilities
Low latency / high throughputgpt-4o-mini, gemini-2.0-flashSub-second TTFT at scale
Cost-sensitive pipelinesdeepseek-chat, gpt-4o-miniLowest cost per token
When evaluating models, run the same set of representative prompts through several candidates and compare quality, latency, and cost. The best model is the one that meets your quality bar at the lowest cost.

Model Capabilities at a Glance

Text generation

All models. The core capability — generating coherent, contextual text from a prompt.

Vision / multimodal

gpt-4o, gpt-4o-mini, claude-opus-4-5, gemini-2.5-pro. Send images and PDFs alongside text.

Tool calling

gpt-4o, gpt-4o-mini, claude-*, gemini-* and most frontier models. Let models invoke your functions.

Structured outputs

gpt-4o, gpt-4o-mini. Guarantees JSON schema conformance.

Reasoning

o3, o4-mini, claude-opus-4-5 (extended thinking), deepseek-r1. Extended deliberation before answering.

Prompt caching

gpt-4o, claude-*, gemini-*. Reuse processed prefix tokens to cut cost and latency.

Switching Models

Because Routeway uses the OpenAI-compatible format, switching models is a one-line change:
# Before
response = client.chat.completions.create(model="gpt-4o", messages=messages)

# After — same client, same code, different model
response = client.chat.completions.create(model="claude-opus-4-5", messages=messages)
Some parameters are model-specific. For example, reasoning_effort applies to OpenAI o-series models, while Anthropic’s extended thinking uses a thinking block. Routeway normalizes responses but does not translate provider-specific request parameters.

Model Versioning

Frontier models are regularly updated. To avoid unexpected behavior changes in production, pin to a specific model snapshot:
# Pinned — behavior is stable
model = "gpt-4o-2024-11-20"

# Floating alias — always the latest, may change
model = "gpt-4o"
Use floating aliases during development to get the latest improvements, then pin to a snapshot before deploying to production.

Pricing

Each model has its own input and output token pricing. Costs are deducted from your Routeway balance per request.
  • Input tokens — tokens in your messages array (including the system prompt)
  • Output tokens — tokens generated by the model
  • Reasoning tokens — internal thinking tokens on reasoning models (billed separately)
  • Cached tokens — prompt tokens served from cache (discounted on supported models)
See the Models page for per-model pricing and the Billing page for balance and usage details.