How Routing Works
When you send a request, Routeway inspects themodel field, resolves the correct upstream provider, forwards your request using that provider’s credentials, and returns a normalized response in the standard OpenAI format.
You never manage provider credentials, SDKs, or per-provider quirks. Routeway handles it.
Model IDs
Use the provider-native model ID as themodel value. Routeway maps it automatically.
Choosing a Model
No single model is best for every task. Use this table as a starting point:| Use case | Recommended models | Why |
|---|---|---|
| General chat & Q&A | gpt-4o-mini, gemini-2.0-flash | Fast, cheap, high quality |
| Complex reasoning | o3, o4-mini, claude-opus-4-5 | Extended thinking, better accuracy |
| Code generation | gpt-4o, kimi-k2.7-code, claude-opus-4-5 | Strong code benchmarks |
| Long documents | gemini-2.5-pro, claude-opus-4-5 | 1M+ token context windows |
| Image understanding | gpt-4o, gemini-2.5-pro | Best vision capabilities |
| Low latency / high throughput | gpt-4o-mini, gemini-2.0-flash | Sub-second TTFT at scale |
| Cost-sensitive pipelines | deepseek-chat, gpt-4o-mini | Lowest cost per token |
Model Capabilities at a Glance
Text generation
All models. The core capability — generating coherent, contextual text from a prompt.
Vision / multimodal
gpt-4o, gpt-4o-mini, claude-opus-4-5, gemini-2.5-pro. Send images and PDFs alongside text.Tool calling
gpt-4o, gpt-4o-mini, claude-*, gemini-* and most frontier models. Let models invoke your functions.Structured outputs
gpt-4o, gpt-4o-mini. Guarantees JSON schema conformance.Reasoning
o3, o4-mini, claude-opus-4-5 (extended thinking), deepseek-r1. Extended deliberation before answering.Prompt caching
gpt-4o, claude-*, gemini-*. Reuse processed prefix tokens to cut cost and latency.Switching Models
Because Routeway uses the OpenAI-compatible format, switching models is a one-line change:Model Versioning
Frontier models are regularly updated. To avoid unexpected behavior changes in production, pin to a specific model snapshot:Pricing
Each model has its own input and output token pricing. Costs are deducted from your Routeway balance per request.- Input tokens — tokens in your
messagesarray (including the system prompt) - Output tokens — tokens generated by the model
- Reasoning tokens — internal thinking tokens on reasoning models (billed separately)
- Cached tokens — prompt tokens served from cache (discounted on supported models)