Routeway applies rate limits to ensure fair resource allocation and service stability. Rate limit policies depend on whether you are using free models or Pay-As-You-Go models.
Rate Limit Tiers
1. Free Models (Model IDs ending in :free)
Free models (e.g., deepseek-r1:free) are subject to the following rate limits:
- 5 Requests Per Minute (RPM)
- 200 Requests Per Day (RPD)
Once either limit is exceeded, requests to free models return a 429 status code.
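To stay under these limits proactively rather than reacting to 429 responses, a client can throttle itself. The following is a minimal sketch, not part of the Routeway API: the class name `FreeTierThrottle` and its methods are hypothetical, and it assumes a sliding window, which may be stricter than the windows the server actually uses.

```python
import time
from collections import deque


class FreeTierThrottle:
    """Client-side sliding-window limiter mirroring the documented free-tier limits.

    Hypothetical helper: the defaults (5 RPM, 200 RPD) come from the limits above;
    the sliding-window behavior is an assumption, not Routeway's server logic.
    """

    def __init__(self, per_minute=5, per_day=200, clock=time.monotonic):
        self.per_minute = per_minute
        self.per_day = per_day
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self._sent = deque()    # timestamps of requests recorded in the last 24 hours

    def wait_time(self):
        """Return the seconds to wait until one more request fits both limits."""
        now = self.clock()
        # Drop timestamps that have left the 24-hour window.
        while self._sent and now - self._sent[0] >= 86400:
            self._sent.popleft()
        wait = 0.0
        if len(self._sent) >= self.per_day:
            # Wait until the per_day-th most recent request leaves the day window.
            wait = max(wait, self._sent[-self.per_day] + 86400 - now)
        recent = [t for t in self._sent if now - t < 60]
        if len(recent) >= self.per_minute:
            # Wait until the per_minute-th most recent request leaves the minute window.
            wait = max(wait, recent[-self.per_minute] + 60 - now)
        return wait

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())
```

A caller would sleep for `wait_time()` seconds before each request to a free model and call `record()` right after sending it. Retry logic for 429 responses is still needed, because the server's counters are authoritative.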
2. Pay-As-You-Go Models
For Pay-As-You-Go models (all model IDs without the :free suffix), Routeway does not enforce API-level rate limits.
However, standard rate limits and DDoS protection remain active at the network edge to protect service availability and prevent abuse.
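Because the tier is determined by the model ID suffix alone, a client can decide up front which limits apply. This is a small sketch under that assumption: the helper names `is_free_model` and `api_level_limits` are hypothetical, and `deepseek-r1` without the suffix is used purely as an illustrative Pay-As-You-Go ID.

```python
from typing import Optional

FREE_SUFFIX = ":free"  # suffix that marks free models, per the tiers above

# Documented free-tier limits; Pay-As-You-Go models have no API-level limit.
FREE_LIMITS = {"rpm": 5, "rpd": 200}


def is_free_model(model_id: str) -> bool:
    """Return True when the model ID ends in the free-tier suffix."""
    return model_id.endswith(FREE_SUFFIX)


def api_level_limits(model_id: str) -> Optional[dict]:
    """Return the documented API-level limits, or None for Pay-As-You-Go models."""
    return dict(FREE_LIMITS) if is_free_model(model_id) else None
```

A result of `None` means only that no API-level limit is documented; network-edge protections may still reject abusive traffic.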
Rate Limit Response Schema
When you exceed a rate limit, the API returns an HTTP 429 Too Many Requests status code with the following JSON payload:
{
  "error": {
    "message": "API key per-minute rate limit exceeded (60 RPM).",
    "type": "rate_limit_error",
    "param": null,
    "code": "rate_limit_exceeded",
    "status_code": 429,
    "id": "b3fca8a1",
    "docs": "https://docs.routeway.ai",
    "support": "https://discord.gg/RjX2CpdPpd"
  }
}
| Field | Description |
|---|---|
| message | Human-readable explanation of which limit was hit. |
| type | Always rate_limit_error. |
| code | Machine-readable error code (rate_limit_exceeded). |
| status_code | HTTP status code (429). |
| id | Unique trace/request identifier for debugging. |
| docs | Link to this documentation. |
| support | Link to our Discord support server. |
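The payload above can be turned into a typed value so that application code branches on `code` rather than on the message text. This is a sketch only: the `RateLimitError` dataclass and `parse_rate_limit_error` function are hypothetical names, and the parser assumes the body has exactly the shape shown in the example.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateLimitError:
    """Subset of the documented error fields that is useful for handling and logging."""
    message: str
    code: str
    status_code: int
    request_id: str  # the "id" field, handy when contacting support


def parse_rate_limit_error(body: str) -> Optional[RateLimitError]:
    """Return a RateLimitError if the body is a rate-limit payload, else None."""
    try:
        err = json.loads(body).get("error")
    except (TypeError, ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return None
    if not isinstance(err, dict) or err.get("type") != "rate_limit_error":
        return None
    return RateLimitError(
        message=err.get("message", ""),
        code=err.get("code", ""),
        status_code=err.get("status_code", 429),
        request_id=err.get("id", ""),
    )
```

Logging `request_id` alongside the message gives support staff the trace identifier described in the table.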
Rate-limited responses include headers to help you manage retry timing.
All Rate-Limited Responses
| Header | Description |
|---|---|
| Retry-After | Seconds the client must wait before making another request. |
Minute-based Limits
| Header | Description |
|---|---|
| X-RateLimit-Limit-Minute | Total requests allowed per minute window. |
| X-RateLimit-Remaining-Minute | Remaining requests in the current minute window. |
| X-RateLimit-Reset-Minute | Seconds until the minute window resets. |
Day-based Limits
| Header | Description |
|---|---|
| X-RateLimit-Limit-Day | Total requests allowed per day. |
| X-RateLimit-Remaining-Day | Remaining requests in the current day. |
| X-RateLimit-Reset-Day | Seconds until the daily window resets (at midnight UTC). |
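The headers above can be combined into a single wait time. The sketch below assumes the header values are integer strings, as the descriptions suggest, and that any header may be absent. The function names are hypothetical, and `headers` can be any mapping, such as the headers object of an HTTP client response.

```python
from typing import Mapping, Optional


def _int_header(headers: Mapping[str, str], name: str) -> Optional[int]:
    """Look up a header case-insensitively and parse it as an integer."""
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return int(lowered.get(name.lower()))
    except (TypeError, ValueError):
        # Header missing or not an integer.
        return None


def seconds_until_allowed(headers: Mapping[str, str]) -> int:
    """Return how long to wait, in seconds, based on the documented headers."""
    retry_after = _int_header(headers, "Retry-After")
    if retry_after is not None:
        # Retry-After applies to every rate-limited response, so it takes priority.
        return retry_after
    wait = 0
    for window in ("Minute", "Day"):
        remaining = _int_header(headers, f"X-RateLimit-Remaining-{window}")
        reset = _int_header(headers, f"X-RateLimit-Reset-{window}")
        # An exhausted window blocks requests until it resets.
        if remaining == 0 and reset is not None:
            wait = max(wait, reset)
    return wait
```

When both windows are exhausted the longer reset wins, since a request has to satisfy the per-minute and the per-day limit at the same time.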
Best Practices: Handling 429 Errors
We recommend implementing client-side retry logic using exponential backoff with jitter to handle rate limits gracefully.
JavaScript

async function fetchWithRetry(fn, retries = 5, delay = 1000) {
  try {
    return await fn();
  } catch (error) {
    if (error.status === 429 && retries > 0) {
      // Prefer the Retry-After header (seconds); otherwise back off exponentially with jitter
      const retryAfter = parseInt(error.headers?.get('Retry-After'), 10);
      const waitMs = Number.isNaN(retryAfter)
        ? delay + Math.random() * 200
        : retryAfter * 1000;
      console.warn(`Rate limited. Retrying in ${Math.round(waitMs)}ms...`);
      await new Promise(resolve => setTimeout(resolve, waitMs));
      // Double the base delay for the next attempt
      return fetchWithRetry(fn, retries - 1, delay * 2);
    }
    throw error;
  }
}
Python

import random
import time

def call_with_retry(api_func, retries=5, delay=1.0):
    for attempt in range(retries):
        try:
            return api_func()
        except Exception as e:
            # Re-raise anything that is not a 429, and the final failed attempt
            if getattr(e, 'status_code', None) != 429 or attempt == retries - 1:
                raise
            # Prefer the Retry-After header (seconds); otherwise back off exponentially with jitter
            retry_after = (getattr(e, 'headers', None) or {}).get('Retry-After')
            if retry_after:
                sleep_time = float(retry_after)
            else:
                sleep_time = delay * (2 ** attempt) + random.uniform(0, 0.2)
            print(f"Rate limited. Retrying in {sleep_time:.2f}s...")
            time.sleep(sleep_time)