Every request to Routeway is built around a messages array — an ordered list of turns that represents the full conversation context. The model reads the entire array on each call and generates the next turn.

Message Structure

Each message object has two required fields:
{
  "role": "user",
  "content": "What is the capital of France?"
}
Field    Type            Description
role     string          Who sent the message: system, user, or assistant
content  string | array  The message text, or a mixed array of text and media objects
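As a rough illustration of the table above, the following sketch checks a message on the client side before it is sent. The helper name `validate_message` and the `VALID_ROLES` set are hypothetical, not part of Routeway or the OpenAI SDK, and the check only mirrors the three roles and two content shapes listed here; the API itself may accept more.

```python
# Roles taken from the table above; the real API may allow others.
VALID_ROLES = {"system", "user", "assistant"}


def validate_message(message: dict) -> None:
    """Raise ValueError if `message` lacks a valid role or content field."""
    role = message.get("role")
    if role not in VALID_ROLES:
        raise ValueError(f"invalid role: {role!r}")

    content = message.get("content")
    # Plain string content is always acceptable.
    if isinstance(content, str):
        return
    # Array content must be a list of objects that each carry a "type" key.
    if isinstance(content, list) and all(
        isinstance(part, dict) and "type" in part for part in content
    ):
        return
    raise ValueError("content must be a string or a list of typed objects")


validate_message({"role": "user", "content": "What is the capital of France?"})
```

A check like this catches malformed messages locally instead of waiting for an error response from the server.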

Roles

system

Sets the model’s behavior, persona, and constraints for the entire conversation. It comes before any user message, and models generally give its instructions precedence over later turns.
{
  "role": "system",
  "content": "You are a concise technical assistant. Reply in plain text only, no markdown."
}
Use system to:
  • Define the model’s persona or tone
  • Restrict the model to a specific domain
  • Provide background context or instructions that apply globally
Put your system message first and keep it stable across turns. The model treats it as ground truth for the entire session.

user

A message from the end user (or your application acting as the user). This is the primary way to send instructions and content to the model.
{
  "role": "user",
  "content": "Explain how HTTPS works in one paragraph."
}

assistant

A message generated by the model. When building multi-turn conversations, append the model’s previous responses as assistant messages so it can refer back to them.
{
  "role": "assistant",
  "content": "HTTPS uses TLS to encrypt data between a client and server..."
}
You can also inject assistant messages manually: for example, to prime the model with a specific tone, or to resume an earlier conversation from a shortened history instead of replaying every prior turn.

Multi-turn Conversations

To maintain context across turns, include all previous messages in each new request. The model has no memory between API calls — the messages array is the memory.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routeway.ai/v1",
    api_key=os.getenv("ROUTEWAY_API_KEY")
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."}
]

# Turn 1
messages.append({"role": "user", "content": "What's the boiling point of water?"})
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": reply})
print(reply)

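# Turn 2: the model sees turn 1 only because it is still in `messages`
messages.append({"role": "user", "content": "And at high altitude?"})
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": reply})  # keep the history complete for a turn 3
print(reply)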
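The append-call-append cycle above repeats on every turn, so it can be wrapped in a small helper. The sketch below is one possible shape, not a Routeway API: the `Conversation` class and the `send` callable are hypothetical, and `fake_send` stands in for the real network call so the example runs offline.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the running messages array for one chat session."""

    def __init__(self, send: Callable[[list[Message]], str], system_prompt: str):
        self._send = send
        self.messages: list[Message] = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text: str) -> str:
        """Append the user turn, call the model, and record its reply."""
        self.messages.append({"role": "user", "content": user_text})
        try:
            reply = self._send(self.messages)
        except Exception:
            # Roll back the user turn so a retry does not duplicate it.
            self.messages.pop()
            raise
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: list[Message]) -> str:
    # Stand-in for the API call: reports how many user turns it received.
    user_turns = sum(1 for m in messages if m["role"] == "user")
    return f"reply #{user_turns}"


chat = Conversation(fake_send, "You are a helpful assistant.")
chat.ask("What's the boiling point of water?")
chat.ask("And at high altitude?")
```

With the client from the example above, `send` could be a function that calls `client.chat.completions.create(model="gpt-4o-mini", messages=msgs)` and returns `response.choices[0].message.content`.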

Rich Content Messages

The content field can be an array of typed objects instead of a plain string. This is how you pass images, PDFs, and mixed media alongside text.
{
  "role": "user",
  "content": [
    { "type": "text", "text": "What does this chart show?" },
    { "type": "image_url", "image_url": { "url": "https://example.com/chart.png" } }
  ]
}
See Multimodal Inputs for the full reference.
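When the content array is assembled in application code, a small builder keeps the structure consistent. This is a minimal sketch that reproduces only the `text` and `image_url` object shapes shown above; the function name `build_user_message` is hypothetical, and other media types are not covered here.

```python
def build_user_message(text: str, image_urls: list[str]) -> dict:
    """Return a user message whose content mixes one text part with images."""
    content = [{"type": "text", "text": text}]
    for url in image_urls:
        # Each image is wrapped in the nested {"image_url": {"url": ...}} shape.
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


message = build_user_message(
    "What does this chart show?",
    ["https://example.com/chart.png"],
)
```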

Common Patterns

Give the model a name and personality via the system message. Consistent persona instructions reduce drift in long conversations.
{
  "role": "system",
  "content": "You are Aria, a friendly customer support agent for Acme Corp. Always greet users by name when known."
}
Prepend example user/assistant pairs to show the model the exact format you want.
[
  { "role": "user", "content": "Classify: 'I love this product!'" },
  { "role": "assistant", "content": "positive" },
  { "role": "user", "content": "Classify: 'Broke after one day.'" },
  { "role": "assistant", "content": "negative" },
  { "role": "user", "content": "Classify: 'It is okay I guess.'" }
]
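The few-shot array above can be generated from labeled pairs rather than written by hand. The sketch below assumes a hypothetical helper `few_shot_messages` and a prompt template matching the "Classify: '...'" wording used in the example; both are illustrative choices.

```python
def few_shot_messages(
    examples: list[tuple[str, str]],
    query: str,
    template: str = "Classify: '{text}'",
) -> list[dict]:
    """Turn (input, label) pairs into alternating user/assistant messages,
    followed by the real query as the final user message."""
    messages = []
    for text, label in examples:
        messages.append({"role": "user", "content": template.format(text=text)})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": template.format(text=query)})
    return messages


messages = few_shot_messages(
    [("I love this product!", "positive"), ("Broke after one day.", "negative")],
    "It is okay I guess.",
)
```

Using the same template for the examples and the final query keeps the format the model is asked to imitate identical across all turns.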
Inject retrieved documents or data into a user or system message before the question. This is the foundation of RAG (retrieval-augmented generation).
context = fetch_relevant_docs(user_query)
messages = [
    {"role": "system", "content": "Answer only using the provided context."},
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {user_query}"}
]
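The snippet above leaves `fetch_relevant_docs` undefined. To make the pattern concrete, here is a toy stand-in that ranks an in-memory list of strings by word overlap with the query. The `DOCUMENTS` list, the scoring rule, and the `top_k` parameter are all assumptions for illustration; a real retriever would typically use embeddings or a search index.

```python
import re

# Hypothetical in-memory corpus; a real system would query a document store.
DOCUMENTS = [
    "Routeway requests are built around a messages array.",
    "The system role sets behavior for the whole conversation.",
    "Exceeding the context window causes a 400 error.",
]


def _words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def fetch_relevant_docs(user_query: str, documents=DOCUMENTS, top_k: int = 2) -> str:
    """Return up to `top_k` documents sharing the most words with the query."""
    query_words = _words(user_query)
    scored = [(len(query_words & _words(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    hits = [doc for score, doc in ranked if score > 0][:top_k]
    return "\n".join(hits)


user_query = "What happens when I exceed the context window?"
context = fetch_relevant_docs(user_query)
messages = [
    {"role": "system", "content": "Answer only using the provided context."},
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {user_query}"},
]
```

Word overlap is noisy because common words such as "the" count as matches, which is one reason production retrievers rely on better relevance scoring.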

Context Window Limits

The messages array is bounded by the model’s context window: the maximum number of tokens it can process in a single request. Exceeding this limit causes a 400 error. For long conversations, trim old messages from the middle of the array, keeping the system message and the most recent turns. Prompt Caching does not raise the limit, but it can reduce costs when the prefix is stable. See Tokens & Context for more detail.
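The trimming strategy described here (keep the system message, keep the newest turns, drop the oldest) can be sketched as follows. The helper names are hypothetical, and `estimate_tokens` uses a crude assumption of about four characters per token; actual counts depend on the model's tokenizer, so leave a safety margin below the real limit.

```python
def estimate_tokens(message: dict) -> int:
    # Crude assumption: roughly 4 characters per token. Real tokenizers differ.
    content = message["content"]
    text = content if isinstance(content, str) else str(content)
    return len(text) // 4 + 1


def trim_messages(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep all system messages plus the newest turns that fit in max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m) for m in system)
    kept = []
    # Walk backwards from the newest turn and stop at the first that does not fit.
    for message in reversed(turns):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))


history = [{"role": "system", "content": "abcd"}] + [
    {"role": "user" if i % 2 == 0 else "assistant", "content": str(i) * 40}
    for i in range(6)
]
trimmed = trim_messages(history, max_tokens=29)
```

In this example only the system message and the two newest turns fit the budget. Note that trimming by count alone can leave an assistant message as the first remaining turn; dropping turns in user/assistant pairs avoids that.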