Streaming
Stream Merius chat completions token by token as server-sent events, in the OpenAI format. curl, Python, and TypeScript examples.
Set stream: true to receive the completion as a series of server-sent events (SSE). Each event
carries a chunk you can render the moment it arrives, in the same format OpenAI uses.
Enable streaming
Add "stream": true to the request body. The response is a stream of chat-completion chunks; the
content you want is in choices[0].delta.content on each chunk. The stream ends with a
data: [DONE] sentinel.
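curl
curl https://api.merius.ai/v1/chat/completions \
  -H "Authorization: Bearer $MERIUS_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "z-ai/glm-5.1",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to five"}]
  }'

Python
import os
from openai import OpenAI
# Client setup is an assumption: the base URL is taken from the curl example above.
client = OpenAI(base_url="https://api.merius.ai/v1", api_key=os.environ["MERIUS_API_KEY"])
stream = client.chat.completions.create(
    model="z-ai/glm-5.1",
    messages=[{"role": "user", "content": "Count to five"}],
    stream=True,
)
for chunk in stream:
    # content is None on chunks that carry no text, such as the final one
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)

TypeScript
import OpenAI from "openai";
// Client setup is an assumption: the base URL is taken from the curl example above.
const client = new OpenAI({ baseURL: "https://api.merius.ai/v1", apiKey: process.env.MERIUS_API_KEY });
const stream = await client.chat.completions.create({
  model: "z-ai/glm-5.1",
  messages: [{ role: "user", content: "Count to five" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

The event format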
Over the wire, each event is a data: line containing one JSON chunk. Concatenating every
delta.content in order reconstructs the full message:
data: {"choices":[{"delta":{"role":"assistant","content":""}}]}
data: {"choices":[{"delta":{"content":"One"}}]}
data: {"choices":[{"delta":{"content":", two"}}]}
data: {"choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]

If you use an OpenAI SDK, it parses these events for you: iterate the stream as shown above and
read delta.content.
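
If no SDK is available, the event format described in this section can be parsed by hand. The following is a minimal sketch rather than Merius's own client code: the function name collect_stream and the sample list are hypothetical, and it assumes each event arrives as a single data: line, as in the sample events in this section. It concatenates every delta.content in order and stops at the [DONE] sentinel.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Rebuild the message text from raw SSE lines (hypothetical helper).

    Returns (text, finish_reason); finish_reason stays None if no chunk set one.
    """
    parts = []
    finish_reason = None
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The sentinel marks the end of the stream; it is not JSON.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        choice = choices[0]
        delta = choice.get("delta") or {}
        # content may be missing or empty on role-only and final chunks.
        parts.append(delta.get("content") or "")
        finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


# The sample events from this section, one string per line on the wire.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant","content":""}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"One"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":", two"}}]}',
    "",
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]

text, reason = collect_stream(sample)
print(text)    # One, two
print(reason)  # stop
```

In a real client the lines would come from the HTTP response body as it streams in, so each delta could be rendered immediately instead of being collected first; the list here only keeps the sketch self-contained.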