Chatting

Ask questions and get grounded answers from POST /api/v1/chat.

Audience: developers integrating the chat API. What you will accomplish: build a correct request body, pick the right mode, and read every field of the response.

The endpoint

POST http://127.0.0.1:8000/api/v1/chat
Content-Type: application/json
X-User-Id: <stable id>          # optional, scopes conversation memory

Only q is required. Everything else is an optional per-request override of the server default.

Request body

Field	Type	Default	Notes
`q`	string	—	Your question (1–2000 chars).
`mode`	`strict` | `open` | `learning` | `learning_review`	server default	See the modes below.
`lang`	`auto` | `en` | `ar` | `pt`	`auto`	Forces the reply language. `pt` means European Portuguese. `auto` detects Arabic, Portuguese, or English.
`top_k`	int 1–10	3	How many document chunks to consider.
`score_threshold`	float 0–1	0.3	Minimum relevance for a chunk to be used.
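The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_chat_body` is a hypothetical helper, and it assumes that any field left out of the body falls back to the server default, as described above.

```python
from typing import Any, Dict, Optional

MODES = {"strict", "open", "learning", "learning_review"}
LANGS = {"auto", "en", "ar", "pt"}


def build_chat_body(
    q: str,
    mode: Optional[str] = None,
    lang: Optional[str] = None,
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Build a request body; raise ValueError for values outside the documented ranges."""
    if not 1 <= len(q) <= 2000:
        raise ValueError("q must be 1-2000 characters")
    body: Dict[str, Any] = {"q": q}

    if mode is not None:
        if mode not in MODES:
            raise ValueError(f"mode must be one of {sorted(MODES)}")
        body["mode"] = mode
    if lang is not None:
        if lang not in LANGS:
            raise ValueError(f"lang must be one of {sorted(LANGS)}")
        body["lang"] = lang
    if top_k is not None:
        if not 1 <= top_k <= 10:
            raise ValueError("top_k must be 1-10")
        body["top_k"] = top_k
    if score_threshold is not None:
        if not 0 <= score_threshold <= 1:
            raise ValueError("score_threshold must be 0-1")
        body["score_threshold"] = score_threshold

    # Omitted fields are left out so the server default applies.
    return body
```

Calling `build_chat_body("what is the return policy?", mode="strict", top_k=3)` yields a body containing only `q`, `mode`, and `top_k`.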

The four modes

Mode	What it does
strict	Answers only from approved documents. If nothing relevant is found — or the drafted answer is not actually supported — it says it does not have the information. Best for policy/regulated answers.
open	Prefers documents, but may use general knowledge and tells you when it does.
learning	Like open, but when no document matches it synthesizes an answer and saves it to a separate learning store for future questions (never mixed into authoritative answers).
learning_review	Like learning, but synthesized answers are queued for a human to approve before they are saved — unverified answers never enter the knowledge base on their own.
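One way to read the table is as a short decision sequence. The sketch below is illustrative only: `pick_mode` and its three flags are hypothetical names chosen for this example, not API parameters, and the mapping simply restates the table.

```python
def pick_mode(
    documents_only: bool,
    save_synthesized: bool = False,
    human_review: bool = False,
) -> str:
    """Map three yes/no requirements onto one of the four documented modes."""
    if documents_only:
        # Answers must come from approved documents, otherwise the bot declines.
        return "strict"
    if not save_synthesized:
        # General knowledge is allowed, but nothing is stored.
        return "open"
    # Synthesized answers are kept: queued for approval first, or saved directly.
    return "learning_review" if human_review else "learning"
```

The returned string can be passed as the `mode` field of the request body.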

Make the request

curl -X POST http://127.0.0.1:8000/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "X-User-Id: user_123" \
  -d '{"q":"what is the return policy?","mode":"strict","top_k":3}'

const res = await fetch("http://127.0.0.1:8000/api/v1/chat", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-User-Id": "user_123",
  },
  body: JSON.stringify({ q: "what is the return policy?", mode: "strict", top_k: 3 }),
});
const data = await res.json();
console.log(data.answer, data.meta.grounded, data.sources);

import requests

res = requests.post(
  "http://127.0.0.1:8000/api/v1/chat",
  headers={"X-User-Id": "user_123"},
  json={"q": "what is the return policy?", "mode": "strict", "top_k": 3},
)
data = res.json()
print(data["answer"], data["meta"]["grounded"], data["sources"])

Reading the response

{
  "answer": "Returns are accepted within 30 days of purchase.",
  "sources": [
    {"label": "return_policy.pdf", "doc_id": "return_policy", "score": 0.82, "page": 3, "snippet": "Customers may return..."}
  ],
  "meta": {"mode": "strict", "lang": "en", "self_ingested": false, "grounded": "supported", "grounded_score": 0.83, "correlation_id": "…", "model": "gpt-4o-mini"}
}

answer — the reply text.
sources — the citations behind it. score is relevance (0–1); page/snippet help you verify. In strict mode with no match, sources is [] and the bot declines.
meta.grounded — how well the answer is backed by the cited documents: supported, partial, or unsupported (with grounded_score, the fraction of the answer supported). It reflects whether the answer is true to the sources, not just whether similar text was found. In strict mode an unsupported answer is automatically replaced by a refusal and its sources cleared. It is null when there were no documents to check against.
meta.self_ingested — true only in learning mode when the answer was saved.
meta.correlation_id — quote this in support requests (also returned as the X-Correlation-Id header).
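The field notes above can be folded into a small helper that reduces a response to what a UI usually needs. This is a sketch under assumptions: `summarize_response` is a hypothetical name, the "high" / "medium" / "low" labels are an illustrative mapping rather than anything the API returns, and `page` is treated as optional in case a source has no page number.

```python
from typing import Any, Dict


def summarize_response(data: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a chat response to a refusal flag, a confidence label, and citations."""
    meta = data.get("meta") or {}
    sources = data.get("sources") or []
    grounded = meta.get("grounded")  # "supported", "partial", "unsupported", or None

    # In strict mode, an empty sources list means the bot declined to answer.
    declined = bool(meta.get("mode") == "strict" and not sources)

    if declined:
        confidence = "declined"
    elif grounded == "supported":
        confidence = "high"
    elif grounded == "partial":
        confidence = "medium"
    elif grounded == "unsupported":
        confidence = "low"
    else:
        # grounded is null when there were no documents to check against.
        confidence = "unchecked"

    citations = []
    for s in sources:
        page = f" p.{s['page']}" if s.get("page") is not None else ""
        citations.append(f"{s['label']}{page} (score {s['score']:.2f})")

    return {
        "declined": declined,
        "confidence": confidence,
        "citations": citations,
        "correlation_id": meta.get("correlation_id"),
    }
```

For the sample response above, the helper reports a confidence of `high` and a single citation for `return_policy.pdf`.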

Verify your result

Verify: For an on-topic question with ingested docs, answer is non-empty and sources[] is populated.
Verify: meta.grounded is supported or partial; treat unsupported (open mode) as low confidence.
Verify: Switching mode per request changes behavior without any server config change.
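The first two checks can be automated against a parsed response. The sketch below assumes an on-topic question with ingested documents; `verify_result` is a hypothetical helper name. The third check needs two live requests with different modes, so it is not covered here.

```python
from typing import Any, Dict, List


def verify_result(data: Dict[str, Any]) -> List[str]:
    """Return a list of failed checks; an empty list means the response looks healthy."""
    problems = []
    if not data.get("answer"):
        problems.append("answer is empty")
    if not data.get("sources"):
        problems.append("sources is empty")
    grounded = (data.get("meta") or {}).get("grounded")
    if grounded not in ("supported", "partial"):
        problems.append(f"meta.grounded is {grounded!r}")
    return problems
```

An empty list means both checks passed. Each string in a non-empty list names the check that failed.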

Common mistakes and fixes

Empty q → 422 validation error; check errors[] for the offending field.
Strict mode returns no answer → that is a refusal, not a failure. Either the question is off-topic for your corpus or the drafted answer was not supported by the sources. Ingest a relevant document or switch to open mode. See Trust & citations.
top_k or score_threshold out of range → top_k must be 1–10 and score_threshold must be 0–1.

Stream the same request token-by-token in Streaming (SSE).
Turn grounded into a confidence badge in Trust & citations.
Look up status codes in Errors & rate limits.
