Chatting
Ask questions and get grounded answers from POST /api/v1/chat.
Audience: developers integrating the chat API. What you will accomplish: build a correct request body, pick the right mode, and read every field of the response.
The endpoint
POST http://127.0.0.1:8000/api/v1/chat
Content-Type: application/json
X-User-Id: <stable id> # optional, scopes conversation memory

Only q is required. Everything else is an optional per-request override of the server default.
Request body
| Field | Type | Default | Notes |
|---|---|---|---|
| q | string | none (required) | Your question (1–2000 chars). |
| mode | strict, open, learning, or learning_review | server default | See the modes below. |
| lang | auto, en, ar, or pt | auto | Force the reply language. pt = European Portuguese. Auto-detects Arabic / Portuguese / English. |
| top_k | int 1–10 | 3 | How many document chunks to consider. |
| score_threshold | float 0–1 | 0.3 | Minimum relevance for a chunk to be used. |
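The ranges in the table can be checked on the client before a request is sent, which avoids a round trip that would end in a validation error. The following is a minimal sketch under stated assumptions: build_chat_body is a hypothetical helper name, not part of the API, and the sketch assumes that leaving a field out of the body is how a caller asks for the server default.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the request-body table.
MODES = {"strict", "open", "learning", "learning_review"}
LANGS = {"auto", "en", "ar", "pt"}


def build_chat_body(
    q: str,
    mode: Optional[str] = None,
    lang: Optional[str] = None,
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate fields locally and return a JSON-ready dict.

    Optional fields left as None are omitted (assumed to mean "server default").
    """
    if not isinstance(q, str) or not 1 <= len(q) <= 2000:
        raise ValueError("q must be a string of 1-2000 characters")
    body: Dict[str, Any] = {"q": q}

    if mode is not None:
        if mode not in MODES:
            raise ValueError(f"mode must be one of {sorted(MODES)}")
        body["mode"] = mode

    if lang is not None:
        if lang not in LANGS:
            raise ValueError(f"lang must be one of {sorted(LANGS)}")
        body["lang"] = lang

    if top_k is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(top_k, bool) or not isinstance(top_k, int):
            raise ValueError("top_k must be an integer")
        if not 1 <= top_k <= 10:
            raise ValueError("top_k must be between 1 and 10")
        body["top_k"] = top_k

    if score_threshold is not None:
        if isinstance(score_threshold, bool) or not isinstance(score_threshold, (int, float)):
            raise ValueError("score_threshold must be a number")
        if not 0 <= score_threshold <= 1:
            raise ValueError("score_threshold must be between 0 and 1")
        body["score_threshold"] = float(score_threshold)

    return body
```

The returned dict can be passed as the JSON body of the POST request. The server remains the source of truth for validation; a local check like this only catches the obvious cases early.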
The four modes
| Mode | What it does |
|---|---|
| strict | Answers only from approved documents. If nothing relevant is found — or the drafted answer is not actually supported — it says it does not have the information. Best for policy/regulated answers. |
| open | Prefers documents, but may use general knowledge and tells you when it does. |
| learning | Like open, but when no document matches it synthesizes an answer and saves it to a separate learning store for future questions (never mixed into authoritative answers). |
| learning_review | Like learning, but synthesized answers are queued for a human to approve before they are saved — unverified answers never enter the knowledge base on their own. |
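One way to read the table is as a small decision rule. The sketch below is only an illustration of that reading: pick_mode and its three flags are hypothetical names, and real deployments may weigh other factors when choosing a mode.

```python
def pick_mode(
    regulated: bool,
    allow_synthesis: bool = False,
    human_review: bool = False,
) -> str:
    """Hypothetical helper mapping three yes/no requirements to a mode name."""
    if regulated:
        # Only approved documents may back the answer.
        return "strict"
    if not allow_synthesis:
        # General knowledge is acceptable, but nothing is saved for later.
        return "open"
    # Synthesized answers may be saved; optionally queue them for approval first.
    return "learning_review" if human_review else "learning"
```

For example, a policy help desk would pass regulated=True, while an internal assistant whose synthesized answers must be approved before reuse would pass allow_synthesis=True and human_review=True.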
Make the request
# curl
curl -X POST http://127.0.0.1:8000/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "X-User-Id: user_123" \
  -d '{"q":"what is the return policy?","mode":"strict","top_k":3}'

// JavaScript (fetch)
const res = await fetch("http://127.0.0.1:8000/api/v1/chat", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-User-Id": "user_123",
  },
  body: JSON.stringify({ q: "what is the return policy?", mode: "strict", top_k: 3 }),
});
const data = await res.json();
console.log(data.answer, data.meta.grounded, data.sources);

# Python (requests)
import requests

res = requests.post(
    "http://127.0.0.1:8000/api/v1/chat",
    headers={"X-User-Id": "user_123"},
    json={"q": "what is the return policy?", "mode": "strict", "top_k": 3},
)
data = res.json()
print(data["answer"], data["meta"]["grounded"], data["sources"])

Reading the response
{
"answer": "Returns are accepted within 30 days of purchase.",
"sources": [
{"label": "return_policy.pdf", "doc_id": "return_policy", "score": 0.82, "page": 3, "snippet": "Customers may return..."}
],
"meta": {"mode": "strict", "lang": "en", "self_ingested": false, "grounded": "supported", "grounded_score": 0.83, "correlation_id": "…", "model": "gpt-4o-mini"}
}
- answer: the reply text.
- sources: the citations behind it. score is relevance (0–1); page and snippet help you verify. In strict mode with no match, sources is [] and the bot declines.
- meta.grounded: how well the answer is backed by the cited documents: supported, partial, or unsupported (with grounded_score, the fraction of the answer supported). It reflects whether the answer is true to the sources, not just whether similar text was found. In strict mode an unsupported answer is automatically replaced by a refusal and its sources cleared. It is null when there were no documents to check against.
- meta.self_ingested: true only in learning mode when the answer was saved.
- meta.correlation_id: quote this in support requests (also returned as the X-Correlation-Id header).
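The field descriptions above can be condensed into a single label for display or logging. This is a rough sketch, not an official mapping: confidence_label and the label strings are hypothetical, and it assumes that an empty sources list in strict mode always indicates a refusal.

```python
from typing import Any, Dict


def confidence_label(response: Dict[str, Any]) -> str:
    """Hypothetical helper: map a chat response to a coarse confidence label."""
    meta = response.get("meta") or {}
    sources = response.get("sources") or []
    grounded = meta.get("grounded")

    if meta.get("mode") == "strict" and not sources:
        # Assumption: strict mode with no sources means the bot declined.
        return "refused"
    if grounded is None:
        # null: there were no documents to check the answer against.
        return "unchecked"
    if grounded == "supported":
        return "high"
    if grounded == "partial":
        return "medium"
    # "unsupported", or any value this sketch does not recognize.
    return "low"
```

Applied to the sample response shown earlier (strict mode, one source, grounded set to supported), the function returns "high".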
Verify your result
- Verify: For an on-topic question with ingested docs, answer is non-empty and sources[] is populated.
- Verify: meta.grounded is supported or partial; treat unsupported (open mode) as low confidence.
- Verify: Switching mode per request changes behavior without any server config change.
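The first two checks can be automated as a smoke test against a parsed response. The sketch below is one possible shape for such a test; check_response is a hypothetical name, and the third check (switching mode per request) still needs live requests and is not covered here.

```python
from typing import Any, Dict, List


def check_response(response: Dict[str, Any]) -> List[str]:
    """Hypothetical checker: return a list of problems, empty if all checks pass."""
    problems: List[str] = []

    if not response.get("answer"):
        problems.append("answer is empty")
    if not response.get("sources"):
        problems.append("sources[] is empty")

    grounded = (response.get("meta") or {}).get("grounded")
    if grounded not in ("supported", "partial"):
        problems.append(f"meta.grounded is {grounded!r}")

    return problems
```

An empty list means the response passed both checks; anything else names the field worth inspecting first.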
Common mistakes and fixes
- Empty q → 422 validation error; check errors[] for the offending field.
- Strict mode returns no answer → that is a refusal, not a failure. The question is off-topic for your corpus or the draft was unsupported. Ingest a relevant document or use open. See Trust & citations.
- top_k out of range → must be 1–10; score_threshold must be 0–1.
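The "use open" fix for a strict refusal can also be applied per request. The following sketch shows one way to do that, with several assumptions: ask_with_fallback is a hypothetical helper, send stands for any function that posts a body to the endpoint and returns the parsed JSON, and an empty sources list in strict mode is taken to mean the bot declined.

```python
from typing import Any, Callable, Dict

# A function that posts a request body and returns the parsed JSON response.
Send = Callable[[Dict[str, Any]], Dict[str, Any]]


def ask_with_fallback(send: Send, q: str) -> Dict[str, Any]:
    """Ask in strict mode first; if the bot declines, ask again in open mode."""
    first = send({"q": q, "mode": "strict"})
    if first.get("sources"):
        # At least one document backs the strict answer, so keep it.
        return first
    # Assumption: empty sources in strict mode means the bot declined.
    return send({"q": q, "mode": "open"})
```

In practice send could wrap the requests.post call shown earlier. Falling back to open gives up the strict guarantee, so this pattern only suits cases where a general-knowledge answer is acceptable; meta.mode in the returned response shows which mode produced it.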
Related next steps
- Stream the same request token-by-token in Streaming (SSE).
- Turn grounded into a confidence badge in Trust & citations.
- Look up status codes in Errors & rate limits.