Chat modes

The chat mode is the single biggest behavioral switch in the system. CHAT_MODE sets the server default; clients can override it per request with the mode field.
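The precedence described above (per-request mode first, then the CHAT_MODE default) can be sketched as follows. This is a minimal illustration, not the server's actual code: the function name resolve_mode, the dict-shaped request body, and the ValueError on unknown modes are assumptions. The fallback to strict when CHAT_MODE is unset follows the "strict (default)" entry in the mode table.

```python
import os

# The four mode names come from the mode table on this page.
VALID_MODES = {"strict", "open", "learning", "learning_review"}


def resolve_mode(request_body, env=os.environ):
    """Return the effective chat mode for one request (hypothetical helper).

    A non-empty "mode" field in the request wins; otherwise the
    CHAT_MODE server default applies, falling back to "strict".
    """
    server_default = env.get("CHAT_MODE", "strict")
    mode = request_body.get("mode") or server_default
    if mode not in VALID_MODES:
        # Assumption: unknown modes are rejected rather than silently ignored.
        raise ValueError(f"unknown chat mode: {mode!r}")
    return mode
```

For example, resolve_mode({"mode": "learning"}, {"CHAT_MODE": "open"}) picks the request's learning over the server's open.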

Audience: anyone choosing how the assistant should behave. What you will accomplish: pick the right mode and predict what it does when no document matches.

The four modes

Mode	Behavior	When no docs match	Self-ingest	Use case
strict (default)	Knowledge-base-only	Refuses: “I don’t have information…”	No	Legal, medical, regulated domains
open	Free interaction	Uses general knowledge, honest about provenance	No	General assistants, brainstorming
learning	Free interaction + growing KB	Synthesizes answer, embeds immediately into ChromaDB	Yes (≥50 chars, no docs found)	Knowledge-building, research assistants
learning_review	Same as learning, human-gated	Synthesizes answer, queues for review (not embedded)	Queued for approval (≥50 chars, no docs found)	Curated KB growth with a moderator in the loop
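As a quick reference, the "When no docs match" and "Self-ingest" columns can be encoded as data. The dictionary layout, the labels "embed" and "queue", and the function name describe_no_match are illustrative assumptions; only the behaviors themselves come from the table. The self-ingest step additionally depends on the length gate described in the next section.

```python
# Hypothetical encoding of the table: does the mode answer without
# documents, and what happens to a synthesized answer afterwards?
NO_MATCH_BEHAVIOR = {
    "strict": {"answers": False, "self_ingest": None},
    "open": {"answers": True, "self_ingest": None},
    "learning": {"answers": True, "self_ingest": "embed"},
    "learning_review": {"answers": True, "self_ingest": "queue"},
}


def describe_no_match(mode):
    """Summarize what a mode does when retrieval returns no documents."""
    behavior = NO_MATCH_BEHAVIOR[mode]
    if not behavior["answers"]:
        return "refuse"
    if behavior["self_ingest"] is None:
        return "answer from general knowledge"
    return f"answer, then {behavior['self_ingest']}"
```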

The learning quality gate

The two learning modes act on a response only when both conditions hold:

No documents matched the question (a genuine knowledge gap), and
the answer is ≥50 characters (SELF_INGEST_MIN_LENGTH, configurable).

Short or trivial responses (“I don’t know”, “ok”) are never ingested.
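The gate above amounts to a single boolean check. The sketch below is a hedged illustration: the function name passes_quality_gate and its parameters are hypothetical, the threshold is passed in rather than read from wherever SELF_INGEST_MIN_LENGTH is actually configured, and the length is measured on the raw answer string (whether the real system strips whitespace first is not stated here).

```python
LEARNING_MODES = {"learning", "learning_review"}


def passes_quality_gate(mode, matched_docs, answer, min_length=50):
    """Return True when a response qualifies for self-ingest.

    min_length mirrors SELF_INGEST_MIN_LENGTH (50 by default).
    """
    if mode not in LEARNING_MODES:
        return False  # strict and open never self-ingest
    if len(matched_docs) > 0:
        return False  # documents matched, so this is not a knowledge gap
    return len(answer) >= min_length
```

A 12-character reply such as "I don't know" fails the length check, so it is never ingested, even in a learning mode with no matching documents.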

learning vs learning_review (two-phase ingest)

In learning, a passing answer is embedded into synthesized_answers immediately.
In learning_review, it is instead queued in Redis for human review. A moderator lists entries (GET /api/v1/review/pending) and approves (embeds into synthesized_answers, making it retrievable) or rejects (discards). This keeps unverified model output out of the vector store until a human signs off.
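The difference between the two modes can be illustrated with in-memory stand-ins. This sketch uses a plain dict in place of the Redis queue and a plain list in place of the synthesized_answers collection; the class ReviewPipeline, its method names, and the UUID entry ids are assumptions made for illustration, not the project's API.

```python
import uuid


class ReviewPipeline:
    """Toy model of immediate vs. human-gated ingest (hypothetical)."""

    def __init__(self):
        self.pending = {}      # stand-in for the Redis review queue
        self.collection = []   # stand-in for synthesized_answers

    def handle(self, mode, question, answer):
        """Route an answer that has already passed the quality gate."""
        entry = {"question": question, "answer": answer}
        if mode == "learning":
            self.collection.append(entry)  # embedded immediately
            return None
        if mode == "learning_review":
            entry_id = str(uuid.uuid4())
            self.pending[entry_id] = entry  # held for a moderator
            return entry_id
        raise ValueError(f"mode {mode!r} does not self-ingest")

    def list_pending(self):
        """Rough analogue of GET /api/v1/review/pending."""
        return dict(self.pending)

    def approve(self, entry_id):
        # Approval moves the entry into the collection, making it retrievable.
        self.collection.append(self.pending.pop(entry_id))

    def reject(self, entry_id):
        # Rejection discards the entry without embedding it.
        self.pending.pop(entry_id)
```

In this model a learning_review answer stays out of the collection until approve is called, which is the property the two-phase design exists to guarantee.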

See the Review workflow for the operator-facing flow.

Verify your result

Verify: You can state what each mode does when no document matches.
Verify: You know self-ingest only fires on a knowledge gap with an answer ≥50 chars.
Verify: You understand learning embeds immediately while learning_review queues for approval.

Common failure modes

Send a mode per request in Chatting.
Operate the human-in-the-loop in Review workflow.
See how the gate and verification interact in Retrieval and Groundedness.
