Configuration

This page lists every environment variable with its actual default and purpose, grouped to match `.env.example`. Copy that file to `.env` and override what you need; see Installation.

Audience: operators configuring the service. What you will accomplish: set each variable with confidence.

LLM

Variable	Default	Purpose
`LLM_PROVIDER`	`openai`	Provider: `openai`, `anthropic`, `google`, `groq`, `ollama`, `openrouter`, `together`, `deepseek`, `mistral`, `vllm`, `lmstudio`, `llamacpp`.
`LLM_MODEL`	`gpt-4o-mini`	Model name for the chosen provider.
`LLM_BASE_URL`	(empty)	Override base URL for OpenAI-compatible endpoints (Ollama, OpenRouter, Together, LM Studio, …).
`OPENAI_API_KEY`	`your_openai_api_key_here`	OpenAI key (also reused by several OpenAI-compatible providers with `LLM_BASE_URL`).
`ANTHROPIC_API_KEY`	(empty)	Anthropic key (native provider).
`GOOGLE_API_KEY`	(empty)	Google Gemini key (native provider).
`GROQ_API_KEY`	(empty)	Groq key.

See LLM providers.
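
As a pre-flight check, the pairing between `LLM_PROVIDER` and its key can be sketched as follows. This is a minimal illustration, not the service's own validation: `missing_llm_key` is a hypothetical helper, and it checks only the four key variables named in the table. Providers without a dedicated key variable (for example local ones) are skipped rather than guessed at.

```python
from typing import Mapping, Optional

# Key variable per provider, taken from the table above. Providers not listed
# here are not checked, because the table names no dedicated key for them.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
}

# Placeholder default listed for OPENAI_API_KEY; treated here as "not set".
PLACEHOLDER = "your_openai_api_key_here"


def missing_llm_key(env: Mapping[str, str]) -> Optional[str]:
    """Return the name of the key variable that still needs a value, or None."""
    provider = env.get("LLM_PROVIDER", "openai").strip().lower()
    key_var = PROVIDER_KEY_VARS.get(provider)
    if key_var is None:
        return None  # no dedicated key variable known for this provider
    value = env.get(key_var, "").strip()
    if not value or value == PLACEHOLDER:
        return key_var
    return None
```

Calling `missing_llm_key(os.environ)` at startup (after `import os`) returns the variable to fill in, or `None` when nothing is missing.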

Embeddings

Variable	Default	Purpose
`EMBEDDING_PROVIDER`	`openai`	`openai`, `fastembed` (ONNX, ~50 MB, zero CVEs), or `huggingface` (torch).
`EMBEDDING_MODEL`	`text-embedding-3-small`	Model for the chosen provider. Changing it requires re-ingesting all documents.

See Embeddings.

Redis

Variable	Default	Purpose
`REDIS_HOST`	`localhost`	Redis hostname.
`REDIS_PORT`	`6379`	Redis port.
`REDIS_PASSWORD`	(empty)	Redis password (if required).
`REDIS_TTL_SECONDS`	`86400`	TTL for conversation memory keys.

Vector DB (ChromaDB)

Variable	Default	Purpose
`CHROMA_PERSIST_DIR`	`./chroma_db`	On-disk persistence directory for ChromaDB.
`CHROMA_COLLECTION`	`policies`	Primary document collection name.

Retrieval

Variable	Default	Purpose
`RETRIEVAL_SCORE_THRESHOLD`	`0.3`	Minimum cosine similarity for the relevance gate. Raise to `0.7` for stricter grounding.
`RETRIEVAL_STRATEGY`	`mmr`	`mmr` (default), `hybrid` (dense + BM25 via RRF), or `hybrid_rerank` (reranker integration point).

See Retrieval.
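
To make the two ideas in this table concrete, here is a minimal sketch of a similarity threshold and of reciprocal rank fusion (RRF). Both functions are hypothetical illustrations, not the service's code. Whether the real gate filters each chunk or only inspects the best hit, whether the comparison is inclusive, and which RRF constant is used are all assumptions here (`k = 60` is the conventional value).

```python
from typing import Dict, List, Sequence, Tuple


def relevance_gate(
    hits: Sequence[Tuple[str, float]], threshold: float = 0.3
) -> List[Tuple[str, float]]:
    """Keep (doc_id, cosine_similarity) pairs at or above the threshold."""
    return [(doc_id, score) for doc_id, score in hits if score >= threshold]


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties break on doc_id for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

With a dense ranking `["a", "b", "c"]` and a BM25 ranking `["b", "d", "a"]`, the fused order favors `b` and `a`, which appear in both lists, over documents that appear in only one.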

Query rewrite

Variable	Default	Purpose
`QUERY_REWRITE_ENABLED`	`true`	Condense multi-turn follow-ups into a standalone search query before retrieval (skipped on the first turn).

Groundedness

Variable	Default	Purpose
`GROUNDEDNESS_ENABLED`	`true`	Verify the answer is supported by retrieved chunks; expose `meta.grounded`.
`GROUNDEDNESS_MODE`	`heuristic`	`heuristic` (no extra LLM call) or `llm` (JSON judge).
`GROUNDEDNESS_MIN_SCORE`	`0.5`	Fraction of answer sentences that must be supported.
`STRICT_REFUSE_ON_UNGROUNDED`	`true`	In strict mode, refuse an unsupported answer and clear its sources.

See Groundedness.
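
The sketch below shows how a score like `GROUNDEDNESS_MIN_SCORE` can be read: the fraction of answer sentences that find support in a retrieved chunk. The token-overlap test is a stand-in chosen for illustration; the actual `heuristic` mode may decide support differently, and `grounded_fraction`, `is_grounded`, and the `overlap` parameter are hypothetical names.

```python
import re
from typing import List


def _tokens(text: str) -> set:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounded_fraction(answer: str, chunks: List[str], overlap: float = 0.6) -> float:
    """Fraction of answer sentences whose tokens mostly appear in a single chunk."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0
    chunk_tokens = [_tokens(c) for c in chunks]
    supported = 0
    for sentence in sentences:
        words = _tokens(sentence)
        # A sentence counts as supported if enough of its tokens occur in one chunk.
        if words and any(len(words & ct) / len(words) >= overlap for ct in chunk_tokens):
            supported += 1
    return supported / len(sentences)


def is_grounded(answer: str, chunks: List[str], min_score: float = 0.5) -> bool:
    """Compare the supported fraction against the minimum score."""
    return grounded_fraction(answer, chunks) >= min_score
```

Under this reading, an answer with two sentences of which one is supported scores `0.5`, which passes the default minimum but would fail a stricter setting such as `0.75`.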

Chat mode / self-ingest

Variable	Default	Purpose
`CHAT_MODE`	`strict`	`strict`, `open`, `learning`, or `learning_review` (server default; clients can override per request).
`SELF_INGEST_MIN_LENGTH`	`50`	Minimum answer length (chars) required for auto-ingest in learning modes.

See Chat modes.

Resilience

Variable	Default	Purpose
`PROVIDER_MAX_RETRIES`	`3`	Retries for transient LLM/embedding failures.
`PROVIDER_RETRY_BASE_DELAY`	`0.5`	Base delay in seconds for exponential backoff.
`CIRCUIT_BREAKER_ENABLED`	`true`	Enable the in-process circuit breaker.
`CB_FAILURE_THRESHOLD`	`5`	Consecutive failures before the circuit opens.
`CB_RESET_SECONDS`	`30`	Cool-down before a half-open trial call.
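
For intuition about how these values interact, here is a simplified sketch of an exponential backoff schedule and a consecutive-failure circuit breaker. It assumes plain doubling without jitter and a single-threaded caller; the service's real retry formula and breaker state machine may differ, and `backoff_delays` and `CircuitBreaker` are hypothetical names.

```python
import time
from typing import Callable, List, Optional


def backoff_delays(max_retries: int = 3, base_delay: float = 0.5) -> List[float]:
    """Delay before each retry: base, 2 * base, 4 * base, ..."""
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]


class CircuitBreaker:
    """Opens after N consecutive failures; allows calls again after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open: calls are allowed once the cool-down has elapsed,
        # until the next success or failure is recorded.
        return self._clock() - self._opened_at >= self.reset_seconds

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed trial
```

With the defaults, the retry delays come out to 0.5 s, 1.0 s, and 2.0 s, and the breaker rejects calls for 30 seconds after the fifth consecutive failure.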

Persona / branding

Variable	Default	Purpose
`ASSISTANT_NAME`	`our company`	Assistant identity used in prompts (default reproduces the original copy).
`KNOWLEDGE_DOMAIN`	(empty)	Optional domain framing, e.g. “returns & shipping policy”. Empty adds none.
`ESCALATION_MESSAGE`	`Please contact support.`	Refusal/escalation copy.

Guardrails

Variable	Default	Purpose
`GUARDRAILS_ENABLED`	`true`	Master switch for input/output guardrails.
`GUARDRAILS_BLOCK_INJECTION`	`true`	Reject likely prompt-injection / jailbreak inputs (returns `400`).
`GUARDRAILS_MASK_PII`	`false`	Mask emails / phone / card-like digit runs in output.
`GUARDRAILS_MAX_ANSWER_CHARS`	`4000`	Hard cap on answer length; `0` disables the cap.
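
As a rough picture of the two output guardrails, the sketch below masks emails and card-like digit runs and applies a length cap where `0` disables truncation. The regular expressions and replacement tokens are assumptions made for illustration; the patterns the service actually uses (including phone-number handling, which is omitted here) are not documented on this page.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def mask_pii(text: str) -> str:
    """Replace emails and card-like digit runs with placeholder tokens."""
    text = EMAIL_RE.sub("[email]", text)
    return CARD_RE.sub("[number]", text)


def cap_answer(text: str, max_chars: int = 4000) -> str:
    """Truncate to max_chars; 0 (or a negative value, by assumption) disables the cap."""
    if max_chars <= 0:
        return text
    return text[:max_chars]
```

Pattern-based masking of this kind is best-effort: it can miss unusual formats and occasionally mask harmless digit runs such as order numbers.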

Ingest

Variable	Default	Purpose
`MAX_FILE_SIZE_MB`	`50`	Maximum download/upload size.
`DOWNLOAD_TIMEOUT_SECONDS`	`120`	Timeout for URL document downloads.
`INGEST_MODE`	`inline`	`inline` (in-process `BackgroundTasks`) or `queue` (Redis-backed durable worker).
`INGEST_MAX_ATTEMPTS`	`3`	Retry attempts for queued ingest jobs.
`INGEST_INCOMING_DIR`	`./ingest_incoming`	Shared staging dir for uploads in queue mode (must be a shared volume).

See Deployment.

App / Logging

Variable	Default	Purpose
`DEBUG`	`false`	Debug mode.
`LOG_LEVEL`	`INFO`	Logging level.
`LOG_FORMAT`	`text`	`text` (default) or `json` for log aggregators (Datadog/CloudWatch/ELK).
`CORS_ORIGINS`	`[]`	Allowed CORS origins. The empty default denies all cross-origin requests; never use `["*"]` in production.

See Observability.

Security

Variable	Default	Purpose
`API_KEY`	(empty)	API key protecting ingest/review endpoints. Empty skips auth (dev mode); `DELETE` always requires it when set.
`REQUIRE_AUTH_FOR_INGEST`	`false`	Require the key on non-DELETE ingest/review/feedback-list endpoints.
`TRUSTED_PROXIES`	`[]`	CIDR ranges of trusted proxies for real-client-IP extraction in rate limiting.
`ALLOWED_HOSTS`	`["*"]`	SSRF allowlist for download hosts; `["*"]` allows any public host (private IPs always blocked).

See Security.
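
The ideas behind `TRUSTED_PROXIES` and the always-on private-IP block can be sketched with Python's standard `ipaddress` module. This is an illustration under stated assumptions, not the service's implementation: `is_blocked_address` and `client_ip` are hypothetical helpers, the forwarded chain is assumed to arrive as an `X-Forwarded-For` style comma-separated string, and a real SSRF check would also need to resolve hostnames before testing the resulting addresses.

```python
import ipaddress
from typing import List, Optional


def is_blocked_address(ip: str) -> bool:
    """True for addresses that are not publicly routable (private, loopback, link-local, ...)."""
    return not ipaddress.ip_address(ip).is_global


def client_ip(peer_ip: str, forwarded_for: Optional[str], trusted_proxies: List[str]) -> str:
    """Trust the forwarded chain only when the direct peer is a trusted proxy."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]
    peer = ipaddress.ip_address(peer_ip)
    if not forwarded_for or not any(peer in net for net in networks):
        return peer_ip
    # Walk the chain right to left; the first hop outside the trusted ranges
    # is taken as the real client.
    for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
        addr = ipaddress.ip_address(hop)
        if not any(addr in net for net in networks):
            return hop
    return peer_ip
```

With an empty `TRUSTED_PROXIES` list the forwarded header is ignored entirely, so rate limiting keys on the direct peer address. That is why the list should be set explicitly when the service runs behind a load balancer.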

Verify your result

Verify: Your `.env` sets `LLM_PROVIDER` and the matching API key.
Verify: Production sets `API_KEY`, `CORS_ORIGINS`, and `TRUSTED_PROXIES` explicitly.
Verify: You re-ingest documents whenever `EMBEDDING_MODEL` changes.
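
The production item in this checklist can be automated with a small script along these lines. It is a sketch only: `production_warnings` is a hypothetical helper, and it assumes list-valued variables are JSON-encoded, as the `[]` and `["*"]` defaults suggest.

```python
import json
from typing import List, Mapping


def production_warnings(env: Mapping[str, str]) -> List[str]:
    """Return human-readable warnings for settings that look unsafe in production."""
    warnings: List[str] = []
    if not env.get("API_KEY", "").strip():
        warnings.append("API_KEY is empty: auth is skipped")
    for name in ("CORS_ORIGINS", "TRUSTED_PROXIES"):
        if name not in env:
            warnings.append(f"{name} is not set explicitly")
    # Assumption: list values are JSON, matching the defaults shown in the tables.
    try:
        origins = json.loads(env.get("CORS_ORIGINS", "[]"))
    except json.JSONDecodeError:
        origins = None
    if not isinstance(origins, list):
        warnings.append("CORS_ORIGINS is not a JSON list")
    elif "*" in origins:
        warnings.append('CORS_ORIGINS contains "*"')
    return warnings
```

Running `production_warnings(os.environ)` (after `import os`) in a deployment check and failing on a non-empty result is one way to keep these settings from regressing.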

Apply these in Installation.
Harden the security-relevant ones in Security.
See the endpoints they affect in the API summary.
