Configuration

Every environment variable, with its real default and purpose, grouped to match .env.example. Copy .env.example to .env and override what you need; see Installation.

Audience: operators configuring the service. What you will accomplish: set each variable with confidence.

LLM

| Variable | Default | Purpose |
| --- | --- | --- |
| LLM_PROVIDER | openai | Provider: openai, anthropic, google, groq, ollama, openrouter, together, deepseek, mistral, vllm, lmstudio, llamacpp. |
| LLM_MODEL | gpt-4o-mini | Model name for the chosen provider. |
| LLM_BASE_URL | (empty) | Override base URL for OpenAI-compatible endpoints (Ollama, OpenRouter, Together, LM Studio, …). |
| OPENAI_API_KEY | your_openai_api_key_here | OpenAI key (also reused by several OpenAI-compatible providers with LLM_BASE_URL). |
| ANTHROPIC_API_KEY | (empty) | Anthropic key (native provider). |
| GOOGLE_API_KEY | (empty) | Google Gemini key (native provider). |
| GROQ_API_KEY | (empty) | Groq key. |

See LLM providers.
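
A quick pre-flight check can catch a provider whose key is still empty or still set to the placeholder. The sketch below is illustrative only: `missing_llm_key` and `KEY_FOR_PROVIDER` are hypothetical names, and the mapping covers only the four providers whose key variable is listed in the table above. Which key the remaining providers read is not stated here, so the sketch does not check them.

```python
# Provider -> key variable, taken from the table above. Other providers are
# left out on purpose: this page does not say which key they read.
KEY_FOR_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
}
PLACEHOLDER = "your_openai_api_key_here"  # default shipped for OPENAI_API_KEY


def missing_llm_key(env):
    """Return the key variable that still needs a value, or None."""
    provider = env.get("LLM_PROVIDER", "openai").strip().lower()
    key_var = KEY_FOR_PROVIDER.get(provider)
    if key_var is None:
        return None  # mapping unknown: nothing to check in this sketch
    value = env.get(key_var, "").strip()
    if not value or value == PLACEHOLDER:
        return key_var
    return None
```

Calling `missing_llm_key(os.environ)` after loading your .env returns the name of the variable to fill in, or `None` when the selected provider's key looks set.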

Embeddings

| Variable | Default | Purpose |
| --- | --- | --- |
| EMBEDDING_PROVIDER | openai | openai, fastembed (ONNX, ~50MB, zero CVEs), or huggingface (torch). |
| EMBEDDING_MODEL | text-embedding-3-small | Model for the chosen provider. Changing it requires re-ingesting. |

See Embeddings.

Redis

| Variable | Default | Purpose |
| --- | --- | --- |
| REDIS_HOST | localhost | Redis hostname. |
| REDIS_PORT | 6379 | Redis port. |
| REDIS_PASSWORD | (empty) | Redis password (if required). |
| REDIS_TTL_SECONDS | 86400 | TTL for conversation memory keys. |
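
If you need to point another tool at the same Redis instance, the three connection variables can be combined into a standard `redis://` URL. This is a minimal sketch, not the service's own connection code: `redis_url` is a hypothetical helper, and database index 0 is an assumption.

```python
from urllib.parse import quote


def redis_url(env):
    """Build a redis:// URL from REDIS_HOST, REDIS_PORT and REDIS_PASSWORD."""
    host = env.get("REDIS_HOST", "localhost")
    port = int(env.get("REDIS_PORT", "6379"))
    password = env.get("REDIS_PASSWORD", "")
    # Percent-encode the password so characters like "@" or "/" stay valid.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"redis://{auth}{host}:{port}/0"  # "/0" is an assumed database index
```

With the defaults this yields `redis://localhost:6379/0`. The default REDIS_TTL_SECONDS of 86400 is 24 hours.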

Vector DB (ChromaDB)

| Variable | Default | Purpose |
| --- | --- | --- |
| CHROMA_PERSIST_DIR | ./chroma_db | On-disk persistence directory for ChromaDB. |
| CHROMA_COLLECTION | policies | Primary document collection name. |

Retrieval

| Variable | Default | Purpose |
| --- | --- | --- |
| RETRIEVAL_SCORE_THRESHOLD | 0.3 | Minimum cosine similarity for the relevance gate. Raise to 0.7 for stricter grounding. |
| RETRIEVAL_STRATEGY | mmr | mmr (default), hybrid (dense + BM25 via RRF), or hybrid_rerank (reranker integration point). |

See Retrieval.
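
To make the two settings concrete, here is a small sketch of a similarity gate and of reciprocal rank fusion (RRF), the technique the hybrid strategy names for merging dense and BM25 results. The function names are hypothetical, and `k=60` is the constant commonly used with RRF, not a value confirmed for this service.

```python
def passes_gate(scores, threshold=0.3):
    """Keep only hits whose cosine similarity reaches the threshold."""
    return [s for s in scores if s >= threshold]


def rrf(rankings, k=60):
    """Fuse ranked lists of document ids with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(fused, key=lambda d: (-fused[d], d))
```

For example, `rrf([["a", "b", "c"], ["b", "d", "a"]])` ranks `b` first because it sits near the top of both lists, while `passes_gate([0.25, 0.3, 0.72], threshold=0.7)` keeps only the 0.72 hit.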

Query rewrite

| Variable | Default | Purpose |
| --- | --- | --- |
| QUERY_REWRITE_ENABLED | true | Condense multi-turn follow-ups into a standalone search query before retrieval (skipped on the first turn). |

Groundedness

| Variable | Default | Purpose |
| --- | --- | --- |
| GROUNDEDNESS_ENABLED | true | Verify the answer is supported by retrieved chunks; expose meta.grounded. |
| GROUNDEDNESS_MODE | heuristic | heuristic (no extra LLM call) or llm (JSON judge). |
| GROUNDEDNESS_MIN_SCORE | 0.5 | Fraction of answer sentences that must be supported. |
| STRICT_REFUSE_ON_UNGROUNDED | true | In strict mode, refuse an unsupported answer and clear its sources. |

See Groundedness.
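
The sketch below illustrates what "fraction of answer sentences that must be supported" means in practice. It is not the service's heuristic: the word-overlap test, the `overlap=0.6` cut-off, and both function names are assumptions chosen only to show how GROUNDEDNESS_MIN_SCORE acts as a threshold on a per-sentence fraction.

```python
import re


def _tokens(text):
    """Lowercased alphanumeric words in the text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounded_fraction(answer, chunks, overlap=0.6):
    """Fraction of answer sentences whose words mostly appear in one chunk."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0
    chunk_tokens = [_tokens(c) for c in chunks]
    supported = 0
    for sentence in sentences:
        words = _tokens(sentence)
        # A sentence counts as supported if enough of its words occur in a chunk.
        if words and any(len(words & ct) / len(words) >= overlap for ct in chunk_tokens):
            supported += 1
    return supported / len(sentences)


def is_grounded(answer, chunks, min_score=0.5):
    """Apply the GROUNDEDNESS_MIN_SCORE-style threshold to the fraction."""
    return grounded_fraction(answer, chunks) >= min_score
```

With one supported sentence out of two, the fraction is exactly 0.5, which passes at the default threshold and fails if you raise GROUNDEDNESS_MIN_SCORE above it.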

Chat mode / self-ingest

| Variable | Default | Purpose |
| --- | --- | --- |
| CHAT_MODE | strict | strict, open, learning, or learning_review (server default; clients can override per request). |
| SELF_INGEST_MIN_LENGTH | 50 | Minimum answer length (chars) required for auto-ingest in learning modes. |

See Chat modes.

Resilience

| Variable | Default | Purpose |
| --- | --- | --- |
| PROVIDER_MAX_RETRIES | 3 | Retries for transient LLM/embedding failures. |
| PROVIDER_RETRY_BASE_DELAY | 0.5 | Seconds; exponential backoff base. |
| CIRCUIT_BREAKER_ENABLED | true | Enable the in-process circuit breaker. |
| CB_FAILURE_THRESHOLD | 5 | Consecutive failures before the circuit opens. |
| CB_RESET_SECONDS | 30 | Cool-down before a half-open trial call. |
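
To see how these five values interact, consider the following sketch of exponential backoff and a consecutive-failure circuit breaker. It assumes a plain doubling schedule with no jitter and a simplified half-open state; the service's actual schedule and breaker may differ, and `backoff_delays` and `CircuitBreaker` are hypothetical names.

```python
import time


def backoff_delays(max_retries=3, base_delay=0.5):
    """Delay in seconds before each retry: base, 2*base, 4*base, ..."""
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]


class CircuitBreaker:
    """Opens after N consecutive failures, allows a trial after a cool-down."""

    def __init__(self, failure_threshold=5, reset_seconds=30, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        """True if a call may go through (closed, or cool-down has elapsed)."""
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.reset_seconds

    def record_success(self):
        """A success closes the circuit and resets the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """Count a failure; at the threshold, (re)open and restart the cool-down."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

Under these assumptions the defaults give waits of 0.5 s, 1.0 s, and 2.0 s between retries, and a provider that fails five times in a row is skipped for 30 seconds before one trial call is let through.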

Persona / branding

| Variable | Default | Purpose |
| --- | --- | --- |
| ASSISTANT_NAME | our company | Assistant identity used in prompts (default reproduces the original copy). |
| KNOWLEDGE_DOMAIN | (empty) | Optional domain framing, e.g. “returns & shipping policy”. Empty adds none. |
| ESCALATION_MESSAGE | Please contact support. | Refusal/escalation copy. |

Guardrails

| Variable | Default | Purpose |
| --- | --- | --- |
| GUARDRAILS_ENABLED | true | Master switch for input/output guardrails. |
| GUARDRAILS_BLOCK_INJECTION | true | Reject likely prompt-injection / jailbreak inputs (returns 400). |
| GUARDRAILS_MASK_PII | false | Mask emails / phone / card-like digit runs in output. |
| GUARDRAILS_MAX_ANSWER_CHARS | 4000 | Hard cap on answer length; 0 disables the cap. |

Ingest

| Variable | Default | Purpose |
| --- | --- | --- |
| MAX_FILE_SIZE_MB | 50 | Maximum download/upload size. |
| DOWNLOAD_TIMEOUT_SECONDS | 120 | Timeout for URL document downloads. |
| INGEST_MODE | inline | inline (in-process BackgroundTasks) or queue (Redis-backed durable worker). |
| INGEST_MAX_ATTEMPTS | 3 | Retry attempts for queued ingest jobs. |
| INGEST_INCOMING_DIR | ./ingest_incoming | Shared staging dir for uploads in queue mode (must be a shared volume). |

See Deployment.

App / Logging

| Variable | Default | Purpose |
| --- | --- | --- |
| DEBUG | false | Debug mode. |
| LOG_LEVEL | INFO | Logging level. |
| LOG_FORMAT | text | text (default) or json for log aggregators (Datadog/CloudWatch/ELK). |
| CORS_ORIGINS | [] | Allowed cross-origin origins. Default deny; never use ["*"] in production. |

See Observability.

Security

| Variable | Default | Purpose |
| --- | --- | --- |
| API_KEY | (empty) | API key protecting ingest/review endpoints. Empty skips auth (dev mode); DELETE always requires it when set. |
| REQUIRE_AUTH_FOR_INGEST | false | Require the key on non-DELETE ingest/review/feedback-list endpoints. |
| TRUSTED_PROXIES | [] | CIDR ranges of trusted proxies for real-client-IP extraction in rate limiting. |
| ALLOWED_HOSTS | ["*"] | SSRF allowlist for download hosts; ["*"] allows any public host (private IPs always blocked). |

See Security.
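
The sketch below shows one common way a trusted-proxy list and an SSRF allowlist are applied, to help you choose values for TRUSTED_PROXIES and ALLOWED_HOSTS. It is written under assumptions: that the forwarded address arrives in an `X-Forwarded-For`-style comma-separated header, that the caller has already resolved the download host to an IP, and that "private" means what Python's `ipaddress.is_private` reports. Both function names are hypothetical, and the service's own logic may differ.

```python
import ipaddress


def client_ip(peer_ip, forwarded_for, trusted_proxies):
    """Pick the real client IP, trusting the forwarded header only from known proxies."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def trusted(ip):
        addr = ipaddress.ip_address(ip)  # raises ValueError if malformed
        return any(addr in net for net in networks)

    if not trusted(peer_ip):
        return peer_ip  # direct connection, or a proxy we do not trust
    # Walk the header right to left, skipping hops that are trusted proxies.
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        try:
            if not trusted(hop):
                return hop
        except ValueError:
            break  # malformed entry: fall back to the peer address
    return peer_ip


def download_allowed(host, resolved_ip, allowed_hosts):
    """Allow a download only for allowlisted hosts that resolve to a non-private IP."""
    if ipaddress.ip_address(resolved_ip).is_private:
        return False  # private addresses are blocked regardless of the allowlist
    return "*" in allowed_hosts or host in allowed_hosts
```

With an empty TRUSTED_PROXIES list the forwarded header is ignored entirely, so behind a load balancer every request would appear to come from the balancer's address. That is why the list needs to be set explicitly in production.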

Verify your result

  • Verify: Your .env sets LLM_PROVIDER and the matching API key.
  • Verify: Production sets API_KEY, CORS_ORIGINS, and TRUSTED_PROXIES explicitly.
  • Verify: You re-ingest documents whenever EMBEDDING_MODEL changes.
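
The production item in this checklist can be partly automated. The following is a sketch under assumptions: `production_gaps` is a hypothetical helper, and it assumes list-valued variables are written as JSON arrays, as their defaults of `[]` suggest.

```python
import json


def production_gaps(env):
    """Return human-readable problems with the production-critical variables."""
    gaps = []
    if not env.get("API_KEY", "").strip():
        gaps.append("API_KEY is empty")
    for name in ("CORS_ORIGINS", "TRUSTED_PROXIES"):
        if name not in env:
            gaps.append(f"{name} is not set explicitly")
    try:
        # Assumption: the value is a JSON array such as ["https://app.example.com"].
        origins = json.loads(env.get("CORS_ORIGINS", "[]"))
    except json.JSONDecodeError:
        origins = None
        gaps.append("CORS_ORIGINS is not valid JSON")
    if isinstance(origins, list) and "*" in origins:
        gaps.append('CORS_ORIGINS contains "*"')
    return gaps
```

Running `production_gaps(os.environ)` in a deploy step and failing when the list is non-empty is one way to keep a dev-mode configuration from reaching production.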