Security
What the service defends against, and what you must configure to harden a production deployment.
Audience: operators deploying to production. What you will accomplish: configure authentication, SSRF protection, rate limiting, and CORS correctly.
Authentication
API-key auth via FastAPI dependency injection (middlewares/auth.py):
- DELETE /api/ingest/{doc_id} always requires X-API-Key when API_KEY is set.
- Other ingest endpoints require it only when REQUIRE_AUTH_FOR_INGEST=true.
- When API_KEY is empty, auth is skipped (backward-compatible dev mode).
Review endpoints and the operator feedback-list endpoint honor the same dependency.
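
The rules above can be sketched as a small decision function. This is a dependency-free illustration, not the code in middlewares/auth.py: the function names are hypothetical, and the constant-time comparison via `hmac.compare_digest` is an assumption about how the key might be checked.

```python
import hmac
from typing import Optional


def ingest_auth_required(method: str, api_key: str, require_auth_for_ingest: bool) -> bool:
    """Return True when an ingest request must present X-API-Key."""
    if not api_key:
        return False  # empty API_KEY: dev mode, auth is skipped
    if method.upper() == "DELETE":
        return True  # DELETE /api/ingest/{doc_id} always requires the key
    return require_auth_for_ingest  # other ingest endpoints: opt-in


def is_authorized(
    method: str,
    presented_key: Optional[str],
    api_key: str,
    require_auth_for_ingest: bool,
) -> bool:
    """Apply the rules to one request (presented_key is the X-API-Key header)."""
    if not ingest_auth_required(method, api_key, require_auth_for_ingest):
        return True
    if presented_key is None:
        return False
    # Assumption: compare in constant time to avoid leaking key prefixes.
    return hmac.compare_digest(presented_key.encode(), api_key.encode())
```

In a FastAPI service, a check like this would typically live inside a dependency that raises an HTTP 401 or 403 when it returns False.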
API_KEY=your-secret-api-key-here
REQUIRE_AUTH_FOR_INGEST=true
SSRF protection
utils/security.py blocks document downloads from dangerous targets: private IP ranges
(10/8, 172.16/12, 192.168/16), loopback (127/8, ::1), link-local (169.254/16, fe80::/10),
and cloud metadata endpoints (169.254.169.254, metadata.google.internal). The guard is
DNS-aware and does not follow redirects.
ALLOWED_HOSTS controls the allowlist. ["*"] allows any public host (private IPs are
still blocked).
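
A minimal sketch of a DNS-aware check of this kind, using only the standard library. It is not the code in utils/security.py: the function names, the `METADATA_HOSTS` set, and the injectable `resolver` parameter are hypothetical and exist so the logic can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List

# Hostnames refused outright, before any DNS lookup (illustrative set).
METADATA_HOSTS = {"metadata.google.internal"}


def is_blocked_ip(ip_text: str) -> bool:
    """True for private, loopback, and link-local addresses."""
    ip = ipaddress.ip_address(ip_text.split("%")[0])  # drop any IPv6 scope id
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # treat ::ffff:10.0.0.1 like 10.0.0.1
    return ip.is_private or ip.is_loopback or ip.is_link_local


def resolve_host(host: str) -> List[str]:
    """Return every address the host resolves to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def is_host_allowed(
    host: str,
    allowed_hosts: Iterable[str],
    resolver: Callable[[str], List[str]] = resolve_host,
) -> bool:
    host = host.lower().rstrip(".")
    allowed = set(allowed_hosts)
    if host in METADATA_HOSTS:
        return False
    if "*" not in allowed and host not in allowed:
        return False
    try:
        addresses = resolver(host)
    except OSError:  # includes socket.gaierror
        return False
    # Reject if ANY resolved address is internal, even with ["*"].
    return bool(addresses) and not any(is_blocked_ip(a) for a in addresses)
```

Checking every resolved address, rather than only the first, is what stops a public hostname that also resolves to an internal address. The HTTP client that performs the download would still need redirects disabled, since a redirect target never passes through this check.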
Ingest path guard
Uploaded/staged file paths are validated by _validate_ingest_path to prevent traversal
outside the intended ingest directory — a defense-in-depth control on top of extension and
%PDF-header checks and MAX_FILE_SIZE_MB.
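
One common way to implement a traversal guard like this is to resolve the candidate path and confirm it still sits under the ingest directory. The sketch below shows that technique; `check_inside_ingest_dir` is a hypothetical name, and the real _validate_ingest_path may differ in signature and behavior.

```python
from pathlib import Path


def check_inside_ingest_dir(candidate: str, ingest_dir: str) -> Path:
    """Return the resolved path, or raise ValueError if it escapes ingest_dir."""
    base = Path(ingest_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison below sees the path the OS would actually open.
    resolved = (base / candidate).resolve()
    try:
        resolved.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes ingest directory: {candidate!r}") from None
    return resolved
```

An absolute `candidate` replaces `base` when joined, so it is rejected by the same `relative_to` check as a `../` sequence.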
Proxy-aware rate limiting
Requests are limited to 60 per minute per client IP. The limiter is Redis-backed and fails
open on Redis errors, favoring availability over strict enforcement. Behind a reverse
proxy, configure TRUSTED_PROXIES (CIDR ranges) so the real client IP is extracted from
X-Forwarded-For / X-Real-IP instead of the proxy’s IP.
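
The two behaviors described above, trusted-proxy IP extraction and failing open, can be sketched as follows. This is an illustration under assumptions, not the service's limiter: `client_ip`, `allow_request`, and the `count_for` callable (standing in for a Redis-backed per-minute counter) are hypothetical, and only X-Forwarded-For is handled.

```python
import ipaddress
from typing import Callable, Iterable, Optional


def client_ip(peer_ip: str, forwarded_for: Optional[str], trusted_proxies: Iterable[str]) -> str:
    """Pick the rate-limit key: the right-most untrusted hop, else the socket peer."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def trusted(ip_text: str) -> bool:
        try:
            ip = ipaddress.ip_address(ip_text)
        except ValueError:
            return False
        return any(ip in net for net in networks)

    # Only believe the header when the direct peer is a known proxy.
    if not forwarded_for or not trusted(peer_ip):
        return peer_ip
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):  # walk from the nearest proxy outward
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return peer_ip  # malformed header: fall back to the socket peer
        if not trusted(hop):
            return hop
    return peer_ip


def allow_request(count_for: Callable[[str], int], ip: str, limit: int = 60) -> bool:
    """count_for increments and returns this minute's count for ip."""
    try:
        return count_for(ip) <= limit
    except Exception:
        return True  # fail open: a counter outage must not block traffic
```

Walking the header from the right means entries a client prepends itself are ignored, because only hops added by trusted proxies are skipped.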
TRUSTED_PROXIES=["10.0.0.0/8", "172.16.0.0/12"]
CORS
The default is CORS_ORIGINS=[], which allows no cross-origin access. Production deployments must explicitly opt in.
Startup validation
A Pydantic model_validator ensures the required API keys exist for the chosen
LLM_PROVIDER, raising ValueError at startup instead of failing at runtime with opaque
errors.
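
The fail-fast idea can be shown without Pydantic. The sketch below swaps the model_validator for a standard-library dataclass `__post_init__` hook so it runs with no dependencies; the provider names and the `REQUIRED_KEY` mapping are hypothetical, and rejecting unknown providers is an added assumption.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical provider -> required variable mapping; the service's
# actual provider names and key variables may differ.
REQUIRED_KEY = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


@dataclass
class Settings:
    llm_provider: str
    env: Mapping[str, str]

    def __post_init__(self) -> None:
        # Runs at construction time, i.e. at startup, not on first LLM call.
        required = REQUIRED_KEY.get(self.llm_provider)
        if required is None:
            raise ValueError(f"unknown LLM_PROVIDER: {self.llm_provider!r}")
        if not self.env.get(required):
            raise ValueError(
                f"LLM_PROVIDER={self.llm_provider} requires {required} to be set"
            )
```

Constructing `Settings("openai", os.environ)` once at process start would surface a missing key immediately, with a message that names the variable to set.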
Security audit history
| Date | Score | Grade | Findings and outcome |
|---|---|---|---|
| 2026-05-28 (initial) | 72/100 | C+ | 3 critical, 4 high, 5 medium, 5 low, 5 informational findings |
| 2026-05-28 (post-elevation) | 95/100 | A+ | All critical/high resolved; auth, SSRF, rate limiting, CORS, logging, CI hardened |
Remediation of the critical findings included pinning numpy/torch versions, upgrading langchain-core (≥1.3.3, CVE-2026-44843), default-deny CORS, auth on DELETE, SSRF blocking, trusted-proxy CIDR support, specific exception handling, and startup validators.
Deployer hardening checklist
Before you go to production
- Verify: Set a strong API_KEY and REQUIRE_AUTH_FOR_INGEST=true.
- Verify: Set CORS_ORIGINS to your exact origins, never ["*"].
- Verify: Restrict ALLOWED_HOSTS to the hosts you ingest from (avoid ["*"] where possible).
- Verify: Configure TRUSTED_PROXIES with your load-balancer CIDRs so rate limiting sees real client IPs.
- Verify: Confirm the chosen LLM_PROVIDER key is present so startup validation passes.
- Verify: Run Chroma embedded (not as a network-exposed server) to keep CVE-2026-45829 mitigated.
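
Several of the checklist items can be checked mechanically before a deploy. The following is a hypothetical preflight helper, not part of the service. It assumes list settings are written as JSON arrays (as in the TRUSTED_PROXIES example), and the 16-character minimum for API_KEY is an arbitrary threshold chosen for illustration. The LLM_PROVIDER key and the Chroma deployment mode are not covered.

```python
import ipaddress
import json
from typing import List, Mapping


def _json_list(env: Mapping[str, str], name: str) -> list:
    value = json.loads(env.get(name, "[]"))  # JSONDecodeError is a ValueError
    if not isinstance(value, list):
        raise ValueError(f"{name} must be a JSON list")
    return value


def preflight(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if len(env.get("API_KEY", "")) < 16:  # threshold is an assumption
        problems.append("API_KEY is unset or shorter than 16 characters")
    if env.get("REQUIRE_AUTH_FOR_INGEST", "").strip().lower() != "true":
        problems.append("REQUIRE_AUTH_FOR_INGEST is not true")
    try:
        if "*" in _json_list(env, "CORS_ORIGINS"):
            problems.append('CORS_ORIGINS contains "*"')
        if "*" in _json_list(env, "ALLOWED_HOSTS"):
            problems.append('ALLOWED_HOSTS contains "*"')
        for cidr in _json_list(env, "TRUSTED_PROXIES"):
            ipaddress.ip_network(str(cidr))  # raises ValueError on a bad CIDR
    except ValueError as exc:
        problems.append(f"invalid list setting: {exc}")
    return problems
```

Running `preflight(os.environ)` in a deploy pipeline and failing the job on a non-empty result is one way to keep the checklist from drifting.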
Common failure modes
Related next steps
- See auth in action for ingest in Ingesting documents.
- Configure logging and tracing in Observability.
- Full variable reference in Configuration.