Security

What the service defends against, and what you must configure to harden a production deployment.

Audience: operators deploying to production. What you will accomplish: configure auth, SSRF, rate limiting, and CORS correctly.

Authentication

API-key auth via FastAPI dependency injection (middlewares/auth.py):

  • DELETE /api/ingest/{doc_id} always requires X-API-Key when API_KEY is set.
  • Other ingest endpoints require it only when REQUIRE_AUTH_FOR_INGEST=true.
  • When API_KEY is empty, auth is skipped (backward-compatible dev mode).

Review endpoints and the operator feedback-list endpoint honor the same dependency.

API_KEY=your-secret-api-key-here
REQUIRE_AUTH_FOR_INGEST=true
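The rules above can be sketched as a pure decision function. This is an illustration only, not the code in middlewares/auth.py: the names auth_required and key_matches are hypothetical, the sketch covers only the ingest endpoints, and the constant-time comparison is an assumption about good practice rather than something the source confirms.

```python
import hmac
from typing import Optional


def auth_required(method: str, path: str, api_key: str,
                  require_auth_for_ingest: bool) -> bool:
    """Return True when a request must present a valid X-API-Key."""
    if not api_key:
        # Empty API_KEY: auth is skipped (backward-compatible dev mode).
        return False
    if method.upper() == "DELETE" and path.startswith("/api/ingest/"):
        # DELETE /api/ingest/{doc_id} always requires the key once API_KEY is set.
        return True
    is_ingest = path == "/api/ingest" or path.startswith("/api/ingest/")
    # Other ingest endpoints require it only when REQUIRE_AUTH_FOR_INGEST=true.
    return is_ingest and require_auth_for_ingest


def key_matches(presented: Optional[str], api_key: str) -> bool:
    """Compare the X-API-Key header value with the configured key."""
    if presented is None:
        return False
    # Assumption: constant-time comparison to avoid leaking key contents via timing.
    return hmac.compare_digest(presented.encode(), api_key.encode())
```

With API_KEY set and REQUIRE_AUTH_FOR_INGEST left false, only the DELETE route is protected in this sketch, which is why the checklist asks for both settings.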

SSRF protection

utils/security.py blocks document downloads from dangerous targets: private IPv4 ranges (10/8, 172.16/12, 192.168/16), loopback (127/8, ::1), link-local (169.254/16, fe80::/10), and cloud metadata endpoints (169.254.169.254, metadata.google.internal). The guard is DNS-aware and does not follow redirects.

ALLOWED_HOSTS controls the allowlist. Setting it to ["*"] allows any public host; private IPs are still blocked.
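A minimal sketch of this kind of check, using only the standard library. It is not the code in utils/security.py: the function names, the METADATA_HOSTS set, and the injectable resolver parameter are assumptions made for illustration.

```python
import ipaddress
import socket
from typing import Callable, List, Sequence

# Hostnames refused outright, regardless of what they resolve to.
METADATA_HOSTS = {"metadata.google.internal"}


def is_blocked_address(addr: str) -> bool:
    """True for private, loopback, and link-local addresses (IPv4 or IPv6)."""
    ip = ipaddress.ip_address(addr)
    # Unwrap IPv4-mapped IPv6 (for example ::ffff:10.0.0.1) before checking.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return ip.is_private or ip.is_loopback or ip.is_link_local


def resolve_all(host: str) -> List[str]:
    """Resolve a hostname to every address it maps to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def is_target_allowed(host: str, allowed_hosts: Sequence[str],
                      resolver: Callable[[str], List[str]] = resolve_all) -> bool:
    host = host.lower().rstrip(".")
    if host in METADATA_HOSTS:
        return False
    if "*" not in allowed_hosts and host not in {h.lower() for h in allowed_hosts}:
        return False
    try:
        addresses = resolver(host)
    except OSError:
        return False  # unresolvable hosts are refused
    # Every resolved address must be public, even when the allowlist is ["*"].
    return bool(addresses) and not any(is_blocked_address(a) for a in addresses)
```

Checking the resolved addresses rather than the hostname is what stops a public-looking name that points at 10.0.0.5. A production guard also has to connect to the address it vetted, otherwise a second DNS lookup could return a different answer.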

Ingest path guard

Uploaded and staged file paths are validated by _validate_ingest_path to prevent traversal outside the intended ingest directory. This is a defense-in-depth control on top of the extension check, the %PDF-header check, and the MAX_FILE_SIZE_MB limit.
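The layered checks could look roughly like the following. This is a sketch under assumptions, not the body of _validate_ingest_path: the function name check_ingest_file, its parameters, and the .pdf extension rule are hypothetical.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF"


def check_ingest_file(candidate: str, ingest_dir: str, max_file_size_mb: int) -> Path:
    """Return the resolved path if it passes every check, else raise ValueError."""
    base = Path(ingest_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before comparing.
    target = (base / candidate).resolve()
    if base not in target.parents:
        raise ValueError("path escapes the ingest directory")
    if target.suffix.lower() != ".pdf":
        raise ValueError("unexpected file extension")
    if not target.is_file():
        raise ValueError("not a regular file")
    if target.stat().st_size > max_file_size_mb * 1024 * 1024:
        raise ValueError("file exceeds MAX_FILE_SIZE_MB")
    with target.open("rb") as handle:
        if handle.read(len(PDF_MAGIC)) != PDF_MAGIC:
            raise ValueError("missing %PDF header")
    return target
```

An absolute candidate such as /etc/passwd replaces the base when joined, so it fails the containment test the same way ../secret.pdf does.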

Proxy-aware rate limiting

Requests are limited to 60 per minute per IP. The limiter is Redis-backed and fails open on Redis errors, favoring availability over strict enforcement. Behind a reverse proxy, configure TRUSTED_PROXIES (CIDR ranges) so the real client IP is extracted from X-Forwarded-For / X-Real-IP instead of the proxy’s IP.

TRUSTED_PROXIES=["10.0.0.0/8", "172.16.0.0/12"]
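To illustrate both behaviors, here is a hedged sketch: client_ip walks X-Forwarded-For from right to left and trusts the headers only when the direct peer is inside TRUSTED_PROXIES, and allow_request fails open when the counter store raises. The function names, the lowercase header mapping, the increment callable standing in for a Redis counter, and the use of OSError as the store failure type are all assumptions, not the service's actual code.

```python
import ipaddress
from typing import Callable, Mapping, Optional, Sequence, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def _parse(addr: str) -> Optional[IPAddress]:
    try:
        return ipaddress.ip_address(addr.strip())
    except ValueError:
        return None


def client_ip(peer_ip: str, headers: Mapping[str, str],
              trusted_proxies: Sequence[str]) -> str:
    """Pick the address to rate limit on. Header names are assumed lowercase."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def trusted(ip: IPAddress) -> bool:
        return any(ip in net for net in networks)

    peer = _parse(peer_ip)
    if peer is None or not trusted(peer):
        return peer_ip  # direct or untrusted peer: forwarding headers are ignored
    # Rightmost entries were appended by our own proxies; skip them.
    for hop in reversed(headers.get("x-forwarded-for", "").split(",")):
        ip = _parse(hop)
        if ip is None:
            break  # empty or malformed entry: stop trusting the header
        if not trusted(ip):
            return str(ip)
    real = _parse(headers.get("x-real-ip", ""))
    return str(real) if real is not None else peer_ip


def allow_request(ip: str, increment: Callable[[str], int], limit: int = 60) -> bool:
    """increment bumps and returns the caller's count for the current minute."""
    try:
        return increment(ip) <= limit
    except OSError:
        return True  # fail open: a store outage must not block traffic
```

Without TRUSTED_PROXIES, every request behind a load balancer carries the balancer's address as peer_ip, so all clients would share one 60-request budget.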

CORS

The default is CORS_ORIGINS=[], which allows no cross-origin access. Production deployments must opt in explicitly.
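Default-deny CORS reduces to an exact-match lookup, sketched below. The helper name cors_headers and the example origin are hypothetical; in the real service this decision is presumably made by the framework's CORS middleware rather than hand-written code.

```python
from typing import Dict, Optional, Sequence


def cors_headers(origin: Optional[str], allowed_origins: Sequence[str]) -> Dict[str, str]:
    """Return CORS response headers for a request, or none if the origin is not listed."""
    if origin is not None and origin in allowed_origins:
        # Echo the exact origin back; never a wildcard.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}  # default deny: the browser blocks the cross-origin read


# With the default CORS_ORIGINS=[] nothing is allowed.
assert cors_headers("https://app.example.com", []) == {}
```

Opting in then means listing each origin in full, scheme and port included, for example CORS_ORIGINS=["https://app.example.com"].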

Startup validation

A Pydantic model_validator ensures the required API keys exist for the chosen LLM_PROVIDER, raising ValueError at startup instead of failing at runtime with opaque errors.
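The fail-fast idea can be shown without dependencies. The sketch below deliberately swaps the Pydantic model_validator for a plain dataclass __post_init__ so it runs on the standard library alone; the Settings shape, the REQUIRED_KEYS mapping, and the provider and key names in it are assumptions, not the service's real configuration.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping, Tuple

# Hypothetical mapping from LLM_PROVIDER value to the keys it needs.
REQUIRED_KEYS: Dict[str, Tuple[str, ...]] = {
    "openai": ("OPENAI_API_KEY",),
    "anthropic": ("ANTHROPIC_API_KEY",),
}


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    secrets: Mapping[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Runs at construction time, i.e. at startup, not on the first LLM call.
        required = REQUIRED_KEYS.get(self.llm_provider)
        if required is None:
            raise ValueError(f"unknown LLM_PROVIDER: {self.llm_provider!r}")
        missing = [name for name in required if not self.secrets.get(name)]
        if missing:
            raise ValueError(
                f"LLM_PROVIDER={self.llm_provider} requires: {', '.join(missing)}"
            )
```

Constructing Settings("openai", {}) raises immediately with the name of the missing key, which is the behavior the startup validator is described as providing.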

Security audit history

Date                        | Score  | Grade | Scope
2026-05-28 (initial)        | 72/100 | C+    | 3 critical, 4 high, 5 medium, 5 low, 5 informational findings
2026-05-28 (post-elevation) | 95/100 | A+    | All critical/high resolved; auth, SSRF, rate limiting, CORS, logging, CI hardened

Critical findings resolved included: pinned numpy/torch versions, upgraded langchain-core (≥1.3.3, CVE-2026-44843), default-deny CORS, auth on DELETE, SSRF blocking, trusted-proxy CIDR support, specific exception handling, and startup validators.

Deployer hardening checklist

Before you go to production

  • Verify: Set a strong API_KEY and REQUIRE_AUTH_FOR_INGEST=true.
  • Verify: Set CORS_ORIGINS to your exact origins — never ["*"].
  • Verify: Restrict ALLOWED_HOSTS to the hosts you ingest from (avoid ["*"] where possible).
  • Verify: Configure TRUSTED_PROXIES with your load-balancer CIDRs so rate limiting sees real client IPs.
  • Verify: Confirm the chosen LLM_PROVIDER key is present so startup validation passes.
  • Verify: Run Chroma embedded (not as a network-exposed server) to keep CVE-2026-45829 mitigated.
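Several of these items can be checked mechanically before a deploy. The following preflight sketch assumes list-valued settings are JSON arrays, as in the TRUSTED_PROXIES example earlier; the function name, the 32-character threshold for a "strong" key, and the wording of the messages are assumptions. The provider key and the Chroma deployment mode are not covered because they cannot be judged from these variables alone.

```python
import ipaddress
import json
from typing import List, Mapping


def _json_list(env: Mapping[str, str], name: str) -> List[str]:
    value = json.loads(env.get(name, "[]"))
    if not isinstance(value, list):
        raise ValueError(f"{name} must be a JSON list")
    return value


def preflight(env: Mapping[str, str]) -> List[str]:
    """Return a list of hardening problems; empty means the checks passed."""
    problems: List[str] = []
    if len(env.get("API_KEY", "")) < 32:  # assumed minimum length
        problems.append("API_KEY is missing or shorter than 32 characters")
    if env.get("REQUIRE_AUTH_FOR_INGEST", "").lower() != "true":
        problems.append("REQUIRE_AUTH_FOR_INGEST is not true")
    if "*" in _json_list(env, "CORS_ORIGINS"):
        problems.append('CORS_ORIGINS contains "*"')
    if "*" in _json_list(env, "ALLOWED_HOSTS"):
        problems.append('ALLOWED_HOSTS contains "*"')
    proxies = _json_list(env, "TRUSTED_PROXIES")
    if not proxies:
        problems.append("TRUSTED_PROXIES is empty (fine only without a proxy)")
    for cidr in proxies:
        try:
            ipaddress.ip_network(cidr)
        except ValueError:
            problems.append(f"TRUSTED_PROXIES entry is not a valid CIDR: {cidr}")
    return problems
```

Calling preflight(os.environ) in a deploy script and aborting on a non-empty result is one way to use it.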

Common failure modes