Installation
Stand up a running server so the rest of this guide has something to talk to.
Audience: operators and developers deploying the backend.
What you will accomplish: a server reachable at http://127.0.0.1:8000, started from
source, from Docker, or as a fully-local stack.
Estimated time: 10–20 minutes (longer the first time Docker pulls images or Ollama
pulls models).
Prerequisites
- Python 3.10 (the project targets python=3.10). Miniconda/conda is the recommended environment manager.
- Redis: required for conversation memory, rate limiting, and the ingest registry. The Docker paths include it; for a from-source run you must provide one.
- Docker + Docker Compose if you prefer the container paths.
- An LLM API key for cloud providers (e.g. OPENAI_API_KEY). The fully-local path below needs none.
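Before a from-source run, the first two prerequisites can be checked with a short script. This is a minimal sketch, not part of the project: the function names are hypothetical, and it assumes Redis listens on 127.0.0.1 at its default port 6379. The connection test only shows that something accepts TCP connections there; it does not speak the Redis protocol.

```python
import socket
import sys


def python_version_ok(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.10.x, the version the project targets."""
    return tuple(version_info[:2]) == (3, 10)


def redis_reachable(host: str = "127.0.0.1", port: int = 6379, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Assumption: Redis runs locally on its default port. A successful connect
    only proves a listener exists, not that it is actually Redis.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("Python 3.10:", python_version_ok())
    print("Redis reachable:", redis_reachable())
```

If Redis runs elsewhere, pass its host and port explicitly, for example `redis_reachable("redis.internal", 6380)` (a made-up address for illustration).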
Option A — From source (conda + pip)
Step 1: Create and activate the environment
Use Python 3.10 in a dedicated conda environment so dependencies stay isolated.
conda create -n chat-bot python=3.10
conda activate chat-bot
Step 2: Install dependencies
Install the pinned runtime requirements. The versions are pinned deliberately: they resolve a known numpy/torch incompatibility and a langchain-core CVE.
pip install -r requirements.txt
If it fails: A dependency resolution failure usually means you skipped the pinned requirements.txt. Install from it rather than upgrading packages ad hoc.
Step 3: Create your .env
Copy .env.example to .env and fill in at least your LLM provider and key. See Configuration for every variable.
cp .env.example .env
# edit .env: set LLM_PROVIDER, LLM_MODEL, and the matching API key
Step 4: Run the server
Start Uvicorn with reload for development. The server listens on 127.0.0.1:8000.
uvicorn main:app --reload
If it fails: A ValueError at startup means a required key is missing for the chosen LLM_PROVIDER. The app validates this on boot rather than failing later.
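The boot-time check described for Step 4 can be pictured roughly as follows. This is an illustrative sketch, not the project's actual code: the mapping, the function name, and the provider value `openai` are assumptions. Only `LLM_PROVIDER`, `OPENAI_API_KEY`, and the key-less `ollama` provider appear in this guide.

```python
import os

# Hypothetical mapping from provider name to the environment variable it requires.
# None means the provider needs no API key (as with the local Ollama path).
REQUIRED_KEY_BY_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "ollama": None,
}


def validate_llm_settings(env=os.environ) -> str:
    """Return the normalized provider name, or raise ValueError if settings are incomplete."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in REQUIRED_KEY_BY_PROVIDER:
        raise ValueError(f"Unknown or missing LLM_PROVIDER: {provider!r}")
    key_name = REQUIRED_KEY_BY_PROVIDER[provider]
    if key_name and not env.get(key_name):
        # Fail at startup instead of on the first LLM call.
        raise ValueError(f"{key_name} must be set when LLM_PROVIDER={provider}")
    return provider
```

Running a check like this against your shell environment before `uvicorn main:app --reload` surfaces a missing key without waiting for the server to start.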
Expected result
The terminal shows Uvicorn serving on http://127.0.0.1:8000. A GET /health returns
ok (or degraded if Redis/ChromaDB are unreachable).
Option B — Docker Compose (api + worker + redis)
The cloud compose file runs three services — api, a durable ingest worker
(python -m ingest.worker), and redis — with INGEST_MODE=queue so ingestion survives
API restarts.
docker-compose up --build
If it fails: If the api container exits on boot, check that the LLM key in your environment / .env matches LLM_PROVIDER.
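To confirm that all three services came up, one option is to compare the running services reported by Compose with the expected set. The sketch below assumes the services are named `api`, `worker`, and `redis` in the compose file (the worker's exact service name is a guess based on this guide) and that `docker-compose ps --services --filter status=running` is available in your Compose version. The helper names are hypothetical.

```python
import subprocess

# Assumed service names; adjust to match your compose file.
EXPECTED_SERVICES = {"api", "worker", "redis"}


def missing_services(ps_output: str, expected=EXPECTED_SERVICES) -> set:
    """Return expected services that do not appear in the one-name-per-line output."""
    running = {line.strip() for line in ps_output.splitlines() if line.strip()}
    return set(expected) - running


def running_services_output() -> str:
    """Ask Compose for the names of services that are currently running."""
    result = subprocess.run(
        ["docker-compose", "ps", "--services", "--filter", "status=running"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Compose itself fails
    )
    return result.stdout
```

Calling `missing_services(running_services_output())` from the project directory returns an empty set when everything is up; a result containing `worker` would suggest that queued ingests are not being processed.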
Option C — Fully-local (Ollama + FastEmbed, zero cloud keys)
For air-gapped, privacy-first, or zero-cost deployment, use the local compose file. It
starts Ollama (port 11434, pulls llama3.2 and nomic-embed-text on first start),
Redis (port 6379, AOF persistence), and the API (port 8000) pre-configured for
local operation.
docker-compose -f docker-compose.local.yml up --build
If it fails: First boot is slow while Ollama pulls models. If the API can't reach Ollama, confirm LLM_BASE_URL=http://ollama:11434/v1 inside the compose file.
The local file pre-sets LLM_PROVIDER=ollama, LLM_BASE_URL=http://ollama:11434/v1,
EMBEDDING_PROVIDER=fastembed, and EMBEDDING_MODEL=BAAI/bge-small-en-v1.5 — no cloud
API keys required.
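Because first boot waits on model downloads, it can help to ask Ollama which models it already has. The sketch below queries Ollama's `/api/tags` endpoint, which lists local models, and reports which of the two models named above are still missing. It assumes port 11434 is published to the host at 127.0.0.1; the helper names are hypothetical.

```python
import json
import urllib.request

# Models this guide says the local stack pulls on first start.
REQUIRED_MODELS = ("llama3.2", "nomic-embed-text")


def missing_models(tags_payload: dict, required=REQUIRED_MODELS) -> list:
    """Return the required models absent from an Ollama /api/tags payload."""
    # Ollama reports names with a tag suffix such as "llama3.2:latest";
    # compare on the base name before the colon.
    present = {
        model.get("name", "").split(":", 1)[0]
        for model in tags_payload.get("models", [])
    }
    return [name for name in required if name not in present]


def fetch_tags(base_url: str = "http://127.0.0.1:11434") -> dict:
    """Fetch the list of locally available models from Ollama."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

`missing_models(fetch_tags())` returns an empty list once both pulls have finished. Until then, slow or failing API responses are more likely a download still in progress than a misconfiguration.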
Verify your result
- Verify: curl http://127.0.0.1:8000/health returns ok.
- Verify: GET /ready returns 200 with {"status":"ready"} once Redis and ChromaDB respond.
- Verify: Your chosen LLM_PROVIDER has its required key set (or is a local provider that needs none).
- Verify: For Docker queue mode, the worker container is running alongside api and redis.
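The two HTTP checks above can also be scripted. The following is a minimal sketch using only the standard library. The exact shape of the `/health` body is not specified in this guide, so `health_state` simply looks for the words `ok` or `degraded` anywhere in the response, which is an assumption. All function names are hypothetical.

```python
import json
import re
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:8000"


def http_get(url: str, timeout: float = 5.0):
    """Return (status_code, body_text); HTTP error statuses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


def is_ready(status_code: int, body: str) -> bool:
    """True only for a 200 response whose JSON body has status == "ready"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ready"


def health_state(body: str) -> str:
    """Classify a /health body as "ok", "degraded", or "unknown".

    Assumption: the state appears as a whole word in the body, whether the
    endpoint answers in plain text or JSON.
    """
    text = body.lower()
    if re.search(r"\bdegraded\b", text):
        return "degraded"
    if re.search(r"\bok\b", text):
        return "ok"
    return "unknown"
```

With the server running, `is_ready(*http_get(BASE_URL + "/ready"))` and `health_state(http_get(BASE_URL + "/health")[1])` give the two answers. If nothing is listening on port 8000, `http_get` raises `urllib.error.URLError`, which itself indicates that the server did not start.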
Common failure modes
Related next steps
- Send your first request in the Quickstart.
- Understand every setting in Configuration.
- Plan a real deployment in Deployment and Security.