Installation

Stand up a running server so the rest of this guide has something to talk to.

Audience: operators and developers deploying the backend.
What you will accomplish: a server reachable at http://127.0.0.1:8000, started from source, from Docker, or as a fully-local stack.
Estimated time: 10–20 minutes (longer the first time Docker pulls images or Ollama pulls models).

Prerequisites

Python 3.10 (the project targets python=3.10). Miniconda/conda is the recommended environment manager.
Redis — required for conversation memory, rate limiting, and the ingest registry. The Docker paths include it; for a from-source run you must provide one.
Docker + Docker Compose if you prefer the container paths.
An LLM API key for cloud providers (e.g. OPENAI_API_KEY). The fully-local path below needs none.
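
Before picking an option, it can help to confirm the two prerequisites a from-source run depends on. The script below is a minimal sketch and not part of the project: it checks the interpreter version and sends a Redis PING over a plain socket. The host 127.0.0.1 and port 6379 are assumptions (the Redis defaults); adjust them to match your setup.

```python
import socket
import sys


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.10, the version this guide targets."""
    return tuple(version_info[:2]) == (3, 10)


def redis_responds(host: str = "127.0.0.1", port: int = 6379, timeout: float = 2.0) -> bool:
    """Send an inline PING and report whether the server answers +PONG.

    Host and port are assumptions (Redis defaults), not values read from the project.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("Python 3.10:", python_is_supported())
    print("Redis reachable:", redis_responds())
```

A Redis instance that requires a password answers PING with an error instead of +PONG, so this sketch reports it as not responding even though the server is up.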

Option A — From source (conda + pip)

Step 1: Create and activate the environment
Use Python 3.10 in a dedicated conda environment so dependencies stay isolated.

Terminal

conda create -n chat-bot python=3.10
conda activate chat-bot

Step 2: Install dependencies
Install the pinned runtime requirements. Versions are pinned for a reason: they resolve a known numpy/torch incompatibility and a langchain-core CVE.

Terminal

pip install -r requirements.txt

If it fails: A dependency resolution failure usually means you skipped the pinned requirements.txt; install from it rather than upgrading packages ad hoc.

Step 3: Create your .env
Copy .env.example to .env and fill in at least your LLM provider and key. See Configuration for every variable.

Terminal

cp .env.example .env
# edit .env: set LLM_PROVIDER, LLM_MODEL, and the matching API key

Step 4: Run the server
Start Uvicorn with reload for development. The server listens on 127.0.0.1:8000.

Terminal

uvicorn main:app --reload

If it fails: A ValueError at startup means a required key is missing for the chosen LLM_PROVIDER; the app validates this on boot rather than failing later.
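
To catch a missing key before starting Uvicorn, a pre-flight check along these lines may help. It is a sketch under stated assumptions, not the app's actual validation code: the REQUIRED_KEYS mapping is hypothetical and covers only the two providers this guide names (OPENAI_API_KEY for a cloud provider, and ollama, which needs no key). See Configuration for the real variable list.

```python
from pathlib import Path

# Hypothetical mapping for illustration: provider name -> required variable.
# Only OPENAI_API_KEY and the keyless "ollama" provider are named in this guide.
REQUIRED_KEYS = {"openai": "OPENAI_API_KEY", "ollama": None}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_provider_key(env: dict[str, str]) -> None:
    """Raise ValueError when the chosen provider's key is missing or empty."""
    provider = env.get("LLM_PROVIDER", "").lower()
    if not provider:
        raise ValueError("LLM_PROVIDER is not set")
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"no rule for provider {provider!r}; extend REQUIRED_KEYS")
    required = REQUIRED_KEYS[provider]
    if required and not env.get(required):
        raise ValueError(f"{required} is required when LLM_PROVIDER={provider}")


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        check_provider_key(parse_env(env_file.read_text()))
        print("LLM provider settings look complete")
    else:
        print("No .env found; run: cp .env.example .env")
```

The parser handles only plain KEY=VALUE lines, which is enough for a quick check but not a full dotenv implementation.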

Expected result

The terminal shows Uvicorn serving on http://127.0.0.1:8000. A GET /health returns ok (or degraded if Redis/ChromaDB are unreachable).
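
The ok/degraded distinction can also be checked from a script. The sketch below assumes the /health body is either bare text or a JSON object with a "status" field; this guide does not specify the exact response shape, so treat the parsing as an assumption to adjust.

```python
import json
import urllib.error
import urllib.request


def classify_health(body: str) -> str:
    """Return 'ok', 'degraded', or 'unknown' for a /health response body.

    Assumption: the body is bare text or a JSON object with a "status" field.
    """
    text = body.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        parsed = text  # not JSON, so treat the body as plain text
    if isinstance(parsed, dict):
        parsed = parsed.get("status", "")
    status = str(parsed).strip().lower()
    return status if status in {"ok", "degraded"} else "unknown"


def fetch_health(base_url: str = "http://127.0.0.1:8000") -> str:
    """GET /health and classify it; 'unreachable' when no server answers."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return classify_health(resp.read().decode("utf-8", errors="replace"))
    except urllib.error.HTTPError as err:
        # A non-2xx response may still carry a status body worth reading.
        return classify_health(err.read().decode("utf-8", errors="replace"))
    except OSError:
        return "unreachable"


if __name__ == "__main__":
    print(fetch_health())
```

A result of degraded points at Redis or ChromaDB rather than at the API process itself.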

Option B — Docker Compose (api + worker + redis)

The cloud compose file runs three services — api, a durable ingest worker (python -m ingest.worker), and redis — with INGEST_MODE=queue so ingestion survives API restarts.

Terminal

docker-compose up --build

If it fails: If the api container exits on boot, check that the LLM key in your environment / .env matches LLM_PROVIDER.
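
Because queue mode depends on all three containers, a quick check that none has exited can save debugging time. This sketch shells out to docker-compose ps and compares the result with the service names used in this guide (api, worker, redis); the names in your compose file may differ, so treat EXPECTED_SERVICES as an assumption.

```python
import subprocess

# Service names as described in this guide; confirm them against your compose file.
EXPECTED_SERVICES = {"api", "worker", "redis"}


def missing_services(ps_output: str, expected=EXPECTED_SERVICES) -> set[str]:
    """Return expected services absent from a one-name-per-line service listing."""
    running = {line.strip() for line in ps_output.splitlines() if line.strip()}
    return set(expected) - running


def running_services() -> str:
    """List running services for the compose project in the current directory."""
    result = subprocess.run(
        ["docker-compose", "ps", "--services", "--filter", "status=running"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    absent = missing_services(running_services())
    print("all services running" if not absent else f"not running: {sorted(absent)}")
```

Run it from the directory that holds the compose file, since docker-compose resolves the project from the working directory.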

Option C — Fully-local (Ollama + FastEmbed, zero cloud keys)

For air-gapped, privacy-first, or zero-cost deployment, use the local compose file. It starts Ollama (port 11434, pulls llama3.2 and nomic-embed-text on first start), Redis (port 6379, AOF persistence), and the API (port 8000) pre-configured for local operation.

Terminal

docker-compose -f docker-compose.local.yml up --build

If it fails: First boot is slow while Ollama pulls models. If the API can't reach Ollama, confirm LLM_BASE_URL=http://ollama:11434/v1 inside the compose file.

The local file pre-sets LLM_PROVIDER=ollama, LLM_BASE_URL=http://ollama:11434/v1, EMBEDDING_PROVIDER=fastembed, and EMBEDDING_MODEL=BAAI/bge-small-en-v1.5 — no cloud API keys required.
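
On a slow first boot it is useful to know whether Ollama has finished pulling its models. The sketch below reads Ollama's /api/tags model listing and reports which of the two models named in this guide are still missing. It assumes port 11434 is published to the host at 127.0.0.1; if your compose file keeps it internal, run the check from inside the compose network instead.

```python
import json
import urllib.request

# Models this guide says the local stack pulls on first start.
REQUIRED_MODELS = ("llama3.2", "nomic-embed-text")


def missing_models(tags_payload: dict, required=REQUIRED_MODELS) -> list[str]:
    """Return required models not yet listed by Ollama, ignoring ':tag' suffixes."""
    present = {
        entry.get("name", "").split(":", 1)[0]
        for entry in tags_payload.get("models", [])
    }
    return [name for name in required if name not in present]


def fetch_tags(ollama_url: str = "http://127.0.0.1:11434") -> dict:
    """Read Ollama's model list from its /api/tags endpoint."""
    with urllib.request.urlopen(f"{ollama_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    pending = missing_models(fetch_tags())
    print("models ready" if not pending else f"still pulling: {pending}")
```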

Verify your result

Verify: curl http://127.0.0.1:8000/health returns ok.
Verify: GET /ready returns 200 with {"status":"ready"} once Redis and ChromaDB respond.
Verify: Your chosen LLM_PROVIDER has its required key set (or is a local provider that needs none).
Verify: For Docker queue mode, the worker container is running alongside api and redis.
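
The readiness check above lends itself to a small polling loop, for example in a deploy script that must wait for Redis and ChromaDB before sending traffic. This is a sketch, not project code: the attempt count and delay are arbitrary defaults, and the probe is injectable so the loop can be exercised without a running server.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple

Probe = Callable[[], Tuple[int, str]]


def http_probe(url: str = "http://127.0.0.1:8000/ready") -> Tuple[int, str]:
    """Return (status_code, body); status 0 means the server did not answer."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")
    except OSError:
        return 0, ""


def is_ready(status: int, body: str) -> bool:
    """True for a 200 response whose JSON body has status == "ready"."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ready"


def wait_until_ready(probe: Probe = http_probe, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll until ready or until attempts run out; returns the final verdict."""
    for attempt in range(attempts):
        if is_ready(*probe()):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # no sleep after the final attempt
    return False


if __name__ == "__main__":
    print("ready" if wait_until_ready() else "not ready after waiting")
```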

Common failure modes

Related next steps

Send your first request in the Quickstart.
Understand every setting in Configuration.
Plan a real deployment in Deployment and Security.
