Run an agent on Linux
Ollama runs natively on the host (it uses your NVIDIA GPU directly — no
nvidia-container-toolkit needed), and the agent runs as a Docker container that talks to it.
The one Linux-specific step is mapping host.docker.internal.
Prerequisites
1. NVIDIA driver
Install the proprietary NVIDIA driver for your distro (e.g. nvidia-driver-xxx
on Ubuntu, or your distro's package). Verify:
nvidia-smi
Because Ollama runs on the host (not in a container), you do not need the NVIDIA Container Toolkit.
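Since model choice depends on how much VRAM is available, it can help to total it up once the driver works. The following is a minimal sketch, not part of the product: `sum_vram_mib` is a hypothetical helper name, and it assumes `nvidia-smi` is queried in its CSV mode with `noheader,nounits` so that the second column is the memory total in MiB.

```shell
# Sum the memory.total column (MiB) of nvidia-smi CSV output read from stdin.
# Hypothetical helper; expects lines shaped like "GPU name, 24576".
sum_vram_mib() {
  awk -F', *' '{ sum += $2 } END { print sum + 0 }'
}

# Usage (requires the NVIDIA driver to be installed):
#   nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits | sum_vram_mib
```

The helper reads from stdin rather than calling `nvidia-smi` itself, so it can be tried on sample text on a machine without a GPU.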
2. Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
ollama run qwen2.5-coder:7b "say hello" # first run pulls the model
By default Ollama listens on 127.0.0.1:11434, which a container cannot reach.
Bind it on all interfaces by setting OLLAMA_HOST=0.0.0.0 for the service:
sudo mkdir -p /etc/systemd/system/ollama.service.d
printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' | \
sudo tee /etc/systemd/system/ollama.service.d/override.conf
sudo systemctl daemon-reload && sudo systemctl restart ollama
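To confirm the override took effect after the restart, one option is to inspect the environment systemd hands to the service and then probe the API. This is a sketch under assumptions: the function names are hypothetical, the unit is assumed to be named `ollama` (as in the commands above), and the probe uses Ollama's `/api/version` endpoint.

```shell
# Return 0 if systemd passes OLLAMA_HOST=0.0.0.0 to the ollama unit.
ollama_bound_to_all() {
  systemctl show ollama --property=Environment | grep -q 'OLLAMA_HOST=0\.0\.0\.0'
}

# Return 0 if the Ollama API answers at the given base URL (default: loopback).
ollama_reachable() {
  base_url="${1:-http://127.0.0.1:11434}"
  curl -fsS --max-time 5 "$base_url/api/version" >/dev/null
}

# Usage:
#   ollama_bound_to_all && ollama_reachable && echo "Ollama is up and bound on all interfaces"
```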
See choosing a model for what fits your VRAM.
3. Docker Engine
Install from docs.docker.com/engine/install. Verify:
docker --version
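The `host-gateway` mapping used later on this page was introduced in Docker Engine 20.10, so an older engine will reject it. A rough check, assuming the server version string starts with `major.minor` (the function name is hypothetical):

```shell
# Return 0 if the Docker Engine server version is 20.10 or newer.
# Returns 2 if the daemon cannot be queried.
docker_supports_host_gateway() {
  version="$(docker version --format '{{.Server.Version}}')" || return 2
  major="${version%%.*}"   # text before the first dot
  rest="${version#*.}"     # text after the first dot
  minor="${rest%%.*}"      # second version component
  [ "$major" -gt 20 ] || { [ "$major" -eq 20 ] && [ "$minor" -ge 10 ]; }
}

# Usage:
#   docker_supports_host_gateway || echo "Upgrade Docker Engine to 20.10 or newer"
```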
Create your agent
In app.keikaku.ai, go to Agents → New agent
to get a command with your token filled in. On Linux it includes the
--add-host line so the container can reach host Ollama:
docker run -d --name keikaku-agent --label com.docker.compose.project=keikaku --restart unless-stopped \
--add-host=host.docker.internal:host-gateway \
-e API_BASE_URL=https://api.keikaku.ai \
-e AGENT_TOKEN=<from the app> \
-e OLLAMA_URL=http://host.docker.internal:11434 \
-e MODEL=qwen2.5-coder:7b \
-p 9170:9170 \
ghcr.io/keikaku-ai/agent:latest
Linux difference: unlike Docker Desktop, plain Docker Engine doesn't
provide host.docker.internal automatically — the
--add-host=host.docker.internal:host-gateway flag (already in the app's Linux
snippet) is what makes it resolve to the host.
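If the agent cannot reach Ollama, the mapping can be tested in isolation with a throwaway container. This sketch assumes the public `curlimages/curl` image (whose entrypoint is `curl`) and Ollama's `/api/version` endpoint; `probe_host_ollama` is a hypothetical name.

```shell
# Run curl in a disposable container with the same --add-host mapping
# the agent uses, and ask host Ollama for its version.
probe_host_ollama() {
  docker run --rm \
    --add-host=host.docker.internal:host-gateway \
    curlimages/curl:latest \
    -fsS --max-time 5 http://host.docker.internal:11434/api/version
}

# Usage:
#   probe_host_ollama && echo "container can reach host Ollama"
```

A connection failure here, while Ollama answers on the host itself, usually points at Ollama still being bound to 127.0.0.1 or at a host firewall, rather than at the agent.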
Verify it connected
The agent shows as online in the app under Agents; the local dashboard is
at http://localhost:9170. Logs:
docker logs -f keikaku-agent
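For scripts or monitoring, the container state can also be read without tailing logs. A small sketch using `docker inspect`; the function names are hypothetical and the default container name is the one from the command above.

```shell
# Print the container's state (for example "running" or "exited").
agent_status() {
  docker inspect --format '{{.State.Status}}' "${1:-keikaku-agent}"
}

# Return 0 only if the container is currently running.
agent_is_running() {
  [ "$(agent_status "${1:-keikaku-agent}")" = "running" ]
}

# Usage:
#   agent_is_running || docker logs --tail 50 keikaku-agent
```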
Run multiple agents
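Scale the agent service with Compose (run this in the directory that holds the app's Compose bundle). All N workers share one Ollama and one model in VRAM: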
docker compose up -d --scale agent=3
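To check that the workers really share a single loaded model, the output of `ollama ps` can be counted. This is a sketch that assumes `ollama ps` prints one header line followed by one row per loaded model; `loaded_model_count` is a hypothetical helper.

```shell
# Count models currently loaded by Ollama (rows after the header line).
loaded_model_count() {
  ollama ps | awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Usage:
#   docker compose ps          # lists the scaled agent containers
#   loaded_model_count         # number of models Ollama has loaded
```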
Update / stop
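# Update: docker restart keeps the old image, so pull, remove, and re-create
docker pull ghcr.io/keikaku-ai/agent:latest
docker rm -f keikaku-agent
# ...then re-run the docker run command from "Create your agent"
docker stop keikaku-agent # stop
docker rm -f keikaku-agent # remove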
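To confirm that the running container was created from the most recently pulled image, the two image IDs can be compared. A minimal sketch with a hypothetical function name; the defaults are the container name and image tag used on this page.

```shell
# Return 0 if the container runs the image the local tag currently points to,
# 1 if it runs an older image, 2 if either lookup fails.
agent_image_is_current() {
  container="${1:-keikaku-agent}"
  image="${2:-ghcr.io/keikaku-ai/agent:latest}"
  running_id="$(docker inspect --format '{{.Image}}' "$container")" || return 2
  latest_id="$(docker image inspect --format '{{.Id}}' "$image")" || return 2
  [ "$running_id" = "$latest_id" ]
}

# Usage:
#   agent_image_is_current || echo "container is on an older image; re-create it"
```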
Headless server, no host Ollama? The app's Compose bundle has an optional
gpu-ollama profile that runs Ollama in a container with GPU passthrough — for
that you do need the NVIDIA Container Toolkit. Host Ollama is the simpler default.
What the agent does: it executes work generated by your models, writing files and running build/test commands inside its own container and workspace. It connects out over HTTPS only; no inbound ports are required.