DIY AI Gateway: OpenClaw on Raspberry Pi Kubernetes Cluster
Tags: raspberrypi, kubernetes, k3s, openclaw, aigateway, selfhosting, homelab, devops
Running OpenClaw on a Raspberry Pi Kubernetes Cluster
How I turned five Raspberry Pis into a self hosted AI gateway with automated backups, full observability, and Telegram alerts.
OpenClaw (originally Clawdbot, then briefly Moltbot) is a viral, open-source autonomous AI assistant designed to execute complex tasks on your behalf across digital platforms. Created by developer Peter Steinberger, it gives you a unified interface to multiple LLM providers. Think of it as a single front door to Claude, GPT, Gemini, DeepSeek, and whatever other model you fancy, all behind one API, with agents that can act autonomously. I wanted to self host it. More importantly, I wanted to learn Kubernetes the hard way. Not the “follow a tutorial on EKS” way. The “use five Raspberry Pis, stack them on your desk, and figure it out” way.
I could have bought a Mac mini or spun up a VM on AWS, installed OpenClaw, and been done in an afternoon. But where’s the fun in that?
This is the story of how I went from bare metal to an AI gateway running on a K3s cluster, complete with persistent storage, Prometheus monitoring, Grafana dashboards, Telegram alerting, automated backups, and a custom Docker image that cut gateway startup time from three minutes to under ten seconds.
The Hardware
Five Raspberry Pi 4 boards, each with 8GB RAM (7.6 GiB usable),
running Debian Trixie 64 bit ARM64 with kernel
6.12.62+rpt-rpi-v8. They sit on my local network (I created
a VLAN for OpenClaw), each with a 128GB SD card. One USB drive attached
to the control plane node for backups. That’s the entire bill of
materials.
| Node | Role |
|---|---|
| pi-1 | Control plane, NFS server, USB backup drive |
| pi-2 | Worker: OpenClaw gateway + Redis |
| pi-3 | Worker: general workloads |
| pi-4 | Worker: general workloads |
| pi-5 | Worker: Prometheus, Grafana, Alertmanager |
Total cluster resources: 20 ARM64 cores, approximately 38 GiB usable RAM, around 500GB combined storage. More than enough for an AI gateway that proxies API calls rather than running inference locally. These little boards are not doing the thinking. They are directing traffic to models that do.
Step 1: Installing K3s
I went with K3s over full Kubernetes because it is purpose built for
ARM and resource constrained environments. The entire control plane
binary is under 100MB, and it ships with everything you need:
containerd, CoreDNS, Traefik
(which I replaced with nginx), and a local path provisioner for
storage.
Control Plane (pi-1)
curl -sfL https://get.k3s.io | sh -s - server \
--disable traefik \
--write-kubeconfig-mode 644
I disabled Traefik because I prefer nginx ingress for its
configurability. The kubeconfig lives at
/etc/rancher/k3s/k3s.yaml.
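To drive the cluster from a workstation rather than SSHing into pi-1 every time, one approach (the control plane IP here is an example) is to copy that kubeconfig and rewrite the server address:

```shell
# Copy the kubeconfig off the control plane (IP is an example)
scp pi@192.168.1.201:/etc/rancher/k3s/k3s.yaml /tmp/k3s.yaml

# The file points at 127.0.0.1; rewrite it to the Pi's LAN address
sed 's/127.0.0.1/192.168.1.201/' /tmp/k3s.yaml > ~/.kube/pi-cluster.yaml

export KUBECONFIG=~/.kube/pi-cluster.yaml
kubectl get nodes
```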
Worker Nodes (pi-2 through pi-5)
First, grab the node token from the control plane:
cat /var/lib/rancher/k3s/server/node-token
Then on each worker:
curl -sfL https://get.k3s.io | K3S_URL=https://<CONTROL_PLANE_IP>:6443 \
K3S_TOKEN=<NODE_TOKEN> sh -
Within a few minutes, kubectl get nodes shows all five
nodes ready, all running K3s v1.34.4+k3s1 with
containerd://2.1.5-k3s1. There is something deeply
satisfying about watching five nodes come online one after another.
Step 2: Labelling Nodes for Workload Pinning
I did not want workloads drifting around the cluster. OpenClaw uses persistent volumes bound to specific nodes, and I wanted monitoring isolated from application traffic. Two labels sorted this out:
kubectl label node pi-2 openclaw-role=gateway
kubectl label node pi-5 openclaw-role=monitoring
Every deployment uses nodeSelector to pin to the right
node. This keeps things predictable. When something breaks at 2am (it
will), you always know exactly where to look.
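In practice that just means every Deployment carries a nodeSelector matching one of those labels; a minimal sketch:

```yaml
spec:
  template:
    spec:
      nodeSelector:
        openclaw-role: gateway   # lands the pod on pi-2
```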
Step 3: Networking with MetalLB and Nginx Ingress
On a bare metal cluster, there is no cloud load balancer handing out external IPs. MetalLB fills that gap.
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.9/config/manifests/metallb-native.yaml
I allocated a small pool from my home network:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.230-192.168.1.240
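One step that is easy to miss: with current MetalLB, an address pool does nothing until an L2Advertisement announces it on the network. Something like:

```yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```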
The OpenClaw gateway got one IP, nginx ingress got another. On my
Mac, I pointed the gateway’s hostname at the ingress IP in
/etc/hosts, so I can reach the web UI from my browser. Low
tech, effective.
For TLS, cert-manager handles Let’s Encrypt certificates. The gateway serves plain HTTP, and nginx terminates TLS at the ingress layer.
Step 4: Storage — Keep It Simple
I initially deployed Longhorn for distributed, replicated storage across the cluster. Impressive technology. On Raspberry Pis, it was overkill. The replication overhead consumed significant CPU and memory on nodes that had better things to do, and debugging storage issues on ARM64 was painful enough to make me question my life choices.
I ripped it all out and switched to K3s’s built in
local-path provisioner. One PVC, one node, one directory on
the SD card. No replication, no distributed consensus, no iSCSI.
storageClassName: local-path
accessModes:
  - ReadWriteOnce
resources:
  requests:
    storage: 5Gi
The tradeoff is obvious. If pi-2’s SD card dies, I lose the data on that node. But that is what backups are for, and I would rather have a cluster that runs smoothly 99.9% of the time than one that replicates storage at the cost of constant resource pressure.
OpenClaw uses three PVCs, all pinned to pi-2:
- openclaw-config (5Gi): mounted at /root/.openclaw. Holds the gateway config, credentials, agent profiles, and device pairing state.
- openclaw-workspace (10Gi): mounted at /root/openclaw/workspace. The agent’s working directory.
- redis-data-redis-0 (2Gi): Redis persistence for session state.
Hard lesson learned: never use emptyDir
for the .openclaw directory. I did this during early
testing. The pod restarted. All my workspace files, agent configuration,
and pairing state vanished. Unrecoverable. PVCs are non negotiable for
anything that matters.
Step 5: Building a Custom Gateway Image
Out of the box, OpenClaw runs on Node.js. You can
npm install -g openclaw@beta and start the gateway. But
doing that on every pod restart means a three minute startup time while
npm downloads packages over the Pi’s network connection. On a cluster
where pods restart for rolling updates, OOM kills, or node reboots, that
is a startup tax you will pay forever.
So I built a custom Docker image:
FROM node:22-bookworm
# Install system packages in a single layer
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-venv python3-pip jq ripgrep ffmpeg tmux rsync \
buildah net-tools iputils-ping dnsutils vim-tiny htop \
strace lsof procps iproute2 \
&& rm -rf /var/lib/apt/lists/*
# Install GitHub CLI
RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg \
-o /usr/share/keyrings/githubcli-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=...] ..." \
> /etc/apt/sources.list.d/github-cli.list \
&& apt-get update && apt-get install -y --no-install-recommends gh \
&& rm -rf /var/lib/apt/lists/*
# Install uv (fast Python package manager)
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
# Install kubectl (matching K3s version)
RUN curl -LO "https://dl.k8s.io/release/v1.34.4/bin/linux/arm64/kubectl" \
&& chmod +x kubectl && mv kubectl /usr/local/bin/
# Install helm
RUN curl -LO "https://get.helm.sh/helm-v3.17.3-linux-arm64.tar.gz" \
&& tar -xzf helm-v3.17.3-linux-arm64.tar.gz \
&& mv linux-arm64/helm /usr/local/bin/ && rm -rf linux-arm64 *.tar.gz
# Pre-install OpenClaw (eliminates 3-min npm install on every restart)
RUN npm install -g openclaw@beta 2>&1 | tail -3
EXPOSE 18789
ENTRYPOINT ["/bin/bash", "-c"]
A few key decisions worth explaining:
- Base image: node:22-bookworm. OpenClaw needs Node.js, and Bookworm gives us a full Debian userland for the tools OpenClaw’s agents use.
- Pre installed tools: ripgrep, jq, gh, ffmpeg, tmux, uv, kubectl, helm, and more. These unlock OpenClaw skills that require system tools. We went from 8 eligible skills to 13.
- ARM64 cross build: built on a Mac with docker buildx build --platform linux/arm64.
- Chromium deliberately excluded: saves approximately 500MB. Browser based skills can wait for a future version.
- Hosted on GitHub Container Registry: private repo, pulled with an image pull secret.
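For reference, the cross build and push look roughly like this (the registry path is a placeholder):

```shell
# Build for linux/arm64 from a Mac and push straight to GHCR
docker buildx build \
  --platform linux/arm64 \
  -t ghcr.io/<YOUR_ORG>/openclaw-gateway:v1.1.0 \
  --push .
```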
Startup time dropped from approximately three minutes to under ten seconds. That alone was worth the effort.
Step 6: The OpenClaw Deployment
Here is the heart of it. The Kubernetes deployment that runs the gateway. The entrypoint script is idempotent: it only creates config and credentials if they do not already exist on the PVC.
spec:
  containers:
    - name: openclaw
      image: ghcr.io/<YOUR_ORG>/openclaw-gateway:v1.1.0
      command: ["/bin/bash", "-c"]
      args:
        - |
          set -e
          echo "OpenClaw Gateway (custom image v1.1.0)"
          openclaw --version
          mkdir -p /root/.openclaw/credentials /root/.openclaw/devices
          if [ ! -f /root/.openclaw/openclaw.json ]; then
            echo "Creating initial config..."
            # ... create default config
          else
            echo "Config already exists on persistent volume, keeping it."
          fi
          echo "Starting OpenClaw Gateway..."
          exec openclaw gateway --port 18789 --bind lan
      env:
        - name: ANTHROPIC_API_KEY
          valueFrom:
            secretKeyRef:
              name: openclaw-secrets
              key: anthropic-api-key
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: provider-keys-v2
              key: openai-api-key
        - name: GEMINI_API_KEY
          valueFrom:
            secretKeyRef:
              name: provider-keys-v2
              key: gemini-api-key
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: 3000m
          memory: 3584Mi
      volumeMounts:
        - mountPath: /root/.openclaw
          name: openclaw-config
        - mountPath: /root/openclaw/workspace
          name: openclaw-workspace
A few things worth noting:
- exec openclaw gateway: the exec replaces the shell process with the gateway, so the container gets proper signal handling. No zombie processes, clean shutdowns, correct health checks. Without exec, the gateway runs as a child of bash and Kubernetes cannot send signals to it properly.
- API keys live in Kubernetes Secrets, never hardcoded in config files. OpenClaw supports ${ENV_VAR} syntax in its config for referencing environment variables.
- Resource limits are generous: the gateway gets up to 3 cores and 3.5GB of RAM on pi-2 (which has 4 cores and 7.6GB usable). AI agents can be memory hungry when processing long conversations.
- RBAC gives the pod cluster admin. Yes, this is deliberately permissive. The Pi cluster IS the sandbox; my Mac is the security boundary. OpenClaw agents need to run kubectl, helm, and other cluster operations, and I would rather grant access explicitly than have agents fail silently.
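Creating those Secrets is a one-liner each. A sketch, with the secret names matching the deployment manifest and the key values as placeholders:

```shell
kubectl create namespace openclaw

kubectl -n openclaw create secret generic openclaw-secrets \
  --from-literal=anthropic-api-key='sk-ant-...'

kubectl -n openclaw create secret generic provider-keys-v2 \
  --from-literal=openai-api-key='sk-...' \
  --from-literal=gemini-api-key='...'
```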
A Gotcha: Gateway Bind vs Config Bind
This one took a while to figure out. The OpenClaw config says
gateway.bind: "loopback", but the entrypoint starts the
gateway with --bind lan. This looks like a bug, but it is
intentional.
The --bind lan flag makes the gateway listen on
0.0.0.0, so Kubernetes service traffic can reach it. But
the config’s loopback setting means CLI connections via
127.0.0.1 are treated as local. Loopback connections skip
device pairing entirely. This is how cron jobs and CLI commands work
inside the container without needing to pair a device first.
Step 7: Model Configuration and Cost Optimisation
OpenClaw supports multiple LLM providers. I configured a fallback chain that prioritises free and cheap models:
- Ollama Cloud / DeepSeek V3.1 (671B): Free tier, primary model
- Ollama Cloud / Qwen 3.5: Free tier, fallback
- Ollama Cloud / Devstral 2: Free tier, fallback
- Google / Gemini 2.0 Flash: Cheap, fast
- OpenAI / GPT-4.1 Mini: Last resort
The gateway also runs a heartbeat every 30 minutes during active
hours (04:30 to 23:00 London time) using
ollama-cloud/gemma3:12b, a free model that just checks the
system is alive.
For premium models like Claude Opus, I enabled prompt caching with
cacheRetention: "long" (one hour TTL). It doubles the write
cost but saves significantly on subsequent reads in multi turn
conversations. If you are having extended back and forth sessions, the
savings add up, fast.
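Back-of-envelope, using the commonly quoted multipliers (roughly 2x the base input price for a long-TTL cache write, roughly 0.1x for a cached read — treat these as illustrative, not gospel):

```text
Reusing a prompt prefix of P tokens across N turns, at base input price c:

  uncached:  N * c * P
  cached:    2c * P + (N - 1) * 0.1c * P

Break-even:  2 + 0.1(N - 1) < N  =>  N > 2.1

So caching pays for itself from the third turn onward. By turn ten you
are paying roughly 2.9cP instead of 10cP, about a 70% saving.
```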
Step 8: Monitoring with Prometheus and Grafana
Observability was non negotiable. I deployed the
kube-prometheus-stack Helm chart, but heavily customised
for Pi constraints:
KUBECONFIG=/etc/rancher/k3s/k3s.yaml helm install kube-prometheus-stack \
prometheus-community/kube-prometheus-stack \
-n monitoring --create-namespace \
-f monitoring-values.yaml
The values file is tuned for minimal resource usage:
- Prometheus: two day retention, ephemeral storage (emptyDir, not a PVC). If Prometheus restarts, I lose the last two days of metrics. For a home cluster, that is a perfectly acceptable tradeoff.
- Grafana: no persistence; dashboards loaded via Helm values and sidecar ConfigMaps. If Grafana restarts, it rebuilds from config. Stateless by design.
- Disabled false positive alerts: K3s bundles its control plane components differently from standard Kubernetes, so the kubeControllerManager, kubeScheduler, kubeProxy, and kubeEtcd monitors are all disabled. Without this, you get a stream of noisy alerts about endpoints that simply do not exist.
Everything is pinned to pi-5 via
nodeSelector: { openclaw-role: monitoring }.
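The relevant parts of monitoring-values.yaml, sketched (the key names follow the kube-prometheus-stack chart; check them against your chart version):

```yaml
prometheus:
  prometheusSpec:
    retention: 2d
    nodeSelector:
      openclaw-role: monitoring
    # no storageSpec => emptyDir, metrics are ephemeral

grafana:
  persistence:
    enabled: false
  nodeSelector:
    openclaw-role: monitoring

# K3s bundles these into the k3s binary; no separate endpoints to scrape
kubeControllerManager:
  enabled: false
kubeScheduler:
  enabled: false
kubeProxy:
  enabled: false
kubeEtcd:
  enabled: false
```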
Custom Alerts
I defined five alerts that cover the things I actually care about:
- alert: NodeDown
  expr: up{job="node-exporter"} == 0
  for: 2m
- alert: PodCrashLooping
  expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
  for: 5m
- alert: HighCPUUsage
  expr: >-
    100 - (avg by(instance)
    (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
- alert: HighMemoryUsage
  expr: >-
    (1 - (node_memory_MemAvailable_bytes
    / node_memory_MemTotal_bytes)) * 100 > 85
- alert: HighDiskUsage
  expr: >-
    (1 - (node_filesystem_avail_bytes{mountpoint="/"}
    / node_filesystem_size_bytes{mountpoint="/"})) * 100 > 80
Five alerts. Not fifty. Not five hundred. Five. Node down, pod crash looping, CPU hot, memory hot, disk filling up. If one of these fires, something genuinely needs attention. Everything else is noise.
Telegram Alerting
Alertmanager sends all critical and warning alerts to my Telegram via
a bot. I get a nicely formatted HTML message with the alert name,
namespace, severity, and description. The Watchdog and
InfoInhibitor alerts are routed to a null receiver so they
do not spam my phone.
receivers:
  - name: 'telegram'
    telegram_configs:
      - bot_token_file: '/etc/alertmanager/secrets/telegram-bot-token/bot-token'
        chat_id: <YOUR_CHAT_ID>
        parse_mode: 'HTML'
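The null routing mentioned above is just an extra receiver plus a child route, roughly:

```yaml
route:
  receiver: 'telegram'
  routes:
    - receiver: 'null'
      matchers:
        - 'alertname =~ "Watchdog|InfoInhibitor"'

receivers:
  - name: 'null'
```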
Getting alerts on the same app I use to chat with the AI gateway is oddly satisfying. Everything in one place.
Step 9: Automated Backups
A cluster without backups is a cluster waiting to teach you a painful lesson. I wrote a backup script that runs daily at 02:00 via a systemd timer on pi-1:
#!/bin/bash
# k3s-backup.sh — Daily backup for Pi K3s cluster
set -euo pipefail

BACKUP_ROOT="/mnt/usb-backup"
DATE=$(date +%Y-%m-%d_%H%M)
RETAIN_DAYS=7

# Ensure target directories exist before writing into them
mkdir -p "$BACKUP_ROOT/k3s" "$BACKUP_ROOT/manifests" \
  "$BACKUP_ROOT/openclaw/full-$DATE"

# 1. K3s state.db
sudo cp /var/lib/rancher/k3s/server/db/state.db \
  "$BACKUP_ROOT/k3s/state-$DATE.db"

# 2. K8s manifests (all namespaces)
kubectl get deployments,statefulsets,daemonsets,services,configmaps,\
secrets,ingresses,persistentvolumeclaims -A -o yaml \
  > "$BACKUP_ROOT/manifests/cluster-$DATE.yaml"

# 3. OpenClaw full data (tar from running pod)
POD=$(kubectl -n openclaw get pods -l app=openclaw-gateway \
  -o jsonpath='{.items[0].metadata.name}')
kubectl -n openclaw exec "$POD" -- tar czf - -C /root .openclaw \
  > "$BACKUP_ROOT/openclaw/full-$DATE/openclaw-data.tar.gz"

# 4. Prune backups older than 7 days
find "$BACKUP_ROOT" -type f -mtime +"$RETAIN_DAYS" -delete
It backs up three things: the K3s etcd equivalent state database, a
full YAML export of every resource in every namespace, and a tarball of
the entire .openclaw directory from the running gateway
pod.
Seven day retention keeps the USB drive from filling up. If I need to rebuild the cluster from scratch, I have everything I need.
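The systemd wiring is two small units on pi-1 (the script path is assumed here):

```ini
# /etc/systemd/system/k3s-backup.service
[Unit]
Description=Daily K3s cluster backup

[Service]
Type=oneshot
ExecStart=/usr/local/bin/k3s-backup.sh

# /etc/systemd/system/k3s-backup.timer
[Unit]
Description=Run k3s-backup daily at 02:00

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now k3s-backup.timer. Persistent=true means a missed run (say, pi-1 was powered off at 02:00) fires on the next boot.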
Step 10: Housekeeping CronJobs
Two Kubernetes CronJobs keep things tidy:
- Scratch cleanup (daily at 03:00): the gateway pod has a scratch volume at /scratch for temporary files. A busybox container prunes anything older than seven days.
- Memory consolidation reminder (Mondays at 09:00): sends me a Telegram message reminding me to review OpenClaw’s memory files and consolidate the week’s learnings. It is a small thing, but it keeps the agent’s context from growing unbounded. Left unchecked, memory files bloat and the agent’s performance degrades.
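The scratch cleanup job is about as small as a CronJob gets. A sketch (the hostPath location backing the scratch volume is an assumption for illustration):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scratch-cleanup
  namespace: openclaw
spec:
  schedule: "0 3 * * *"        # daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            openclaw-role: gateway   # same node as the gateway pod
          restartPolicy: OnFailure
          containers:
            - name: cleanup
              image: busybox:1.36
              command: ["sh", "-c", "find /scratch -mindepth 1 -mtime +7 -delete"]
              volumeMounts:
                - name: scratch
                  mountPath: /scratch
          volumes:
            - name: scratch
              hostPath:
                path: /var/openclaw/scratch   # assumed location
```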
The Result
After all this work, what does the cluster look like?
28 pods across 5 nodes (down from approximately 75 before I removed Longhorn and local Ollama inference). Gateway startup in around 8 seconds, down from over three minutes. An always on AI gateway accessible from Telegram, WhatsApp, and a web UI. Full observability with Grafana dashboards and Telegram alerts on my phone. Daily automated backups to USB with seven day retention. A cost optimised model chain that defaults to free models and escalates only when needed. And 13 unlocked OpenClaw skills including GitHub, session logs, video frames, and tmux.
For a stack of five Raspberry Pis sitting on my desk, that is not bad at all.
Lessons Learned
- Longhorn is amazing, but not for Pis. Distributed storage on ARM single board computers with SD cards is a recipe for frustration. local-path plus backups is the right answer for a home cluster.
- Never run openclaw doctor --fix at startup. It destructively strips config values. My entrypoint script learned this the hard way. Twice.
- Pre bake your Docker image. Any npm install or apt-get that runs on every pod start is a startup tax you will pay forever. Bake it into the image.
- exec your entrypoint. Without exec, the gateway runs as a child of bash. Kubernetes cannot send signals to it properly, health checks do not work, and you get zombie processes on shutdown.
- Persistent volumes are sacred. The moment you think “emptyDir is fine for now,” you are one restart away from losing data that matters. If it matters, give it a PVC.
- Pin workloads to nodes. On a small cluster, nodeSelector is your best friend. You always know where to look, and you avoid resource contention between unrelated workloads.
- The Pi cluster IS the sandbox. I gave the gateway pod cluster-admin and a privileged security context. On a cloud cluster this would be reckless. On a home lab where my Mac is the security boundary and I have daily backups, it is pragmatic.
- Telegram is the best ops channel. Full stop.
The whole setup lives in a single directory of YAML files, a
Dockerfile, and a backup script. No Terraform, no Pulumi, no GitOps
controller. Just kubectl apply and
helm install. For a home lab running an AI gateway, that is
exactly the right level of complexity.
If you are thinking about self hosting OpenClaw, or any AI gateway,
on Raspberry Pis, I hope this gives you a head start. The Pis are more
than capable. The real work is in the decisions: what to simplify, what
to automate, and what to just delete.
If you have Raspberry Pis to hand, give it a go! Enjoy!