Three laptops, one subscription

A fleet pattern for 24/7 AI agents: one agent per machine as gatekeeper, a star topology, a chat room as the bus, and a subscription instead of metered keys.

  • #ai-agents
  • #claude-code
  • #fleet
  • #self-hosted
  • #architecture

I run three recycled laptops as a small fleet of always-on AI agents. They keep a self-hosted homelab alive — photos, media, DNS-level ad-blocking, private chat, local models, a status board — across three households and two countries. They talk to each other, they talk to me, and they fix what they safely can while I sleep.

I’ve written before about when the model is allowed to run — the watchdog that fires 288 times a day and never once spends a token. This post is about the other half: the shape of the fleet. How three independent agents on three machines cohere into one operable system without a control plane I have to babysit, and why the whole thing rides a single flat subscription instead of a stack of metered API keys.

The design comes down to five decisions, and they reinforce each other.

One agent per machine — gatekeeper of itself, and nothing else

Each laptop runs its own headless Claude Code agent. Not one orchestrator reaching into three boxes — three brains, each the sole operator of exactly one machine.

That agent has full authority on its own box: root, passwordless, because a headless session can’t answer a sudo prompt mid-task. And it has authority on nothing else. There’s no shared filesystem, no cross-mount, no agent that can reach into a sibling and mutate it. The blast radius of any one agent is precisely one laptop.

This is the most important boundary in the system, and it’s deliberately the opposite of how you’d scale a normal service. The instinct with three machines is a central controller that drives all of them: one place to issue commands, one throat to choke. But a central controller that can become root on every node is also a single compromise that owns every node. Because each agent is the gatekeeper of only itself, a bad day on one box stays on that box. The agent is the operator and the security boundary at the same time.

There are no management GUIs anywhere in the fleet. The agent is the operator; the only graphical surface is a read-only status view. If something needs doing on a machine, its agent does it — converging the box to a desired state on a timer, reconciling reality to a manifest, and otherwise staying quiet.
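To make "reconciling reality to a manifest" concrete, here is a minimal sketch of the planning half of such a loop. It is an illustration under assumptions, not the fleet's actual code: the manifest shape (service name mapped to "running" or "stopped"), the `Action` type, and the `plan` function are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One step needed to move the box toward the manifest (hypothetical type)."""
    service: str
    verb: str  # "start" or "stop"


def plan(desired: dict[str, str], observed: dict[str, str]) -> list[Action]:
    """Compare the manifest with what is actually running and list the gaps.

    Both dicts map a service name to "running" or "stopped". A service that is
    missing from `observed` is treated as stopped (an assumption of this sketch).
    """
    actions = []
    for service, want in sorted(desired.items()):
        have = observed.get(service, "stopped")
        if want == "running" and have != "running":
            actions.append(Action(service, "start"))
        elif want == "stopped" and have == "running":
            actions.append(Action(service, "stop"))
    return actions
```

A converged box produces an empty plan, which is what "otherwise staying quiet" looks like in code: the timer fires, nothing differs, nothing happens.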

A star, with one human-facing node

The topology is a star. Two of the machines are edge boxes — Debian laptops living as household appliances. The third is the hub: the homelab host and the only node that talks to me.

The edge boxes never message me directly. They report up to the hub; the hub aggregates and relays. When I get a message, it’s from one place, in one voice, with the whole fleet’s context already folded in — not three laptops independently pinging my phone about three unrelated things.

This matters more than it sounds. The hardest part of running multiple autonomous agents isn’t making them do things; it’s keeping them from drowning you in chatter. A flat mesh where every agent can reach the human is a recipe for three times the noise and no coherent picture. The star gives me a single front door. Behind it, the edge boxes coordinate with the hub; in front of it, there’s one agent’s worth of signal to attend to. Hierarchy as noise control.
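The aggregation step at the hub can be sketched in a few lines. This is an assumption about shape rather than a description of the real relay: the `(node, line)` report format, the node names, and the `digest` function are hypothetical.

```python
from collections import defaultdict


def digest(reports: list[tuple[str, str]]) -> str:
    """Fold per-node report lines into one message for the human.

    `reports` is a list of (node_name, line) pairs in arrival order.
    Returns an empty string when there is nothing to relay.
    """
    by_node: dict[str, list[str]] = defaultdict(list)
    for node, line in reports:
        by_node[node].append(line)
    # One line per node, nodes in a stable order, items in arrival order.
    return "\n".join(
        f"{node}: " + "; ".join(lines) for node, lines in sorted(by_node.items())
    )
```

The point of the sketch is the fan-in: however many edge boxes report, the human-facing side sees a single message, or none at all.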

The chat room is the bus

The agents coordinate over a private room in a self-hosted Matrix chat. No webhooks, no message queue, no RPC layer — a chat room, the same kind of room a team of humans would use, repurposed as the agent bus.

Three patterns make it work as infrastructure rather than a toy:

Address by name. In the shared ops room, an agent replies only when addressed by name — matched on its own name in the message or a proper mention. Each box matches only its own identity. That single rule is the entire reason three agents can share one room without all three answering every message. Handing a task from the hub to an edge box needs no special protocol: you post to the room, addressed by name. The room is the bus, and addressing is the routing.
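A minimal sketch of the address-by-name check, assuming each agent knows its own display name and, optionally, its own user ID. The function name and parameters are hypothetical; `mentioned_ids` stands in for whatever list of mentioned users the chat client exposes.

```python
import re


def is_addressed(body: str, own_name: str, mentioned_ids=(), own_id=None) -> bool:
    """Return True only if this agent is named in the message or properly mentioned."""
    # A proper mention wins outright.
    if own_id is not None and own_id in mentioned_ids:
        return True
    # Otherwise match the name as a whole token, so "hub" does not fire on
    # "hubcap" and "edge-1" does not fire on "edge-10".
    pattern = rf"(?<!\w){re.escape(own_name)}(?!\w)"
    return re.search(pattern, body, flags=re.IGNORECASE) is not None
```

Each box runs the same function with a different `own_name`, which is all the routing there is.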

Reactions as status. The agent handling a message reacts directly on it as a live status line: 👀 seen → 👍 working → ✅ done (and ❌ only on a real hang). The reply channel never has to narrate “on it” or “finished” — the status rides as an emoji on the original message, exactly the way a person would thumbs-up a request. It’s a status protocol that’s also just good chat manners.
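The reaction sequence is a small state machine, and a sketch makes the allowed moves explicit. The emoji come from the paragraph above; the `Status` enum, the transition table, and the choice to allow ❌ from either "seen" or "working" are assumptions of this sketch.

```python
from enum import Enum


class Status(Enum):
    SEEN = "👀"
    WORKING = "👍"
    DONE = "✅"
    HUNG = "❌"


# Which status may follow which. None means "no reaction yet".
ALLOWED = {
    None: {Status.SEEN},
    Status.SEEN: {Status.WORKING, Status.HUNG},
    Status.WORKING: {Status.DONE, Status.HUNG},
    Status.DONE: set(),   # terminal
    Status.HUNG: set(),   # terminal
}


def advance(current, nxt: Status) -> Status:
    """Return the new status, or raise if the move skips or reverses a step."""
    if nxt not in ALLOWED[current]:
        raise ValueError(f"illegal status change: {current} -> {nxt}")
    return nxt
```

Posting the reaction itself is left out on purpose; that part depends on the chat client in use.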

Journal before you act. This is the part that turns a chat room into something you can trust. Every incoming message is written to disk before the agent acts on it, and the cursor advances in two steps — peek, then commit — so a run that dies mid-task re-reads the same instruction on the next tick instead of dropping it. That’s at-least-once delivery from a journal file and a two-phase cursor. The cost is that every action has to be safe to repeat, which is why “every step must be idempotent” is the single most-repeated sentence in the codebase. I learned it the hard way: an earlier design advanced its cursor on every read, so a long task killed by a timeout silently lost the instruction it had just read. The journal is an operational scar turned into architecture.
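The peek-then-commit cursor can be sketched with two files: an append-only journal and a cursor holding the index of the next unhandled entry. This is a minimal illustration under assumptions, not the author's implementation; the `Journal` class, the file names, and the JSON-lines format are all hypothetical.

```python
import json
import os
from pathlib import Path


class Journal:
    """Append-only message log with a two-phase (peek, then commit) cursor."""

    def __init__(self, directory: Path):
        directory.mkdir(parents=True, exist_ok=True)
        self.log = directory / "journal.jsonl"
        self.cursor = directory / "cursor"
        self.log.touch()

    def append(self, message: dict) -> None:
        """Write the incoming message to disk before anything acts on it."""
        with self.log.open("a", encoding="utf-8") as f:
            f.write(json.dumps(message) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def _committed(self) -> int:
        return int(self.cursor.read_text()) if self.cursor.exists() else 0

    def peek(self):
        """Return (index, message) for the next entry WITHOUT advancing the cursor."""
        lines = self.log.read_text(encoding="utf-8").splitlines()
        index = self._committed()
        if index >= len(lines):
            return None
        return index, json.loads(lines[index])

    def commit(self, index: int) -> None:
        """Advance the cursor past `index`; call only after the work has finished."""
        tmp = self.cursor.with_suffix(".tmp")
        tmp.write_text(str(index + 1))
        os.replace(tmp, self.cursor)  # atomic swap, so the cursor is never half-written
```

If the process dies between `peek` and `commit`, the cursor file is unchanged and the next run peeks the same entry again. That repeat is the at-least-once guarantee, and it is also why the handler in between has to be idempotent.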

Subscription login, not metered keys

Every agent logs into Claude with my subscription. Not a per-call API key with a meter running — the same flat-rate login I’d use at my desk, on a headless box, around the clock, on three machines.

This is a deliberate cost decision, downstream of the rule from the watchdog post: make the model the exception, not the loop. Because the agents only invoke the model on a real signal — a message addressed to them, an anomaly, a due task — and let cheap deterministic code handle every quiet tick, the actual volume of model calls is low and bursty. A flat subscription priced for interactive use comfortably absorbs three machines’ worth of that shape of usage. Metered keys would price the same workload as if every tick were a billable call, and the economic argument for 24/7 agents would collapse.
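A sketch of that gate, assuming the three signals named above are the only reasons to wake the model. The `Tick` fields and function names are hypothetical, and the counts in the usage note are illustrative, not measurements. For scale, 288 ticks a day is one tick every five minutes (24 × 60 / 5 = 288).

```python
from dataclasses import dataclass


@dataclass
class Tick:
    """What the cheap deterministic check found on one timer tick."""
    addressed_messages: int = 0
    anomalies: int = 0
    due_tasks: int = 0


def should_wake_model(tick: Tick) -> bool:
    """Invoke the model only on a real signal; quiet ticks cost nothing."""
    return tick.addressed_messages > 0 or tick.anomalies > 0 or tick.due_tasks > 0


def model_calls(ticks) -> int:
    """Count how many ticks in a sequence would actually reach the model."""
    return sum(1 for t in ticks if should_wake_model(t))
```

On a made-up day with 285 quiet ticks and three ticks carrying one signal each, `model_calls` returns 3, not 288. That ratio is the shape of usage the flat subscription is being asked to absorb.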

It’s a stopgap with honest edges — a subscription credential isn’t built to be sprayed across a fleet, and seeding it onto boxes is bootstrapping, not the final design. But as a deliberate cost lever it’s the difference between “an always-on agent fleet” and “a bill I’d turn off by Tuesday.” The cheap-deterministic-gate discipline and the flat subscription are the same decision viewed from two angles: keep model calls rare, and pick the pricing model that rewards rarity.

Bounded autonomy: an allowlist, not discretion

The agents run unattended, which means the interesting question isn’t what they can do but what they’re allowed to do without me.

The answer is a tight safe-allowlist. An agent may, on its own: restart a container that exited and confirm it comes back healthy, re-seed an expired credential by the documented path, clear a stale alarm flag after confirming its cause is gone, reload a wedged service. That’s the list. And the hard nevers: no editing config or code, no touching data, no rotating secrets, no spending money, no making anything public. Anything outside the allowlist is diagnosed, left as-is, and flagged for me. When an agent is unsure, it asks in the ops room rather than guessing.
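The allowlist reads naturally as a default-deny policy check. The sketch below encodes the lists from the paragraph above; the action identifiers, the `Decision` enum, and the `decide` function are hypothetical names chosen for illustration.

```python
from enum import Enum


class Decision(Enum):
    FIX = "fix"    # safe to do unattended
    FLAG = "flag"  # diagnose, leave as-is, tell the human


SAFE_FIXES = frozenset({
    "restart_exited_container",
    "reseed_expired_credential",
    "clear_stale_alarm",
    "reload_wedged_service",
})

HARD_NEVERS = frozenset({
    "edit_config_or_code",
    "touch_data",
    "rotate_secrets",
    "spend_money",
    "make_public",
})


def decide(action: str) -> tuple[Decision, str]:
    """Return the decision and a short reason. Anything unlisted is flagged."""
    if action in HARD_NEVERS:
        return Decision.FLAG, "hard never"
    if action in SAFE_FIXES:
        return Decision.FIX, "on the allowlist"
    return Decision.FLAG, "not on the allowlist"
```

The last line is the one that matters: an action nobody anticipated falls through to "flag", so the agent never has to exercise discretion about something new.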

The reasoning fits in one line that lives in the prompts: a wrong unattended change is worse than an unfixed incident. Bounded autonomy isn’t the agent being timid — it’s me deciding, in advance and explicitly, where the boundary between “fix it” and “ask me” sits. That enumerated boundary is the entire reason I can let three machines run themselves overnight and actually sleep.

The pattern, pulled out of the basement

Strip away my photos and DNS and the rest, and the fleet pattern is five rules you could apply to any set of machines you want a model to operate:

  1. One agent per machine, gatekeeper of itself and nothing else. The agent is the operator and the security boundary at once; the blast radius of any one is a single box.
  2. Star topology with one human-facing node. Edge nodes report up; one node aggregates and relays. Hierarchy is how you keep multi-agent chatter from burying you.
  3. A chat room as the bus. Address by name for routing, reactions for status, journal-before-act for at-least-once delivery. No queue, no RPC — and idempotency is the price of admission.
  4. Pick pricing that rewards rare model calls. If deterministic code handles every tick and the model is the exception, a flat subscription beats metered keys by the whole margin.
  5. Bound autonomy with an allowlist, not discretion. Enumerate the safe fixes and the hard nevers; flag everything else. That boundary is what makes overnight autonomy something you can trust.

None of this is a framework. It’s three old laptops, a chat room, a subscription, and a stack of small decisions that happen to compose into a system shaped like a fleet — because that’s exactly what it grew into.