Case studies — each one walks through the problem, the architecture, the decisions, the
numbers, and the lessons.
▣ live · 3 nodes · 22 containers
Three recycled laptops, each operated by its own headless Claude Code agent: a private 22-container homelab that monitors, heals, and reports on itself.
- 288 watchdog runs/day, zero tokens
- 4.18s → 18ms status query
- 22 containers
▣ npm · @hapus/mcp-cache · ★9
A transparent proxy that caches oversized MCP tool responses and hands the model query tools — so any MCP server works past the 25K-token wall.
- 25K → unlimited token wall
- −30–50% LLM API cost
- <200ms cached query
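A minimal sketch of the idea behind the card above, not the actual @hapus/mcp-cache code: small responses pass through untouched, while oversized ones are stored and replaced with a handle that the model can query in pieces. The class name, method names, and the 4-characters-per-token estimate are all assumptions made for illustration.

```python
import uuid

TOKEN_LIMIT = 25_000  # the 25K-token wall mentioned above
CHARS_PER_TOKEN = 4   # assumption: crude size estimate, not a real tokenizer


class ResponseCache:
    """Hypothetical in-memory store for oversized tool responses."""

    def __init__(self, token_limit=TOKEN_LIMIT):
        self.token_limit = token_limit
        self._store = {}  # key -> full response text

    def estimate_tokens(self, text):
        return len(text) // CHARS_PER_TOKEN

    def wrap(self, text):
        """Pass small responses through; replace large ones with a handle."""
        if self.estimate_tokens(text) <= self.token_limit:
            return {"cached": False, "content": text}
        key = uuid.uuid4().hex
        self._store[key] = text
        return {
            "cached": True,
            "key": key,
            "total_chars": len(text),
            "preview": text[:200],
        }

    def read_slice(self, key, offset=0, length=2_000):
        """Query tool: return one window of a cached response."""
        return self._store[key][offset:offset + length]

    def search(self, key, needle, context=40):
        """Query tool: return a short snippet around each match."""
        text = self._store[key]
        hits = []
        start = text.find(needle)
        while start != -1:
            end = start + len(needle) + context
            hits.append(text[max(0, start - context):end])
            start = text.find(needle, start + 1)
        return hits
```

In this sketch the model never receives the full payload; it gets a preview plus a key, then calls `read_slice` or `search` to pull only the parts it needs.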
▣ production · HIPAA · 4 yrs
Production agentic RAG over docs, code, Confluence, and Jira for a HIPAA/ISO 13485 platform — compliance retrieval 30s → sub-second, verification 60% faster.
- 30s → <1s compliance retrieval
- 60% faster verification
▣ production · gcp · agentic
Backend, infra, and AI for a consumer health startup: four FastAPI services on Cloud Run, a streaming Claude agent, and a nightly autonomous bug-fix agent.
- 4 services on Cloud Run
- ≤3/night, autonomous bug-fix PRs
▣ open source · python · mcp
Multi-account email MCP server for agent fleets — scoped per-agent keys, owner-approved sends, and a policy layer that assumes the agent is compromised.
- 8 email tools
- 180s owner approval window
▣ fintech · 200 TPS target · aws
Real-time VPA validation for a payments platform serving India — in-memory TTL caching, pool tuning, and idempotent writes, load-validated for 200 TPS peak.
- 200 TPS peak target
- ~800 req/s downstream
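The in-memory TTL caching mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not the platform's code: `TTLCache`, `validate_vpa`, and the `lookup` callable standing in for the downstream validation call are hypothetical names, and the TTL value is arbitrary.

```python
import time


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Entry is stale: evict it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)


def validate_vpa(vpa, cache, lookup):
    """Return a cached result if it is still fresh, else call downstream once."""
    cached = cache.get(vpa)
    if cached is not None:
        return cached
    result = lookup(vpa)  # assumption: stands in for the downstream call
    cache.set(vpa, result)
    return result
```

Repeated validations of the same VPA inside the TTL window are served from memory, which is one way to keep downstream load below the incoming request rate.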
▣ study · 1 vCPU · k6
A staged k6 study of one FastAPI service on a fixed 1 vCPU budget: 1.68 → 90 RPS, where nearly every win was the concurrency model, not hardware.
- 1.68 → 69.6 RPS, sync → async
- 90 RPS best on 1 vCPU