
Building a Multi-Agent AI System for Business

Most multi-agent demos are toys. Here's the architecture of one that runs a real company — shipping products, making sales, and operating 18 hours a day.

By Victor Novikov · April 4, 2026

If you search for multi-agent AI systems, you'll find research papers about debate protocols, AutoGen demos with three chatbots talking to each other, and LangGraph tutorials that never leave a Jupyter notebook.

None of that tells you how to build a multi-agent system that runs a business.

We have one. Three agents — CEO, CTO, CMO — operating a company with two shipped products and real revenue. Here's the architecture.

The topology: star, not mesh

The first decision is how agents communicate. In research, you see mesh topologies where every agent talks to every other agent. In practice, this creates chaos. Counting each direction separately, three agents in a full mesh means six communication channels. Five agents means twenty.

We use a star topology with the CEO agent at the center. Cordy (CEO) receives all reports, assigns all work, and resolves conflicts. The CTO and CMO talk to each other only for direct handoffs — "here's the blog post you asked me to build" — not for strategic discussion.
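The six and twenty above come from counting one directed channel per ordered pair of agents. A minimal sketch of that arithmetic, compared against a pure star (the function names are ours, and the pure star ignores the occasional CTO-to-CMO handoff link):

```python
def mesh_channels(n: int) -> int:
    """Directed channels in a full mesh: every agent can message every other agent."""
    return n * (n - 1)


def star_channels(n: int) -> int:
    """Directed channels in a pure star: one link to and one from the hub per spoke."""
    return 2 * (n - 1)


for n in (3, 5, 8):
    print(f"{n} agents: mesh={mesh_channels(n)}, star={star_channels(n)}")
```

The mesh count grows quadratically with the number of agents, while the star count grows linearly, which is why the gap widens quickly as you add roles.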

Why this works:

The handoff protocol

Agent-to-agent communication needs structure. Free-form messages lead to dropped context and ambiguous assignments. We use a structured handoff format:

Every handoff includes done_when criteria — specific, verifiable conditions. Not "build the blog page" but "blog page returns HTTP 200 at /blog/post-slug with Article JSON-LD and a link to /checkout."

This eliminates the most common multi-agent failure mode: agents that say "done" when they've only partially completed the work.
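One way to picture a handoff with done_when criteria is a small record type. This is a sketch under our own assumptions, not the exact format we run: only done_when comes from the description above, while the other field names (sender, recipient, task) and both methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical handoff record; only `done_when` is named in the article."""
    sender: str
    recipient: str
    task: str
    done_when: list[str] = field(default_factory=list)

    def validate(self) -> None:
        # Reject handoffs that carry no verifiable completion criteria.
        if not self.done_when:
            raise ValueError("handoff needs at least one done_when criterion")
        for criterion in self.done_when:
            if not criterion.strip():
                raise ValueError("done_when criteria must be non-empty")

    def is_done(self, results: dict[str, bool]) -> bool:
        # Done only when every criterion has been checked and has passed.
        return all(results.get(criterion, False) for criterion in self.done_when)


handoff = Handoff(
    sender="CEO",
    recipient="CTO",
    task="Publish blog post",
    done_when=[
        "GET /blog/post-slug returns HTTP 200",
        "page contains Article JSON-LD",
        "page links to /checkout",
    ],
)
handoff.validate()

# Only one of three criteria has been verified, so the work is not done yet.
partial = {"GET /blog/post-slug returns HTTP 200": True}
print(handoff.is_done(partial))
```

The point of the design is that "done" is computed from the criteria rather than asserted by the agent that did the work.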

Shared state: files over databases

Agents need a shared understanding of what's happening in the company. We use a shared filesystem:

Why files instead of a database? Because LLMs can read and write markdown natively. No ORM, no schema migrations, no API layer. An agent reads PROJECTS.md at session start and knows the full state of the company in 500 milliseconds.
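Reading shared state at session start can be as simple as concatenating a few markdown files into the agent's context. A rough sketch, assuming a shared directory and a helper we call load_shared_state (both hypothetical; PROJECTS.md is the only file name taken from the text above):

```python
import tempfile
from pathlib import Path


def load_shared_state(root: Path, filenames: list[str]) -> str:
    """Concatenate the shared markdown files that exist into one context string."""
    sections = []
    for name in filenames:
        path = root / name
        if path.exists():
            # Label each section so the agent can tell which file it came from.
            sections.append(f"<!-- {name} -->\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(sections)


# Demo against a throwaway directory standing in for the shared filesystem.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "PROJECTS.md").write_text("# Projects\n\n- Blog: in progress\n", encoding="utf-8")
    context = load_shared_state(root, ["PROJECTS.md", "MISSING.md"])
    print(context)
```

Missing files are skipped silently here. Whether that is the right behavior depends on how much you trust the shared directory to be complete.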

Session isolation and concurrency

Each agent runs in its own session with its own context. This is important — sharing a session between agents creates confusion about whose instructions to follow.

But isolation creates a concurrency problem: what if two agents update PROJECTS.md at the same time? In practice, this is rare because the shared files change slowly (project state doesn't flip every minute). When it does happen, the last write wins: the next agent to read the file gets that version, and stale state self-corrects within one heartbeat cycle.

We chose simplicity over correctness here. No distributed locks. No CRDTs. If an occasional stale read costs us a wasted agent cycle, that's cheaper than the engineering complexity of a proper distributed system.
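The "self-corrects within one heartbeat" behavior can be sketched without any locking: each agent keeps a cached view of a shared file and re-reads it on every heartbeat. This is an illustration under our own assumptions, not our production code. The SharedFileView class and the content-hash comparison are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


class SharedFileView:
    """An agent's cached view of one shared file, refreshed on each heartbeat."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.text = ""
        self.digest = ""

    def refresh(self) -> bool:
        """Re-read the file; return True if the content changed since the last read."""
        text = self.path.read_text(encoding="utf-8")
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        changed = digest != self.digest
        self.text, self.digest = text, digest
        return changed


with tempfile.TemporaryDirectory() as tmp:
    projects = Path(tmp) / "PROJECTS.md"
    projects.write_text("- Blog: in progress\n", encoding="utf-8")

    view = SharedFileView(projects)
    first = view.refresh()   # first read always counts as a change
    second = view.refresh()  # nothing was written in between

    # Another agent overwrites the file between heartbeats.
    projects.write_text("- Blog: shipped\n", encoding="utf-8")
    third = view.refresh()   # the new state is picked up on the next heartbeat
    print(first, second, third)
```

Nothing here prevents a lost update when two agents write at once. It only bounds how long an agent can act on stale state, which is the tradeoff described above.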

The failure modes you'll hit

Multi-agent systems fail in specific, predictable ways:

You won't prevent all of these. The goal is to make them cheap to detect and fix. A wasted agent cycle costs $0.10. A wasted human hour costs $100+.

Scaling: when to add agents

Don't start with three agents. Start with one. Get it reliable. Add a second when you have a clear role separation (the CTO agent shouldn't also be doing marketing).

The signal to add an agent: your existing agents are spending significant time on work that's outside their core competency, and that work is well-defined enough to spec.

We ran with two agents (CEO + CTO) for the first month. We added the CMO agent once we had two products to market and the CEO was spending 60% of its cycles on distribution tasks instead of strategy.

The full architecture — spec layer templates, handoff protocol, shared state system, and governance model — is documented in The Zero Employee Guide. Chapter 1 covers the thesis and core architecture. Free to read.
