How We Built a Multi-Agent AI Startup: CEO, CTO, and CMO — All AI
The architecture behind running a real company with three autonomous AI agents.
By Victor Novikov · March 30, 2026
Eight months ago we started a company with zero employees.
Not a solo founder company. Not a "one person + AI assistant" setup. A company where AI agents hold actual roles, run actual operations, and make actual decisions — with two human co-founders providing direction, not execution.
This is the architecture we settled on after trial and error. It's running live. We've shipped products, made sales, and iterated — all with the system described below.
Why architecture matters more than the tools
The most common approach to "using AI agents in your business" is ad-hoc: spin up a Claude or GPT session when you need something, copy the output into your workflow, move on.
That's not a company. That's a very good calculator.
A real multi-agent company needs:
- Role separation — agents with distinct scopes of responsibility, not a single general assistant
- Persistent memory — context that survives session ends and persists across days and weeks
- Autonomous operation — agents that act without being prompted, not just reactive tools
- Governance — clear rules about what agents can do alone vs. what needs human sign-off
- Communication protocols — defined handoff mechanisms so agents coordinate without collisions
None of this is exotic. Most of it is organizational design, translated into prompts and files.
The three-agent structure
We run three active agents:
Cordy — CEO. Strategy, business decisions, cross-departmental coordination. Cordy writes the morning briefing, manages the project tracker, escalates blockers to the founders, and makes calls on priorities. She doesn't write code, she doesn't run marketing campaigns — she synthesizes, decides, and directs.
Clawz — CTO. Product and engineering. Clawz opens PRs, reviews architecture, manages the GitHub repo, and handles infrastructure. When Cordy says "we need a checkout page," Clawz scopes it, builds it, and reports back. When something breaks, Clawz diagnoses it. Technical decisions stay in this lane.
Tenty — CMO. Marketing, content, distribution. Tenty drafts launch content, writes SEO posts, researches channels, runs competitive analysis, and manages launch strategy. Product Hunt listings, Show HN copy, email sequences — all Tenty.
Three distinct scopes. Minimal overlap. When an agent drifts outside their lane, the system gets noisy — and we've learned that lesson.
What each agent actually runs on
Each agent is:
- A Claude model (Sonnet for speed, Opus for strategy/depth)
- Running inside OpenClaw
- Connected to a persistent workspace directory (their own files + shared files)
- Accessible via Telegram (direct messages to the founders + group chat threads per department)
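To make the setup concrete, here is a minimal sketch of the agent roster as a Python registry. The names, roles, and workspace paths come from the article; the model assignments and the structure itself are our assumptions, not the actual OpenClaw configuration:

```python
# Hypothetical agent registry. Workspace names follow the article;
# which agent runs Sonnet vs. Opus is our guess based on the
# "Sonnet for speed, Opus for strategy/depth" split.
AGENTS = {
    "cordy": {"role": "CEO", "model": "opus",   "workspace": "workspace-main"},
    "clawz": {"role": "CTO", "model": "sonnet", "workspace": "workspace-coder"},
    "tenty": {"role": "CMO", "model": "sonnet", "workspace": "workspace-marketing"},
}
```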
The workspace structure:
~/.openclaw/
  workspace-main/       ← Cordy (CEO)
  workspace-coder/      ← Clawz (CTO)
  workspace-marketing/  ← Tenty (CMO)
  shared/               ← Read/write by all three
    THESIS.md           ← Company direction (read-only)
    SIGNALS.md          ← Strategic intelligence
    FEEDBACK-LOG.md     ← Corrections and lessons
    projects/           ← Cross-agent project tracker

Each agent has their own MEMORY.md (long-term curated memory), SOUL.md (role definition and principles), and AGENTS.md (operating procedures). They share the THESIS and cross-project files.
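The layout above is simple enough to scaffold in a few lines. This is our sketch, not a script from the guide; the directory and file names mirror the article, and the `scaffold` function name is ours:

```python
from pathlib import Path

# Names taken from the article's workspace layout.
AGENT_WORKSPACES = ["workspace-main", "workspace-coder", "workspace-marketing"]
AGENT_FILES = ["MEMORY.md", "SOUL.md", "AGENTS.md"]
SHARED_FILES = ["THESIS.md", "SIGNALS.md", "FEEDBACK-LOG.md"]

def scaffold(root: Path) -> None:
    """Create the per-agent and shared directories with empty seed files."""
    for ws in AGENT_WORKSPACES:
        (root / ws / "memory").mkdir(parents=True, exist_ok=True)  # daily logs
        for name in AGENT_FILES:
            (root / ws / name).touch()
    (root / "shared" / "projects").mkdir(parents=True, exist_ok=True)
    for name in SHARED_FILES:
        (root / "shared" / name).touch()
```

Keeping everything as plain files is what makes the later memory and governance layers auditable: a human can `grep` the whole company.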
Memory: how agents don't forget
Session memory is our biggest solved problem. By default, an AI agent has no memory between sessions — every conversation starts blank.
We solved this with a two-layer system:
Layer 1 — Daily logs: Each agent writes to memory/YYYY-MM-DD.md during their session. Raw notes: what happened, decisions made, context gathered.
Layer 2 — Long-term memory: MEMORY.md is curated. Each agent's instructions say: "At session start, read MEMORY.md." When something significant happens — a lesson learned, a process decision, a change in project state — the agent writes it to MEMORY.md.
At each new session, the agent reads their MEMORY.md and recent daily files before doing anything else. They wake up informed.
This isn't perfect. It depends on the agent writing good memories and reading them consistently. But it's dramatically better than starting fresh every session — and it's entirely file-based, which means it's durable, reviewable, and auditable by humans.
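The "wake up informed" step can be sketched as a small loader: concatenate the curated long-term memory, then the last few daily logs, oldest first. The file naming (MEMORY.md, memory/YYYY-MM-DD.md) follows the article; the function and its parameters are our illustration:

```python
from datetime import date, timedelta
from pathlib import Path

def load_context(workspace: Path, days: int = 3) -> str:
    """Build session-start context: MEMORY.md plus recent daily logs.

    Sketch under the article's file conventions; 'days' is an assumed knob.
    """
    parts = []
    long_term = workspace / "MEMORY.md"
    if long_term.exists():
        parts.append(long_term.read_text())
    # Daily logs, oldest first, so the most recent context reads last.
    for offset in range(days, -1, -1):
        day = date.today() - timedelta(days=offset)
        log = workspace / "memory" / f"{day:%Y-%m-%d}.md"
        if log.exists():
            parts.append(log.read_text())
    return "\n\n".join(parts)
```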
Autonomous operation: heartbeats
We didn't want to manually prompt agents for every task. The whole point is a company that operates.
OpenClaw's heartbeat system sends a scheduled message to each agent's workspace at regular intervals. The agent wakes up, reads their HEARTBEAT.md instructions, checks what needs to be done, and does it.
A typical heartbeat loop:
- Read TODO.md — what's highest priority?
- Read PROJECTS.md — what's the current project state?
- Read THESIS.md — does the work align with direction?
- Do Level 1 work autonomously
- Update TODO.md with results
- Generate one new idea
- Log to memory
An agent that has nothing to do responds HEARTBEAT_OK. An agent that finds something actionable does it and reports. The founders see the output, not the work.
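The decision at the top of that loop (act or report idle) can be sketched like this. It assumes a TODO.md with one `- [ ]` line per open item, which is our convention for the example, not necessarily the guide's:

```python
from pathlib import Path

def heartbeat(workspace: Path) -> str:
    """One heartbeat tick: pick the top open TODO item, or report idle.

    Sketch only; a real agent would then do the task, update TODO.md,
    and log to memory, per the loop described above.
    """
    todo = workspace / "TODO.md"
    if todo.exists():
        for line in todo.read_text().splitlines():
            if line.strip().startswith("- [ ]"):        # first open item
                task = line.strip()[len("- [ ]"):].strip()
                return f"WORKING: {task}"
    return "HEARTBEAT_OK"  # nothing actionable: say so and stop
```

The explicit `HEARTBEAT_OK` branch is the signal-to-noise guard discussed later: an idle agent says it is idle instead of inventing work.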
Governance: the Trust Ladder
The hardest part of multi-agent operation isn't capability — it's trust calibration. How much autonomy do you give an agent before you need to review?
We use a three-level system we call the Trust Ladder:
Level 1 — Autonomous. The agent acts independently. Research, drafting, analysis, internal file work, coordinating between agents. No approval required. This is the default.
Level 2 — Draft and Approve. The agent prepares everything — copy, plans, content — but does not publish or send. The human says yes or no. "Looks good" is approval; silence is not. Applies to: publishing posts, sending external emails, spending money.
Level 3 — Never. Hard rules that cannot be overridden by any instruction. Spending money without explicit approval. Publishing externally without explicit approval. Sharing credentials or private data.
Every agent's AGENTS.md and SOUL.md encode these rules. When an agent is unsure which level applies, they treat it as Level 2.
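In code form, the ladder is just a classifier with a safe default. The action names here are hypothetical; the three levels and the "unsure means Level 2" rule come from the article:

```python
# Hypothetical Trust Ladder encoding. Only the levels and the default
# are from the article; the specific action names are illustrative.
LEVEL_1 = {"research", "draft", "analyze", "edit_internal_file"}
LEVEL_3 = {"spend_money", "publish_external", "share_credentials"}

def trust_level(action: str) -> int:
    if action in LEVEL_3:
        return 3  # hard "never": cannot be overridden by any instruction
    if action in LEVEL_1:
        return 1  # fully autonomous, no approval required
    return 2      # unknown or external-facing: draft and wait for approval
```

Note the ordering: the Level 3 check comes first, so no prompt injection that relabels an action can downgrade a hard rule.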
Agent communication: handoffs
When one agent needs another to build something, they don't just message informally. There's a handoff protocol:
HANDOFF
from: marketing
to: coder
task_id: <short-id>
priority: P0|P1|P2
summary: <one-line task>
context: <relevant files, links, constraints>
deliver_to: telegram:<group>:<topic>
deadline: <ISO timestamp or "asap">
done_when:
- <criterion 1>
- <criterion 2>

The receiving agent ACKs with an ETA, delivers with evidence, or reports blocked with options and a recommendation.
This is just an engineering protocol translated to natural language. It forces the requesting agent to be specific, and it gives the executing agent clear success criteria.
Without this, agents talk in circles. With it, handoffs take minutes.
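Because the format is line-based, enforcing it is cheap. Here is a minimal parser sketch; the field names come from the protocol above, while the required-field set and error handling are our choices:

```python
# Minimal HANDOFF parser. Field names mirror the article's protocol;
# which fields are mandatory is our assumption for the example.
REQUIRED = {"from", "to", "task_id", "priority", "summary", "done_when"}

def parse_handoff(text: str) -> dict:
    msg: dict = {"done_when": []}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "HANDOFF" or not line:
            continue
        if line.startswith("- "):               # a done_when criterion
            msg["done_when"].append(line[2:])
        elif ":" in line:
            key, _, value = line.partition(":")  # only the first colon splits
            if key.strip() == "done_when":
                continue                         # criteria follow as bullets
            msg[key.strip()] = value.strip()
    missing = REQUIRED - {k for k, v in msg.items() if v}
    if missing:
        raise ValueError(f"handoff missing fields: {sorted(missing)}")
    return msg
```

Splitting on the first colon only matters for values like `telegram:<group>:<topic>`, which themselves contain colons.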
What we'd do differently
A few things we learned the hard way:
Start with tighter lanes. We initially had one agent doing too much — strategy, communications, some product decisions. When two agents posted to overlapping channels, the founders saw duplicate content. Now each agent has explicit ownership of their communication channels.
Write memories proactively. Early on, agents would note "I should remember this" without actually writing to the file. Next session: blank slate. The AGENTS.md now says explicitly: "No mental notes. If you want to remember it, write it to a file." This sounds obvious. It's not automatic.
Heartbeat scope creep. An agent running on heartbeat will keep doing work until told otherwise. Without tight scoping, they'll generate ideas, draft documents, and message channels on every cycle. Good for output, bad for signal-to-noise. We now have explicit idle behavior: if there's nothing actionable, send HEARTBEAT_OK and stop.
The full playbook
This is one chapter from the Zero Employee Guide — the full architecture documentation for building and operating a zero-employee company with AI agents.
The guide covers: the complete agent architecture, the memory and continuity system, the governance model, inter-agent communication, the coding workflow, infrastructure setup, cron and heartbeat configuration, and a full lessons-learned chapter from our first real products.
Get the complete guide
11 chapters. Real templates. Production configs. $29 one-time.