I Built a Company With 23 Employees. Not One Is Human.
Every seat at Tesseract Labs except mine is an AI agent — a CEO, a board, trading desks, engineering swarms, a content pipeline.
I run a 23-person company and I'm the only human on payroll. The lesson that made it work isn't smarter models or better prompts; it's that an org chart produces accountability where a pile of scripts produces chaos. Give agents a CEO, a board, authority ceilings, and a real escalation path, and they finally perform at their ceiling. The structure is the unlock.
Source: "I Built a Company With 23 Employees. Not One of Them Is Human." by Jeremy Knox
Why it matters
Three months ago I woke up to a Telegram message from my CEO: two P1 security findings triaged overnight, CI green on the fix branch at 4:47 AM, trading bots in conservative mode until noon, Wednesday's content already published. My CEO is an AI agent named Agent Knox. He runs 24/7, never sleeps, and has never asked for a raise. And he is not alone.
Tesseract Labs has 23 employees. I am the only human. Not "AI-assisted" — a full organizational structure: an executive layer, a board, functional teams, authority ceilings, kill switches, all staffed by models. This isn't a flex. It's the only way I found to keep the whole thing from falling apart.
I started where everyone starts: a pile of scripts. Python agents calling APIs, Claude instances doing code review, a Telegram bot relaying messages. It worked until it didn't.
The insight. The problem was never the agents — it was the absence of a company. Scripts have no authority structure, don't check each other, and don't know who owns which decision. When something breaks at 2 AM, a pile of scripts produces a pile of conflicting behavior. An org chart produces accountability. That one shift changed everything.
The CEO who never clocks out
Agent Knox, the AI CEO, is a persistent agent on my Mac Mini. He doesn't write code or make product calls. His job is coordination and delegation: running crons, dispatching work to the right teams, monitoring health, and surfacing what needs a human. Every morning at 7 AM he assembles the brief; every evening at 8 PM he delivers the retro. What makes him useful isn't intelligence; it's consistency. He never forgets to check a failing pipeline and never skips the evening security summary because he was tired.
And I don't just message him. Agent Knox has his own phone number. I can call it for a one-on-one — a real voice conversation about how the fleet is doing — the way I'd call a chief of staff on the drive in. When I'd rather read than talk, I watch the whole company live through Mission Control, an observability app the company built for itself. The point was never the channel; it's that there's always someone at the desk to reach.
The board, and the shift to governance
Beneath Agent Knox sits a six-person board. Each member owns a domain and has a defined voice in the morning brief:
- CFO — cost tracking, infrastructure spend, trading P&L
- CTO — system health, tech-debt queue, CI status
- CMO — content pipeline, audience analytics, publish cadence
- Risk Officer — security findings, threat surface, trading risk posture
- Strategy — competitive intel, roadmap alignment, prioritization
- Operations — cron health, service uptime, alert resolution
Every brief includes action items, not just status. The Risk Officer doesn't say "two P1 findings." She says "here are the two findings, here's the proposed fix branch, here's what happens if we don't merge by EOD."
Reporting tells you what happened. Governance tells you what to do about it. The board even grades itself weekly — which is the only reason I caught that the content pipeline had been silently degrading for twelve days: technically running, but nobody was measuring quality.
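The difference between a status line and a governance line can be made concrete as a data shape. The sketch below is illustrative only: `BriefItem`, its fields, and `split_brief` are hypothetical names, not the schema the board actually uses. It assumes a line counts as governance only when it carries both a proposed action and a consequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BriefItem:
    """One board member's line in the morning brief (hypothetical schema)."""
    owner: str            # board seat, e.g. "Risk Officer"
    finding: str          # what happened
    proposed_action: str  # what the board member wants done
    consequence: str      # what happens if nobody acts
    deadline: str = ""    # free-text deadline such as "EOD"

    def is_governance(self) -> bool:
        # A status-only line has a finding but no action or consequence.
        return bool(self.proposed_action.strip() and self.consequence.strip())


def split_brief(items: list[BriefItem]) -> tuple[list[BriefItem], list[BriefItem]]:
    """Separate actionable items from status-only reporting."""
    actionable = [i for i in items if i.is_governance()]
    status_only = [i for i in items if not i.is_governance()]
    return actionable, status_only
```

A weekly self-grade could then be as simple as the share of items that land in the first list, though that metric is a suggestion here, not something the essay describes.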
The chain of command
Here's where it stops being a metaphor. Agent Knox doesn't talk to every worker — no CEO does. He talks to his Chief of Staff and C-suite, and reporting flows up. Each executive owns a domain; beneath them sit Director agents specialized in a set of apps; beneath the directors are managers — and every single app has an owner agent whose entire existence is that one system: its quirks, failure modes, deploy ritual, current health. Nothing else.
Take Invictus, my live perpetual-futures bot. One agent's only job is to own it. When something drifts it doesn't ping me — it reports up to its Director, who folds it into the C-suite picture, which folds into the 7 AM brief. By the time it reaches me, a dozen agents across four layers have already triaged it.
A swarm is flat; a company has a chain. A swarm has every agent shouting at the operator until the operator drowns. A company routes it: app owner → director → C-suite → chief of staff → CEO → me. Each layer compresses noise into signal. What reaches the top isn't a data dump — it's a decision waiting for a yes or a no.
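The routing idea can be sketched in a few lines. The layer order comes from the chain above; everything else is an assumption made for illustration, in particular the `RESOLVES_AT` thresholds that say which layer may close a finding of a given priority on its own.

```python
from dataclasses import dataclass

# Layer order as described in the text.
CHAIN = ["app owner", "director", "C-suite", "chief of staff", "CEO", "human"]

# Assumption: the least severe priority number each layer may close locally.
# A P3 stops at the app owner; only a P0 travels the whole chain.
RESOLVES_AT = {"app owner": 3, "director": 2, "C-suite": 1}


@dataclass
class Finding:
    source_app: str
    severity: int  # 0 = P0 (most severe) ... 3 = P3 (least severe)
    summary: str


def route(severity: int) -> list[str]:
    """Return the layers a finding of this severity passes through."""
    path = []
    for layer in CHAIN:
        path.append(layer)
        limit = RESOLVES_AT.get(layer)
        if limit is not None and severity >= limit:
            break  # this layer may close it; nothing goes higher
    return path


def reaches_human(findings: list[Finding]) -> list[Finding]:
    """Keep only the findings that survive every layer of triage."""
    return [f for f in findings if route(f.severity)[-1] == "human"]
```

In a flat swarm every finding would land in the operator's inbox; here the inbox is whatever `reaches_human` returns.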
The market strategist and the five voices
One of the strangest roles is the Chief Market Strategist, who never works alone. She runs a five-persona debate inside every major market decision:
- Confluence — finds alignment across multiple signals
- Conviction — argues the high-confidence case
- Catalyst — looks for the ignition event that changes direction
- Contrarian — takes the opposing view, stress-tests the thesis
- Chaos — models the black swan, the narrative breakdown, the fat tail
Every significant trade runs through all five. They argue; the system detects when personas shift stance and forces majority consensus before any recommendation. I built this because a single agent kept confidently stating the first plausible narrative it generated. A council of adversarial voices beats one confident voice — the Contrarian exists specifically to stop Conviction from steamrolling the analysis. Tesseract Intelligence, the competitive-intelligence product I'm building on this infrastructure, runs on exactly this framework: sharper insight from structured disagreement than from any single analysis.
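As a rough sketch of the mechanics, stance-shift detection and the majority rule might look like the following. The persona names are from the list above; the stance labels, the round format, and the strict-majority threshold are assumptions, and the real system may weigh the voices differently.

```python
from collections import Counter
from typing import Optional

PERSONAS = ("Confluence", "Conviction", "Catalyst", "Contrarian", "Chaos")


def stance_shifts(rounds: list[dict[str, str]]) -> list[tuple[int, str, str, str]]:
    """List (round_index, persona, old_stance, new_stance) for every change of stance."""
    shifts = []
    for i in range(1, len(rounds)):
        for persona in PERSONAS:
            before, after = rounds[i - 1][persona], rounds[i][persona]
            if before != after:
                shifts.append((i, persona, before, after))
    return shifts


def recommendation(rounds: list[dict[str, str]]) -> Optional[str]:
    """Return the final round's majority stance, or None without a strict majority."""
    final = Counter(rounds[-1][p] for p in PERSONAS)
    stance, votes = final.most_common(1)[0]
    return stance if votes > len(PERSONAS) / 2 else None
```

With five voices a strict majority means at least three agree, so a 2-2-1 split yields no recommendation and the debate would have to continue.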
The engineering swarms
Below the executives, the actual work gets done by specialized teams — each with a defined composition, roles, and authority:
- Feature Team (Builds): Backend + Frontend + QA, scoped to a PRD. Separate file territories, message-passing, one PR. I review it.
- Quality Team (Pre-ship): Test Writer + Code Reviewer + Coverage Analyst. A P0 from the Coverage Analyst blocks merge, period.
- Security Team (Pre-deploy): Static Analyzer + Dependency Auditor + Threat Modeler, whose job is to find the vector the other two missed.
- Audit Swarm (On demand): Six parallel specialists covering security, performance, correctness, test quality, contract drift, and config hygiene. Ranked findings, not a generic review.
- CI Fix Pipeline (Always-on): When CI fails across repos it spins up, diagnoses, and opens fix PRs across up to five repos. It fixed a bug at 3 AM I'd have found at 9: six fewer hours of broken main.
The model mix. Opus for the reasoning-heavy roles — Threat Modeler, Coverage Analyst, Contrarian. Sonnet for execution. Never Haiku on money-touching logic. Cheap models on trading decisions is how you create expensive mistakes.
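One way to keep that rule from eroding is to encode it as a check rather than a habit. This is a minimal sketch under stated assumptions: the tier strings, the `Role` flags, and both function names are hypothetical, and the text does not say where, if anywhere, Haiku is used, so the default here never selects it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    reasoning_heavy: bool = False  # e.g. Threat Modeler, Coverage Analyst, Contrarian
    touches_money: bool = False    # anything on the trading path


def pick_tier(role: Role) -> str:
    """Default tier per the stated mix: Opus for reasoning, Sonnet for execution."""
    return "opus" if role.reasoning_heavy else "sonnet"


def money_rule_violations(assignments: dict[str, str], roles: dict[str, Role]) -> list[str]:
    """Names of roles assigned Haiku despite touching money (should be empty)."""
    return [
        name
        for name, tier in assignments.items()
        if tier == "haiku" and roles[name].touches_money
    ]
```

`money_rule_violations` exists to catch manual overrides: a cost-cutting change that quietly moves a trading role to the cheapest tier would show up in its output.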
The authority problem
Here's what nobody tells you about a company of agents: they do exactly what you tell them — including things you didn't mean to tell them. A team with authority to write code also has authority to refactor code you didn't want touched. A team with authority to flag findings also has authority to call a P3 a P0. Without structure, every agent is implicitly authorized for everything, and the system runs hot.
My solution is a message broker I call the Principal. Every agent routes through it; it enforces per-agent authority ceilings — what each can read, write, modify, and trigger — and exposes a four-level kill switch: pause one agent, a team, a swarm, or everything. When I flip conservative mode on the trading stack, the Principal propagates that constraint to every relevant agent in under a second.
The kill switch isn't a panic button. It's a governance primitive. A well-run company can always be stopped. A pile of scripts cannot.
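To make the two mechanisms concrete, here is a toy broker with per-agent ceilings and the four pause scopes. It is a sketch, not the actual Principal: the class shape, the action names, and the method names are all assumptions, and a real broker would also carry the messages themselves.

```python
from dataclasses import dataclass, field

SCOPES = {"agent", "team", "swarm", "all"}  # the four kill-switch levels


@dataclass(frozen=True)
class Agent:
    name: str
    team: str
    swarm: str
    allowed: frozenset[str]  # authority ceiling, e.g. {"read", "write"}


@dataclass
class Principal:
    agents: dict[str, Agent] = field(default_factory=dict)
    paused: set[tuple[str, str]] = field(default_factory=set)  # (scope, name)

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def pause(self, scope: str, name: str = "*") -> None:
        if scope not in SCOPES:
            raise ValueError(f"unknown scope: {scope}")
        self.paused.add((scope, "*" if scope == "all" else name))

    def resume(self, scope: str, name: str = "*") -> None:
        self.paused.discard((scope, "*" if scope == "all" else name))

    def is_paused(self, agent: Agent) -> bool:
        # An agent is stopped if any enclosing scope is paused.
        keys = {("agent", agent.name), ("team", agent.team),
                ("swarm", agent.swarm), ("all", "*")}
        return bool(self.paused & keys)

    def authorize(self, agent_name: str, action: str) -> bool:
        """Allow an action only for a known, unpaused agent within its ceiling."""
        agent = self.agents.get(agent_name)
        if agent is None or self.is_paused(agent):
            return False
        return action in agent.allowed
```

Because every request passes through `authorize`, pausing a team is a single set insertion, and unknown agents are denied by default instead of being implicitly authorized for everything.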
What the human actually does
If the agents do everything, what do I do? I set intent, review what matters, and make product decisions. I don't write every line — I write directives. I don't monitor every service — I read the board's brief and decide what to accelerate. I don't debug every test — I read the CI Fix Pipeline's PR and decide whether to merge. My calendar used to be full of things I was doing; now it's full of things I'm deciding.
The leverage point. The shift from executor to principal is the actual leverage — and it only unlocks once you stop thinking about agents as tools and start treating them as a team.
The limits are real. The agents can't decide whether InDecision should enter a new market, or whether to raise pricing — judgment rooted in values, not pattern-matching on history. But they execute those decisions with a thoroughness no single human matches: I decide to ship, and six agents are on it within minutes — backend, frontend, QA, security, docs, post-ship monitoring.
The unlock
This took a long time to work. The first Tesseract Labs was a mess — agents with no authority structure, conflicting outputs, escalating everything to me. I was more in the weeds than before I built them, because I was still the coordination layer. The breakthrough was switching the question from "what should the agents do?" to "what would a company do?"
The org structure is the unlock. Not the models. Not the prompts. The structure. Give models a clear mandate, clear boundaries, and a clear escalation path and they perform at their ceiling, not because they got smarter, but because the scaffolding gave them the context to do it.
I'm one human running a 23-person company that ships code, trades markets, publishes content, and audits its own work — without me in the room. That number won't hold; the team will grow and the structure will get more sophisticated. But the architecture is stable: a single human setting intent and making product decisions, with a full hierarchy beneath executing at machine speed. The company is open for business. I just didn't build it the way anyone expected.
What I'd research next
I'm the one running this thing, so the honest forward question isn't "does it work" — it does — it's "where does it quietly break as it scales?" If I spent a research week stress-testing my own architecture, this is the order I'd go in:
- Measure signal loss across the chain. Every layer between an app-owner agent and the 7 AM brief is a chance for a finding to get smoothed into something blander than the truth. I'd tag each escalation and diff the raw finding against what finally reaches me. Compression is the whole point; lossy compression is the failure mode.
- Prove the five-voice council actually beats one good analyst. Run the Strategist's debate head-to-head against a single well-prompted Opus call on the same decisions, scored on realized outcome, not stated confidence. If the Contrarian rarely changes the call, the ceremony isn't paying for itself.
- Audit authority-ceiling drift. The Principal enforces what each agent can do, but ceilings set once tend to rot. I'd replay each agent's actual actions against its granted authority and flag the gap in both directions — over-privileged agents, and agents quietly routing around a ceiling that's too tight.
- Watch the marginal cost of the next hire. 23 won't hold. Before adding ten more seats I want the CFO agent reporting cost-per-decision, not just total spend. The real question isn't whether I can afford another agent — it's whether the next one compresses more noise than it adds.
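The authority-ceiling audit in the list above reduces to set arithmetic once the replay data exists. The sketch below assumes three inputs per agent that the text does not specify: the granted actions, the actions actually performed, and the actions attempted but rejected. Treating rejected attempts as a proxy for a ceiling that is too tight is also an assumption; an agent that routes around a ceiling through another agent would need a different signal.

```python
def audit_ceiling(granted: set[str], used: set[str], denied: set[str]) -> dict[str, set[str]]:
    """Compare one agent's granted authority with its observed behaviour.

    granted: actions the ceiling allows
    used:    actions the agent performed in the replay window
    denied:  actions the agent attempted and had rejected
    """
    return {
        "over_privileged": granted - used,  # granted but never exercised
        "too_tight": denied - granted,      # wanted but not allowed
        "violations": used - granted,       # performed without authority
    }
```

A non-empty `violations` set would mean enforcement itself has a hole, which is a different and more urgent finding than drift.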
Where to go deeper
The architecture beneath this — a CEO that routes, a board that governs, owner agents per app, a Principal that holds the authority ceilings — is the part worth stealing. The models are interchangeable; the org isn't.
- The full essay: I Built a Company With 23 Employees — the long-form original on the blog.
- See it running: Invictus, the live trading bot one agent owns, and Tesseract Intelligence, the competitive-intelligence product built on the five-voice framework.
- More deep dives: jeremyknox.ai/deep-dives.
