I Gave My AI Company a Shared Memory. Then It Got Smart.

The first time I noticed the problem, I brushed it off as a fluke.

An agent working on my trading bot hit a subtle bug — a Python asyncio edge case that causes silent task cancellation when you don't wrap coroutines with asyncio.wait_for. It took two hours to diagnose. We documented the fix, wrote the lesson, moved on.

Three weeks later, a different agent — completely separate workstream, different repo — hit the exact same bug. Same root cause. Same two-hour diagnosis. Same lesson written again from scratch, as if the first one never existed.

That's not a fluke. That's a structural failure.

When I looked closely, I found this pattern everywhere. Agents re-learning deployment gotchas that had already burned us. Re-discovering that a particular API's pagination breaks silently at offset 500. Re-making configuration mistakes that a previous agent had already documented and filed away. Every agent was starting from zero, carrying nothing forward from what the org had already paid to learn.

I had 23 agents and zero shared memory. I had a swarm, not an organization.

The Problem With Stateless Agents

Here's what everyone misses when they talk about AI agents: intelligence without memory isn't intelligence, it's performance.

A stateless agent can execute brilliantly in the moment. It can write clean code, diagnose a tricky bug, architect a solid system. But the moment the context window closes, everything it learned disappears. The next agent starts fresh. The same mistakes are free to happen again.

This is fine when you have one agent doing one task occasionally. It becomes a compounding tax when you're running dozens of agents across dozens of repos, every single day. Every repeated mistake costs real time. Every duplicated investigation burns real tokens. Every lesson that doesn't transfer is organizational knowledge that evaporates.

The fix isn't better prompts. The fix isn't longer context windows. The fix is shared memory — a persistent, queryable knowledge layer that every agent reads from and writes to, so what one agent learns becomes instantly available to every agent that comes after it.

That's what I built. I called it Akashic Records.

The Name Is Not an Accident

I didn't pick that name because it sounded cool. I picked it because someone described this exact system more than a century before I built it — they just thought it was magic.

In Theosophy — the esoteric tradition that ran through Helena Blavatsky, C.W. Leadbeater, Rudolf Steiner, and the American mystic Edgar Cayce — the Akashic records are described as "a compendium of all universal events, thoughts, words, emotions, and intent ever to have occurred." Every action. Every idea. Every lesson. Recorded permanently and completely, encoded on a non-physical plane of existence. The word comes from the Sanskrit akasha — aether, the fundamental substance the cosmos was said to be woven from.

The mystics built a whole vocabulary around it. Blavatsky wrote of "indestructible tablets of the astral light." Alice Bailey called it "an immense photographic film, registering all the desires and earth experiences of our planet." And Edgar Cayce — the man they called the sleeping prophet — claimed he could drop into a trance, read the record directly, and surface knowledge he had never personally learned, about people he had never met.

An indestructible tablet of light suspended in the dark aether — the permanent cosmic record the mystics imagined. ⊕ zoom

Taken literally, it's complete nonsense. There is no astral plane. No clairvoyant is reading a cosmic film. But strip the mysticism away and look at what they were actually reaching for: a single, permanent, complete memory of everything that ever happened — and a way for an attuned mind to query it and retrieve knowledge it never personally acquired.

That's not a séance. That's a spec.

The theosophists were wrong about the mechanism and right about the wish. They wanted an oracle and reached for the supernatural because the technology didn't exist yet. I have the technology. So I built the thing they could only dream about — not a cosmic film on the astral plane, but a vector index on a Mac Mini, where every lesson the organization has ever paid to learn is encoded permanently, and any agent can attune to it with a single query and walk away knowing something it never lived through.

Edgar Cayce needed a trance. My agents need mind_query.

What Akashic Records Actually Is

Akashic Records is a unified memory OS for my entire agent organization. Not a notes file. Not a shared doc. An actual semantic search system with structure, governance, and lifecycle management.

The stack: FastAPI backend, ChromaDB as the vector store, sentence-transformers running locally for embeddings, SQLite for metadata and indexing. The embeddings use all-MiniLM-L6-v2 — a 384-dimensional model that runs entirely on my hardware. No external embedding API. No data leaving the machine. No per-query cost.

When an agent stores a memory, the content gets embedded into vectors and written to ChromaDB. When an agent queries, it sends a natural language question and gets back semantically similar chunks ranked by cosine similarity. The search isn't keyword matching — it's meaning matching. "What broke when we deployed the trading bot last week" finds the relevant lesson even if none of those exact words appear in the stored memory.

At last count: roughly 3,000 memory chunks indexed across more than 40 repos, and over 11,000 stored memories total. Lessons from debugging sessions. Project documentation. Trading knowledge. Daily logs. Architectural decisions with the rationale attached. The full operational history of how this system works and why it works that way.

A vast luminous archive of stacked records with a single beam of light at its center — the queryable library any agent can attune to. ⊕ zoom

The interface to agents is ~10 MCP tools:

mind_query — semantic search, returns full content, used when you know what you're looking for
mind_search — broader discovery pass, returns compact summaries, cheap first-look before going deep
mind_remember — write a new memory with namespace, type, and tags
mind_recall — fetch a single memory by ID
mind_forget — soft-delete memories that have gone stale
mind_context — assemble a structured context bundle for session startup
mind_status — index stats and health

The progressive disclosure pattern matters: mind_search first for broad discovery (cheap), then mind_recall or mind_query for full content on the specific memories worth diving into. Token-efficient retrieval, not brute-force dumps.

Semantic Memory Layer — Architecture
shared-memory OS · multi-agent swarm · self-updating knowledge base
~3,000chunks indexed
40+repos covered
11,000+memories stored
AGENTS
23 agents · query-first at session start
Claude Code · Agent Gateway · Feature Teams · Swarms
↓
MCP INTERFACE
~10 tools · progressive disclosure: cheap search → deep fetch
mind_querymind_searchmind_remembermind_recallmind_forgetmind_contextmind_status+3 more
↓
Semantic Memory API — FastAPI
embed on write · semantic search on read
:8080Docker414 tests · 92% cov
↓
ChromaDB
Vector Store
cosine similarity
ANN retrieval
Embeddings
sentence-transformers
all-MiniLM-L6-v2
384-dim
LOCAL — no ext. API
SQLite
Metadata Store
namespaces · types
lifecycle · tags
DOCUMENTATION SYNC ENGINE
feedback loop
PR merge
↓
diff extraction
↓
synthesis
↓
mind_remember
↺ auto-update
knowledge base
updates itself
from shipped code
Local, structured, self-updating. A swarm has parallelism; a mind has memory.

The Architecture Decision That Made It Work

The part I'm most proud of isn't the search — ChromaDB handles that well. It's the structure that prevents Akashic from becoming a junk drawer.

Memories have types. They have namespaces. They have lifecycle.

A lesson learned from a bug in the trading bot lives in the trading namespace, typed as lesson. A project architecture decision lives in the relevant repo's namespace, typed as architecture. Daily operational logs live in global, typed as log. This isn't just organization — it's governance. It determines which memories surface for which queries, which agents have write access to which namespaces, and when memories should be expired or superseded.

Without this structure, shared memory degrades fast. Every agent dumps whatever it wants, the index fills with noise, retrieval quality collapses, and agents stop trusting the system. I've seen this happen with teams using plain vector DBs with no schema. Six months in, nobody queries the index because the results are garbage.

Structure is what makes memory valuable over time, not just at insertion.

The difference between a knowledge system and a junk drawer is governance. Without explicit types, namespaces, and lifecycle rules, every shared memory system eventually becomes the latter.

Thoth: The Agent That Keeps It Current

A memory system that requires manual updates is a memory system that decays. The moment adding a memory becomes a task — something to remember to do, something to prioritize — it gets skipped under deadline pressure, and the whole thing starts going stale.

So I built an agent called Thoth to solve the update problem automatically.

Thoth watches for PR merges across the org. When a PR lands, Thoth reads the diff, synthesizes the changes into structured knowledge — what changed, why it changed, what the relevant implications are — and writes that knowledge directly into Akashic. The knowledge base updates itself from code changes, without requiring anyone to file a lesson manually.

This closes the loop that most agent memory systems leave open. You can have great retrieval infrastructure, but if insertion is a human-driven process, you'll always be behind. Engineers forget to document. Agents don't naturally write post-mortems unless prompted. Thoth removes the dependency on anyone remembering to maintain the system — the act of shipping code IS the act of updating memory.

The result is a knowledge base that gets richer every time we ship, not one that requires dedicated maintenance cycles to stay relevant.

What Changed When Agents Got Memory

I want to be specific here because "it got smarter" is too vague to be useful.

The asyncio bug? Never repeated. The lesson is in Akashic, tagged, indexed, retrievable. The first thing a new agent does when starting a Python task in this codebase is mind_query for relevant lessons and architectural context. The asyncio pattern comes back in the results. The agent reads it, flags it in its working context, and the bug never happens again.

Deployment gotchas compound forward instead of resetting. The Mac Mini requires an explicit PATH export in any non-interactive SSH session or Docker is not found. That lesson exists once in Akashic. Every deployment-touching agent queries it at session start. We've never hit that failure again after documenting it the first time.

Cross-repo knowledge transfers. The Tesseract Intelligence competitive intelligence system and the trading bots share zero code, but they share the same Akashic index. When the trading team learned something about Phemex's order unit semantics — that a quantity parameter means base coin, not contracts — that lesson is queryable by any agent in any repo. An agent building a new exchange integration queries "what did we learn about exchange order sizing" and gets the relevant warnings before writing a single line of code that touches an order.

Session startup is transformed. Before Akashic, every agent started cold — reading CLAUDE.md, maybe skimming recent git history, reconstructing context from scratch. Now, the session-start protocol is mind_context first: pull the architecture summary, the relevant lessons for this domain, the recent operational events. Agents arrive informed. The quality of the first decision they make is materially better than it was when they started from zero.

This is the difference between a new employee reading the wiki versus inheriting the institutional memory of every engineer who worked on the system before them.

The Query-First Habit

The leverage point isn't the storage — it's the retrieval discipline.

A memory system only pays off if agents actually use it. The failure mode is building perfect infrastructure and having agents ignore it because querying feels optional. Query-first has to be a hard habit, not a suggestion.

My current session-start protocol for any non-trivial agent task:

mind_query "{project}" — pull architecture and context for this codebase
mind_search "lessons {domain}" — broad sweep for relevant lessons
Read CLAUDE.md for the specific repo
Then and only then, start executing

This adds maybe 90 seconds to a session start. In return, the agent arrives with the institutional knowledge of every previous agent that touched this system. The compounding return on that 90 seconds is enormous — but only if it's non-negotiable. The moment it becomes "I'll query if I think I need it," agents start skipping it under time pressure and you're back to stateless execution.

Make it a gate, not a suggestion. The query happens before the first bash command, not when the agent hits a problem it could have avoided.

Memory Is What Turns a Swarm Into a Mind

I think about this through the lens of Rewired Minds — the way mental models compound when they're actually integrated rather than siloed. An organization that can't transfer knowledge between its members isn't an organization. It's a collection of individuals who happen to share a payroll.

The same principle applies to AI agents.

Twenty-three agents with separate context windows are twenty-three separate problem-solvers. They can run in parallel, execute quickly, cover a lot of ground simultaneously. But without shared memory, they don't compound. Every lesson costs the same to learn regardless of how many times the org has already learned it. Every mistake is fresh. Every investigation starts from zero.

Add shared, structured memory and the dynamic changes completely. Lessons become permanent org property the moment they're encoded. An insight learned on day one is still accessible — and contributing to better decisions — on day three hundred. The org gets smarter as it operates, not just as it's configured.

This is the property that separates a swarm from a mind: a swarm has parallelism, a mind has memory. Both matter. But parallelism without memory is just a faster way to repeat yourself.

The Technical Decisions That Would Change If I Started Over

For anyone building something similar, the choices that mattered most in hindsight:

Local embeddings over API embeddings. Running all-MiniLM-L6-v2 locally means zero per-query cost, zero data leaving the machine, and zero latency spike when the embedding API is rate-limited. The quality is genuinely sufficient for this use case — 384 dimensions captures enough semantic resolution to distinguish "asyncio cancellation bug" from "docker networking issue." For a private knowledge base, the tradeoffs are almost entirely in favor of local.

ChromaDB for the vector store. It's simple, it runs embedded, it has solid Python bindings, and it doesn't require standing up a separate service. At thousands of chunks, query latency is under 100ms. The cosine similarity retrieval quality is solid. I'd make the same choice again.

Namespace governance from day one. I thought about adding namespaces later and it would have been a nightmare migration. Structure at insertion is much cheaper than restructuring a live index. Define your types and namespaces before you write the first memory.

Progressive disclosure retrieval. The pattern of mind_search (cheap broad pass) followed by mind_recall (full content fetch on specific IDs) is meaningfully more token-efficient than always pulling full content on every query. When you're running dozens of agent sessions per day, that efficiency adds up.

What's Next

The immediate roadmap: confidence decay on time-sensitive memories. Right now, a memory about an API's pagination behavior from six months ago is weighted the same as one from last week. For stable architectural knowledge, this is fine. For operational state and API behavior, it's not — things change. The next version of Akashic will score memories by recency and explicit freshness signals, surfacing a confidence indicator alongside each result so agents know when to verify rather than trust.

The longer arc: memory as a first-class org asset. Every significant architectural decision, every hard-won lesson, every investigation that cost real hours should be encoded in a form that's retrievable, structured, and legible to any agent that comes after. Not because it's good hygiene — though it is — but because the compounding return on institutional memory is one of the few genuine moats available when the raw capability of any individual agent is commoditizing fast.

The agents that win won't be the ones with the biggest context windows. They'll be the ones with the deepest memory.

The asyncio bug that cost two hours the first time? It costs zero hours now. That's the unit economics of shared memory: pay once, deploy the lesson forever. Do that across thousands of lessons, across dozens of repos, across hundreds of agent sessions, and you have something that looks less like a collection of tools and more like an organization that actually learns.

That's what I was building toward. That's what Akashic Records is.

The mystics were chasing the right instinct with the wrong instruments. They imagined a perfect, permanent memory of everything, readable on demand by anyone who knew how to ask — and then reached for the astral plane to get it. A century later, the record is real. It just runs on a Mac Mini, and you read it with a query instead of a trance.

A swarm has parallelism. A mind has memory. Now I have both.