Why We Built Agent Memory Markdown-First

TL;DR

We built AI agent long-term memory markdown-first: each agent writes durable facts to plain markdown files, and a semantic index sits on top for fast recall. The reason is durability — markdown is human-readable, git-diffable, and degrades gracefully (if the index breaks, the files still work). A vector database can't say the same, and it's the wrong fit when the agent's runtime is already Claude Code.

When you add long-term memory to an agent, the tempting first move is to reach for a vector database or a full memory framework. We went the other way: plain markdown files first, with a semantic index layered on top. This post is the argument for that ordering — why the source of truth should be files you can read, and the index should be a removable accelerator.

What “markdown-first” means

The design is simple. Each agent has a memory.md. The rule it follows:

Read it at the start of a task — recall what you already know before starting.
Append durable facts as you learn them — a decision, a convention, a gotcha — in small, atomic entries.

## build
- The web bundle is built with electron-vite, NOT plain vite. Use `npm run build`.
- node-pty needs an electron-rebuild after install (the postinstall handles it).
- Never hand-edit docs/index.html during a release — it's the REL fallback source.

A semantic index is built from these files so the agent can recall by meaning instead of re-reading everything (semantic memory for AI agents covers that layer). But the markdown is primary. The index is derived, and disposable.

Three reasons the ordering matters

1. Memory you can read is memory you can trust

Open an agent’s memory.md and you see exactly what it “knows” — in your language, not as a blob of floating-point vectors. That’s not a nicety; it’s how you debug an agent that’s acting on a stale or wrong belief. With markdown, you find the bad note and fix it. With opaque embeddings, you’re guessing at what the agent remembers and re-indexing to change it. The most important property of a memory system is that a human can audit it, and plain text wins that outright.

2. Memory that diffs is memory you can review

Because the notes are files, they live in git. Every change to what an agent knows is a diff you can read in a commit — memory becomes reviewable like any other artifact. You can watch an agent’s understanding evolve, catch the moment it learned something wrong, and roll it back. A database of embeddings has no meaningful diff; “the vectors changed” tells you nothing. Text under version control turns memory from a black box into a reviewable record, which pairs naturally with the append-only event log for full auditability.

3. Memory that degrades gracefully never strands you

This is the durability clincher. If the semantic index breaks, isn’t installed, or you move to a machine without it, the markdown files still work — the agent reads them directly. The fancy recall is an upgrade, not a dependency. Contrast a database that won’t open: it takes the memory with it. With markdown-first, the floor never falls out from under the agent; the worst case is slower recall, not lost knowledge.

Why not a vector database or a memory framework?

Vector DBs and dedicated memory frameworks are real tools with real uses. They’re the wrong default here for two reasons.

They store what you can’t read. Their native unit is the embedding — great for similarity search, useless for human inspection or git review. You end up bolting a “show me the source text” layer back on, which is just markdown wearing a costume.

They want to own the runtime. Many agent-memory frameworks assume they are the thing running the agent loop — they expect to be in charge. But when your agents are real claude sessions, the runtime is already Claude Code. The memory layer should be a lightweight companion beside it, not a framework that wants to replace it. Markdown files plus a CLI-driven index fit that shape exactly: the agent writes notes as a normal part of its work, and a separate process mines those notes into a searchable store without intruding on the loop.

That mining is the bridge: the index reads each agent’s markdown and makes it recall-able by the whole team, on a schedule, only re-processing files that changed. You get vector-quality recall and human-quality source text — because the vectors are derived from the text, not instead of it.

The single-writer detail

One subtlety that markdown-first gets right almost for free: one writer per memory file. Each agent owns its own memory.md and only it appends there. A shared index aggregates everyone’s notes read-only, so the whole team can recall across agents without any file being written by two processes. It’s the same single-writer discipline that keeps multi-agent git safe — applied to memory. Co-edited memory files would tear; per-agent files plus a read-only shared index don’t.

When markdown-first is the right call

This approach shines when:

you want to audit and trust what your agents remember,
you want memory under version control so you can review and revert it,
your agents’ runtime is something you don’t control (a CLI like Claude Code), and
you value graceful degradation over maximal indexing sophistication.

It’s less compelling if you’re building a system at a scale where you genuinely need a dedicated vector store’s throughput and you’re willing to give up readability for it. For a hive of a handful to a few dozen agents, markdown-first is the durable, debuggable default. For the plain-English version of why agents forget and what shared memory changes, see how to give Claude Code long-term memory.

FAQ

Doesn’t markdown get unwieldy at thousands of notes? The files stay fine; the challenge is recall, which is exactly what the semantic index on top solves — you recall the few relevant notes by meaning rather than reading them all. The markdown scales as storage; the index scales as retrieval.

Can I edit an agent’s memory by hand? Yes — it’s a text file. Fix a wrong note, remove a stale one, and the next mining pass picks up the change. Try that with a vector store.

Munder Difflin gives every agent a plain memory.md plus a shared semantic palace mined from it — readable, diffable, and graceful when the index is gone. Download Munder Difflin to give your agents memory you can actually read; it’s free and open source.

FAQ

What is markdown-first agent memory?

It's an approach where each agent's durable knowledge lives in plain markdown files it reads on startup and appends to as it learns. A semantic index is built on top of those files for fast recall — but the markdown, not the index, is the source of truth.

Why not just use a vector database for agent memory?

A vector DB stores opaque embeddings you can't read or diff, and many memory frameworks want to own the agent's runtime. Markdown-first keeps memory human-readable and git-diffable, treats the index as a removable accelerator, and fits agents whose runtime is already Claude Code.

What happens if the semantic index breaks?

Nothing catastrophic — the agent falls back to reading its markdown directly. Because the files are the source of truth and the index sits on top, a broken or missing index degrades to plain markdown memory instead of losing everything.

What “markdown-first” means #

Three reasons the ordering matters #

1. Memory you can read is memory you can trust #

2. Memory that diffs is memory you can review #

3. Memory that degrades gracefully never strands you #

Why not a vector database or a memory framework? #

The single-writer detail #

When markdown-first is the right call #

FAQ #

FAQ