Reverie: A Neuroscience-Grounded Memory Consolidation Framework for LLM Coding Harnesses

Status: Skeleton (TOD-402). Benchmark numbers land in TOD-411.

1. Abstract

Reverie is a memory framework for LLM coding harnesses that treats long-term knowledge as a tiered cache hierarchy rather than a flat vector store. It combines a placement taxonomy of six knowledge types and five persistence layers (where each piece of knowledge belongs and why) with an offline dream cycle (scan, classify, place, consolidate, prune, sync) modeled on biological sleep consolidation. This paper documents the framework, the empirical audit that motivated it (105 observations, 62% tombstone rate), the SOTA survey that situates it against EverMemOS, CORE, Letta, A-MEM, Zep, and Mem0, and the evaluation plan against LoCoMo and LongMemEval. Placeholder: the final 150-word elevator pitch will replace this text, covering the placement taxonomy, the dream cycle, and the LoCoMo results that TOD-411 will report.

2. Motivation

Modern LLM coding harnesses (Claude Code, Cursor, Windsurf) accumulate knowledge across sessions but lack a theory of where that knowledge belongs. The default pattern, dumping everything into one vector store, degrades over time through duplication, misplacement, noise, and staleness. A full audit of the author’s own engram-era memory stack (105 observations, 7 auto-memory files, 140 Obsidian notes, CLAUDE.md, three rules files) found a 62% ID tombstone rate from over-aggressive write-then-delete churn, nine cross-layer duplicates of a single rule (“Rust by default”), 14 observations with the wrong project tag, and behavioral directives stranded in search-only layers where they were never loaded. The binding constraint is not storage capacity but instruction-layer capacity (~200 lines of CLAUDE.md before adherence drops). Placement is therefore a zero-sum game at the top of the hierarchy: every line spent on one directive is a line unavailable to another.

3. Placement taxonomy

Reverie classifies every piece of knowledge along two axes: a type (one of six knowledge categories) and a persistence layer (one of five tiers analogous to a CPU memory hierarchy). The decision tree from obs #272 routes each new observation to exactly one home, eliminating dual-write duplication. The six knowledge types are: (1) session-loaded directives, (2) user feedback corrections, (3) user preferences/identity, (4) deep reference knowledge, (5) curated principle collections, and (6) project decisions/architecture/bugs. The five layers (registers, L1, RAM, disk, cold store) map onto CLAUDE.md, auto-memory, the reveried SQLite store, the Obsidian vault, and code+git, respectively. A seventh, implicit rule (“derivable from code → don’t store”) is unique to coding-harness memory and absent from every competing system surveyed.
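The two axes and the derivability rule can be sketched as a pair of enums and one routing function. This is a minimal illustration, not the decision tree from obs #272: the layer-to-backing-store mapping follows the text above, but the type-to-layer table inside the hypothetical `place` function is an assumption made only so the example is complete, and all identifiers are invented for this sketch.

```rust
/// The six knowledge types listed in the taxonomy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KnowledgeType {
    SessionDirective,
    FeedbackCorrection,
    PreferenceIdentity,
    DeepReference,
    PrincipleCollection,
    ProjectDecision,
}

/// The five persistence layers, fastest first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    Registers,
    L1,
    Ram,
    Disk,
    ColdStore,
}

impl Layer {
    /// Backing store for each layer, as stated in the text.
    pub fn backing(self) -> &'static str {
        match self {
            Layer::Registers => "CLAUDE.md",
            Layer::L1 => "auto-memory",
            Layer::Ram => "reveried SQLite store",
            Layer::Disk => "Obsidian vault",
            Layer::ColdStore => "code+git",
        }
    }
}

/// Route one observation to exactly one home, or to none at all.
/// ASSUMPTION: the type-to-layer table below is illustrative only; the
/// actual decision tree (obs #272) is not reproduced in this document.
pub fn place(kind: KnowledgeType, derivable_from_code: bool) -> Option<Layer> {
    // Derivability rule: knowledge recoverable from code is not stored.
    if derivable_from_code {
        return None;
    }
    Some(match kind {
        KnowledgeType::SessionDirective => Layer::Registers,
        KnowledgeType::FeedbackCorrection | KnowledgeType::PreferenceIdentity => Layer::L1,
        KnowledgeType::ProjectDecision => Layer::Ram,
        KnowledgeType::DeepReference | KnowledgeType::PrincipleCollection => Layer::Disk,
    })
}
```

Returning `Option<Layer>` rather than a bare `Layer` makes "do not store" a first-class outcome, which is the point of the derivability rule.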

4. Architecture

Reverie ships as reveried, a single-process daemon that wires four crates (reverie-store, reverie-gate, reverie-dream, reverie-sync) behind one MCP and HTTP surface. Every write flows through the write-gate pipeline, which classifies the knowledge type, checks for cross-layer duplicates, enforces the derivability rule, and either accepts the write into the staging tier or rejects it with a placement suggestion. A separate dream runner wakes on a four-tier schedule (session-end, nightly, weekly, monthly) and walks the consolidation pipeline. The architecture deliberately keeps the fast path (gate → staging) and the slow path (dream → consolidated store) on separate code paths, because direct writes to the consolidated tier would cause the catastrophic interference that complementary learning systems (CLS) theory predicts.
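The gate's accept-or-reject contract might look like the following sketch. It is not the reverie-gate crate's API: `WriteGate`, `Candidate`, and `Verdict` are hypothetical names, the classification step is omitted, and duplicate detection is reduced to an exact match on whitespace- and case-normalized text as a stand-in for whatever matching the real gate performs.

```rust
use std::collections::HashSet;

/// Outcome of a gate check: accept into staging, or reject with a hint.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    AcceptToStaging,
    Reject { suggestion: String },
}

/// Hypothetical write candidate; field names are illustrative.
pub struct Candidate<'a> {
    pub text: &'a str,
    pub derivable_from_code: bool,
}

/// Hypothetical gate that remembers normalized texts it has accepted.
#[derive(Default)]
pub struct WriteGate {
    known: HashSet<String>,
}

impl WriteGate {
    /// Collapse whitespace and case so trivially different copies collide.
    fn normalize(text: &str) -> String {
        text.split_whitespace().collect::<Vec<_>>().join(" ").to_lowercase()
    }

    /// Duplicate check first, then the derivability rule, then accept.
    pub fn check(&mut self, c: &Candidate<'_>) -> Verdict {
        let key = Self::normalize(c.text);
        if self.known.contains(&key) {
            return Verdict::Reject {
                suggestion: "duplicate: update the existing copy instead".to_string(),
            };
        }
        if c.derivable_from_code {
            return Verdict::Reject {
                suggestion: "derivable from code: leave it in code+git".to_string(),
            };
        }
        self.known.insert(key);
        Verdict::AcceptToStaging
    }
}
```

Note that the sketch only ever accepts into staging; nothing here writes to a consolidated tier, mirroring the fast-path/slow-path separation described above.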

5. The dream cycle

The dream cycle is a six-phase offline pipeline that runs while the harness is idle: scan (priority queue ordered by recency × access × importance × novelty, not FIFO), classify (assign knowledge type via the placement taxonomy), place (route to the correct layer), consolidate (gist extraction, schema interleaving, reconsolidation on access), prune (global proportional decay in the style of the synaptic homeostasis hypothesis (SHY), archive then delete), and sync (push canonical copies into Obsidian / auto-memory / CLAUDE.md without re-introducing duplicates). Each phase maps onto a specific neuroscience mechanism documented in obs #280: sharp-wave ripple (SWR) replay drives the priority queue, systems consolidation drives the staging-to-consolidated promotion, schema theory governs the classify/place decisions, SHY governs the prune phase, and reconsolidation makes every read a write opportunity.
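Two of the six phases are concrete enough to sketch: the scan ordering and the proportional-decay prune. The code below is an illustration under stated assumptions rather than the reverie-dream implementation: the `Item` struct, the `strength` field, and the `factor` and `floor` parameters are all hypothetical, and the four priority factors are assumed to be pre-computed scores in [0, 1].

```rust
/// Hypothetical staged item; the four factors are assumed to lie in [0, 1].
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub id: u32,
    pub recency: f64,
    pub access: f64,
    pub importance: f64,
    pub novelty: f64,
    /// Assumed retention strength that the prune phase decays.
    pub strength: f64,
}

impl Item {
    /// Scan priority as the product named in the text.
    pub fn priority(&self) -> f64 {
        self.recency * self.access * self.importance * self.novelty
    }
}

/// Scan phase: highest priority first rather than insertion (FIFO) order.
pub fn scan_order(mut items: Vec<Item>) -> Vec<Item> {
    items.sort_by(|a, b| b.priority().total_cmp(&a.priority()));
    items
}

/// Prune phase: scale every strength by the same factor, then split the
/// items into (kept, archived) around a floor.
/// ASSUMPTION: `factor` and `floor` are illustrative knobs, not real config.
pub fn prune(items: Vec<Item>, factor: f64, floor: f64) -> (Vec<Item>, Vec<Item>) {
    items
        .into_iter()
        .map(|mut it| {
            it.strength *= factor;
            it
        })
        .partition(|it| it.strength >= floor)
}
```

Because the decay is global and proportional, relative ordering between items is preserved; only items that were already weak fall below the floor and move to the archive, from which a later pass could delete them.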

6. Evaluation

Reverie is evaluated on three axes: (1) LoCoMo F1 (Maharana et al., arXiv:2402.17753), which comprises 50 conversations averaging 305 turns and 7,512 questions across single-hop / multi-hop / temporal / commonsense / adversarial categories, with the observation-RAG top-5 baseline at 41.4% and the current SOTA leaderboard ranging from Mem0 (66.9%) to EverMemOS (92.3%); (2) LongMemEval (ICLR 2025, 500 questions, up to 1.5M tokens) for long-horizon stress; and (3) write-churn reduction measured against the 62% tombstone-rate baseline from the engram-era audit. Ablation graphs will isolate the contribution of the gate, the dream cycle, hybrid search, and entity resolution. Final numbers land with TOD-411.
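For readers unfamiliar with answer-level F1, the sketch below computes a SQuAD-style token-overlap F1 between a predicted and a reference answer. It assumes this is the flavor of F1 meant above; the official LoCoMo scorer may normalize text differently (for example by stemming or dropping articles), and the `tokens` and `token_f1` names are invented for this example.

```rust
use std::collections::HashMap;

/// Lowercase and split on non-alphanumeric characters.
/// ASSUMPTION: a simplified normalizer, not the benchmark's own.
fn tokens(s: &str) -> Vec<String> {
    s.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(str::to_string)
        .collect()
}

/// Token-level F1 between a predicted and a reference answer.
pub fn token_f1(prediction: &str, reference: &str) -> f64 {
    let pred = tokens(prediction);
    let gold = tokens(reference);
    if pred.is_empty() || gold.is_empty() {
        // Both empty counts as a match; exactly one empty counts as a miss.
        return if pred.is_empty() && gold.is_empty() { 1.0 } else { 0.0 };
    }
    // Multiset overlap: each reference token can be matched at most once.
    let mut remaining: HashMap<&str, usize> = HashMap::new();
    for t in &gold {
        *remaining.entry(t.as_str()).or_insert(0) += 1;
    }
    let mut overlap = 0usize;
    for t in &pred {
        if let Some(n) = remaining.get_mut(t.as_str()) {
            if *n > 0 {
                *n -= 1;
                overlap += 1;
            }
        }
    }
    if overlap == 0 {
        return 0.0;
    }
    let precision = overlap as f64 / pred.len() as f64;
    let recall = overlap as f64 / gold.len() as f64;
    2.0 * precision * recall / (precision + recall)
}
```

As a worked case, a prediction of "red car" against a reference of "red car fast" has precision 2/2 and recall 2/3, giving F1 = 0.8.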

7. SOTA survey

The 2025-2026 LLM-memory landscape splits into four families: (1) brain-inspired consolidation (EverMemOS 92.3%, Hindsight 89.6%, A-MEM with Zettelkasten reconsolidation), (2) temporal knowledge graphs (CORE 88% with temporal PageRank, Zep/Graphiti 94.8% DMR with a 4-timestamp validity model, Remembra with entity resolution + temporal decay), (3) OS-style virtual context (Letta/MemGPT ~83%, modeled on virtual memory paging), and (4) production CRUD pipelines (Mem0 66.9% with explicit ADD/UPDATE/DELETE/NOOP, LangMem 58.1% with procedural self-modification). Reverie sits closest to family (1) but borrows the validity-interval idea from Zep and the procedural-memory idea from LangMem. Its distinguishing contributions are the derivability rule and the placement taxonomy itself: no surveyed system asks “should this be stored at all?” before writing.

8. Anti-patterns

This section catalogs the failure modes a write-gate prevents, drawn from the engram-era audit (obs #270) and the placement-framework anti-patterns list (obs #272). The dominant pattern is dual-write intent without dedup: a directive instructs the agent to “save to engram AND Obsidian,” and without a gate that recognizes the cross-layer relationship, every save creates a parallel pair of copies that drift apart over time. The audit found that this pattern produced five duplicate Obsidian note pairs from a single sync pass, three copies of the same engineering-principles document, and a user profile split across three observations in three different project scopes. Each anti-pattern is paired with the gate rule that catches it.
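An audit pass for the dual-write pattern can be sketched as grouping stored records by normalized content and reporting any group that spans more than one layer. The `Record` type, the layer labels, and the exact-match `content_key` normalizer are assumptions for illustration; a real gate would likely need fuzzier matching to catch copies that have already drifted.

```rust
use std::collections::BTreeMap;

/// Hypothetical stored record: which layer holds it and what it says.
pub struct Record<'a> {
    pub layer: &'a str,
    pub text: &'a str,
}

/// Case- and whitespace-insensitive key.
/// ASSUMPTION: an exact-match stand-in for real similarity matching.
fn content_key(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ").to_lowercase()
}

/// Returns (key, layers) for every text that appears in more than one layer.
/// A BTreeMap keeps the report in a stable, sorted order.
pub fn cross_layer_duplicates(records: &[Record<'_>]) -> Vec<(String, Vec<String>)> {
    let mut by_key: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for r in records {
        let layers = by_key.entry(content_key(r.text)).or_default();
        // Record each layer once, in first-seen order.
        if !layers.iter().any(|l| l.as_str() == r.layer) {
            layers.push(r.layer.to_string());
        }
    }
    by_key
        .into_iter()
        .filter(|(_, layers)| layers.len() > 1)
        .collect()
}
```

Run against a store where “Rust by default” was saved to both engram and Obsidian, the function would report that one key with both layers, which is the signal a gate rule needs in order to pick a single canonical home.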

9. Open-source release notes

Reverie is released under [TBD license] as a drop-in replacement for engram, with byte-identical wire compatibility on the MCP and HTTP surfaces. Existing engram users can swap binaries without touching their database: reveried reads the same ~/.engram/engram.db and exposes the same MCP tool names. New users get a one-line install, a reveried init command that scaffolds a config file, and a reveried dream --dry-run command that previews what the consolidation pass would do without writing. Once the v0.1 release ships, this section will document install, configuration, the MCP/HTTP API surface, and the migration path from engram.