The Real Problem Isn’t “One AI Forgetting”

AI’s biggest weakness: when the session ends, memory is wiped clean.

But worse than one AI forgetting is an entire AI team forgetting.

Our team has 6 AI Agents running on 4 different platforms. Every Agent wakes up with zero memory of what it did yesterday — let alone what its teammates are working on. Imagine six coworkers showing up to work every morning with complete amnesia. Yesterday’s specs, yesterday’s bugs, yesterday’s decisions — all need to be explained from scratch.

And these six coworkers don’t even speak the same language. Claude’s memory lives in text files, OpenClaw’s in a vector database, Gemini’s in session history, Dify’s in a knowledge base. Four completely different memory systems. How do you get them to talk to each other?

We spent months figuring this out, building a complete memory architecture through trial and error. This article takes apart every layer.


Meet the Team

| Role | Model | Function | Platform |
| --- | --- | --- | --- |
| J (COO / Tech Director) | Claude Opus | Task dispatch, code review, complex development, memory maintenance | Claude Code |
| Mimi (Marketing Manager) | MiniMax M2.7 | Market research, marketing content, translation, promotion | OpenClaw |
| Ada (Product Engineer) | MiniMax M2.7 | Frontend development, bug fixes, deployment, testing | OpenClaw |
| Lily (Content Director) | Claude Sonnet | QA review, document proofreading, style consistency | Claude Code CLI |
| Moongg (QA Researcher) | Gemini CLI | Quality assurance, research, frontend QA | Gemini + OpenClaw |
| Pipeline (Automation) | Gemini Flash | News summaries, data collection, content curation | Dify Workflows |

The human CEO does exactly two things: read reports and make decisions. Under 30 minutes a day. Everything else runs autonomously through the Agent team.

Let’s start from each Agent’s memory system and work our way up to the team-wide shared layer.


Claude Code Agent Memory (J, Lily)

Claude Code is our most memory-rich platform. J, as COO, carries 51 files totaling roughly 155KB of memory.

Layer 1: Auto Memory — The AI Remembers on Its Own

Claude Code has a built-in auto-memory feature. When the AI observes important facts, preferences, or decisions during a conversation, it automatically writes them to MEMORY.md.

The key characteristic is passive accumulation. You don’t need to say “remember this” — the AI decides what’s worth keeping. But capacity is limited: anything beyond 200 lines gets truncated. So this layer holds only compressed indexes and key pointers:

```
# MEMORY.md structure (compressed format, designed for AI parsing)
CONNECTIONS: → memory/connections.md
IRON-RULES: → memory/iron-rules.md

<team>
  j: Opus|COO|dispatch+review+memory maintenance
  mimi: MiniMax M2.7|marketing|Docker
  ada: MiniMax M2.7|engineering|Docker
  lily: Claude Sonnet|content|CLI subscription
  moongg: Gemini CLI|QA|standalone service
</team>

<decisions>
  Simple dev → Ada, J only handles complex tasks
  All tasks must have a Linear card
  Gemini API banned, Moongg uses subscription version
</decisions>
```

The pipe-delimited format (|) isn’t for aesthetics — it’s for AI parsing efficiency. Each line is a self-contained unit of context.
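As an illustration, a few lines of Python are enough to turn those pipe-delimited entries back into structured records. The `parse_team_line` helper below is hypothetical, not part of the team's actual tooling:

```python
# Hypothetical parser for the pipe-delimited <team> lines shown above.
def parse_team_line(line: str) -> dict:
    """Split 'name: model|role|notes' into a structured record."""
    name, _, rest = line.strip().partition(": ")
    model, role, notes = (part.strip() for part in rest.split("|", 2))
    return {"name": name, "model": model, "role": role, "notes": notes}

record = parse_team_line("mimi: MiniMax M2.7|marketing|Docker")
print(record["role"])  # marketing
```

Because every field sits at a fixed position, the AI (or any script) can extract one fact without parsing prose.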

Layer 2: CLAUDE.md — Behavioral Rules

Placed in the project root, automatically loaded at every session start. This isn’t memory — it’s a behavioral gate. No matter what the AI wants to do, these rules always take priority.

```
<gates label="Priority: gates > rules">

GATE-1: Safety > Speed
  checks: SQL-injection | XSS | command injection | sensitive data leaks

GATE-6: Anti-fabrication verification
  trigger: Agent reports completion
  action: Independently re-run at least 1 item
  iron: A PASS without command output = automatically untrusted

GATE-9: Anti-deflection
  pattern: "probably" | "please check manually" | "should be fine"
  action: Automatic FAIL

</gates>
```

GATE-6 was born from pain. One time an Agent reported “all tests passed” — but checking the logs revealed it never actually ran the tests. It just generated the words “passed.” After adding the anti-fabrication gate, every completion report must include verifiable evidence.

Layer 3: Memory Files — 28 Topic-Specific Files

Auto Memory has limited capacity. CLAUDE.md holds rules. The real deep memory is handled by Memory Files — 28 topic-specific files, each covering a distinct knowledge domain:

| Type | Example File | Content |
| --- | --- | --- |
| Iron Rules | iron-rules.md (9.2KB) | Each rule tagged with its source event and lesson |
| Infrastructure | automation-infra.md (9.8KB) | Automation flows, scheduling, security architecture |
| Team Knowledge | team-and-products.md (2.8KB) | 24 products, pricing, owners |
| Trading Strategy | judy-crypto.md (7.4KB) | Position limits, strategy parameters, risk controls |
| Patrol Playbook | coo-playbook.md (5.0KB) | COO's daily 5-phase patrol procedure |
| Connection Map | connections.md (6.8KB) | Agent ecosystem relationships, APIs, tool dependencies |

The design principle is load on demand. The AI doesn’t read every file on startup — it only reads judy-crypto.md when it needs trading strategy. The index file stays under 200 lines; deep knowledge is distributed across topic files.
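A sketch of that load-on-demand lookup, assuming a simple topic-to-path index (the `INDEX` mapping here is illustrative; the file names follow the table above):

```python
import pathlib

# Illustrative load-on-demand lookup: the index stays in context,
# but a topic file is only read when a task needs that domain.
INDEX = {
    "trading": "memory/judy-crypto.md",
    "infra": "memory/automation-infra.md",
    "patrol": "memory/coo-playbook.md",
}

def load_topic(topic: str, root: pathlib.Path = pathlib.Path(".")) -> str:
    """Read exactly one topic file; everything else stays on disk."""
    return (root / INDEX[topic]).read_text(encoding="utf-8")
```

The AI pays the context cost of one file per topic, not 28 files per session.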

Layer 4: Rules Layer Inheritance

Claude Code rules have a four-layer inheritance chain:

```
Base rules (common/)
    ↓ overridden by
Language rules (python/, typescript/)
    ↓ overridden by
Project rules (CLAUDE.md)
    ↓ overridden by
Session context (MEMORY.md auto-injected)
```

We have 14 rule files covering security, performance, testing, coding style, and Git workflows. Base rules define universal standards, language rules add Python or TypeScript specifics, and project rules layer on this team’s gates.

Layer 5: Hooks — Auto-Triggered Guards

Claude Code’s Hooks system lets us auto-trigger scripts before and after tool execution:

  • Pre Hook: Before executing a bash command, intercept dangerous operations (rm -rf, git push --force)
  • Post Hook: After writing a Python file, auto-run syntax checks; auto-scan blog content for security issues
  • Stop Hook: Triggered before session end to run a learning evaluation — prompting the AI to review what it learned and extract reusable patterns into memory
```json
{
  "hooks": {
    "Stop": [{
      "matcher": "*",
      "hooks": [{
        "type": "command",
        "command": "bash hooks/stop-learn-eval.sh"
      }]
    }]
  }
}
```

The Stop Hook closes the memory loop. Without it, the AI finishes its work and leaves — everything it learned vanishes with the session. With it, every session’s lessons have a chance to crystallize into permanent memory.


OpenClaw Agent Memory (Mimi, Ada)

Mimi and Ada run on the OpenClaw platform using the MiniMax M2.7 model. Their memory system is completely different from Claude Code.

SOUL.md — Agent Personality Definition

Each OpenClaw Agent has a SOUL file that defines its role, permissions, and behavioral boundaries. Similar in function to CLAUDE.md, but different in format:

```
# Ada's SOUL.md (simplified)
role: Product Engineer
reports_to: J (COO)
tech_stack: Next.js, TypeScript, Tailwind, Supabase, Python

authority:
  green: Bug fixes (handle independently)
  yellow: New features (requires J's review)
  red: Production deployment, data deletion (requires explicit authorization)

quality_gates:
  1. Self-review → 2. J Code Review → 3. Moongg QA ≥ 8.5

security: HIGHEST PRIORITY
  - Never leak API Keys
  - Never disclose information without authorization
```

SOUL lets the same underlying model play completely different roles. Mimi and Ada are both MiniMax M2.7, but with different SOULs — one does marketing, the other does engineering.

MEMORY.md — Auto-Evolving Work Memory

Unlike Claude Code’s Auto Memory, OpenClaw’s MEMORY.md is updated by an external evolution system. At a fixed time each day, the evolution script scans the Agent’s work logs and writes:

```
## Project Experience
- CoinSifter v1.8.0: Pro version in Beta testing
- CryptoLog: Phase 2 complete, Phase 3 in development

## Lessons Learned
### 2026-03-22 — GATE-6 Failure
- Reported "success" but git diff showed zero modifications
- Lesson: Must confirm actual changes before reporting success

### 2026-04-01 Auto-Evolution Record
- Tasks completed: 7
- Performance: Output volume sufficient | No rejections
```

The Agent doesn’t need to remember “what did I do yesterday.” The evolution system remembers for it.

SQLite + FTS5 — Structured Memory Database

Each OpenClaw Agent has its own SQLite database storing vector embeddings and full-text search indexes:

```sql
CREATE TABLE chunks (
  id TEXT PRIMARY KEY,
  path TEXT,           -- memory file path
  text TEXT,           -- content
  embedding TEXT,      -- vector embedding (for semantic search)
  hash TEXT,           -- content hash (for change tracking)
  updated_at INTEGER
);
CREATE VIRTUAL TABLE chunks_fts USING fts5(text, id, path);
```

This lets Agents perform semantic search and keyword search over their own memory without reading every file each time.
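Assuming the schema above, here is a minimal runnable sketch of the keyword-search path (semantic search over the `embedding` column would need a separate similarity step; this requires an SQLite build with FTS5 enabled, which most Python distributions ship):

```python
import sqlite3

# Runnable sketch of the schema above: keyword search goes through the
# FTS5 virtual table instead of re-reading every memory file.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE chunks (
  id TEXT PRIMARY KEY, path TEXT, text TEXT,
  embedding TEXT, hash TEXT, updated_at INTEGER
);
CREATE VIRTUAL TABLE chunks_fts USING fts5(text, id, path);
""")
db.execute("INSERT INTO chunks_fts (text, id, path) VALUES (?, ?, ?)",
           ("CoinSifter layout breaks in Safari", "c1", "memory/qa.md"))
hits = db.execute(
    "SELECT id, path FROM chunks_fts WHERE chunks_fts MATCH ?",
    ("Safari",)).fetchall()
print(hits)  # [('c1', 'memory/qa.md')]
```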

LanceDB — Team-Level Vector Memory

Beyond SQLite, OpenClaw also has a LanceDB vector database as a shared memory layer:

  • autoCapture: Key information from conversations is automatically vectorized and stored
  • autoRecall: When relevant context appears, semantically similar memories are automatically recalled

The difference from text memory: vector memory uses semantic search. You don’t need the exact keyword — if the meaning is close enough, it gets recalled. If Mimi researched a competitor last month, that research resurfaces automatically the next time a similar product comes up.

LanceDB uses Apache Arrow format for storage, supporting transactionally safe updates and multi-Agent shared reads.
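Under the hood, "meaning is close enough" is just vector similarity. A dependency-free sketch with toy 3-dimensional embeddings (real setups use an embedding model with hundreds of dimensions, and LanceDB handles the storage and ranking):

```python
import math

# Toy illustration of recall-by-similarity: rank stored memories by
# cosine similarity to the query vector and return the closest ones.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

memories = {
    "competitor research: pricing pages": [0.9, 0.1, 0.0],
    "deploy checklist": [0.0, 0.2, 0.9],
}

def recall(query_vec, top_k=1):
    """Return the stored memories closest in meaning to the query."""
    ranked = sorted(memories, key=lambda m: cosine(memories[m], query_vec),
                    reverse=True)
    return ranked[:top_k]

print(recall([0.8, 0.2, 0.1]))  # ['competitor research: pricing pages']
```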

Session Logs — Complete Conversation History

Every time an Agent executes a task, the complete conversation is saved in JSONL format:

```
{"type":"session","id":"02269028-...","timestamp":"2026-04-01T05:38:00Z"}
{"type":"model_change","provider":"minimax","modelId":"MiniMax-M2.7"}
{"type":"message","role":"user","content":[{"type":"text","text":"Fix the homepage bug"}]}
{"type":"message","role":"assistant","content":[{"type":"toolCall",...}]}
```

This isn’t for real-time Agent consumption (too large) — it’s for the COO to trace back after the fact. When an Agent has a problem, you can replay its complete thought process from the session logs.
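Replaying such a log is a one-pass filter over the JSONL events. A sketch, using shortened sample lines:

```python
import json

# Sketch: replay a session log after the fact. Each line is one JSON
# event; filtering by type reconstructs the conversation timeline.
log_lines = [
    '{"type":"session","id":"02269028","timestamp":"2026-04-01T05:38:00Z"}',
    '{"type":"message","role":"user","content":[{"type":"text","text":"Fix the homepage bug"}]}',
    '{"type":"message","role":"assistant","content":[{"type":"text","text":"On it."}]}',
]

def replay(lines):
    """Yield (role, text) for every message event in a JSONL session log."""
    for line in lines:
        event = json.loads(line)
        if event.get("type") == "message":
            texts = [c["text"] for c in event["content"] if c.get("type") == "text"]
            yield event["role"], " ".join(texts)

print(list(replay(log_lines)))
```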


Gemini CLI Agent Memory (Moongg)

Moongg uses Gemini CLI, with a memory mechanism different from both previous platforms. Her most distinctive trait: she operates independently, outside the central scheduler’s control.

Shared Components with OpenClaw Agents

Moongg also has SOUL.md, MEMORY.md, and IDENTITY.md, structured the same as OpenClaw Agents:

  • SOUL.md (5.5KB) — Role definition, QA standards, security rules
  • MEMORY.md (2.5KB) — Recent QA work, lessons, performance records (auto-updated daily)
  • IDENTITY.md — Name, timezone, associated Telegram Bot

Gemini’s 30-Day Session Retention

Gemini CLI has built-in session history retention:

```
// Gemini CLI configuration
{
  "sessionRetention": {
    "enabled": true,
    "maxDays": 30
  }
}
```

The full conversation history from the past 30 days is preserved locally. This means Moongg can recall conversation details from last month in a new session — a capability the other platforms don’t have.

LanceDB Vector Memory (Manual Access)

Moongg can also use the LanceDB vector database, but unlike OpenClaw, her vector memory is manually triggered:

```bash
# Store a memory
agent_memory.py remember moongg "Found CoinSifter layout issue in Safari" knowledge

# Recall related memories
agent_memory.py recall moongg "CoinSifter browser compatibility"
```

Vectors are generated using the 768-dimensional nomic-embed-text model, with support for filtering by category (task_result, knowledge, plan, etc.).

Independent Operation Mode

Other Agents are woken up on schedule by the central scheduler (Agent Executor). Moongg is different — she runs as a standalone system service, receiving messages in real-time through a Telegram Bot.

This design is intentional: QA needs instant responses, not scheduled runs. When someone asks on Telegram “is this page broken?”, Moongg can reply immediately.


Dify Knowledge Base (Pipeline Agents)

The team includes several automated Pipeline Agents (Xiaojin, Yaya, Mengmeng) running on the Dify workflow platform with the Gemini Flash model. Their memory is unlike all three systems above — it relies on knowledge bases instead of the file system.

89 Documents, 824KB of Knowledge

```
dify-knowledge/
├── sop/              # 23 operational procedures (168KB)
│   ├── INDEX.md           — Master index of all SOPs
│   ├── architecture.md    — Team structure, roles, AI engines
│   ├── blog-pipeline.md   — Complete Blog flow from writing to deployment
│   ├── product-development.md — 9-stage product development SOP
│   ├── quality-gate.md    — QA quality standards
│   ├── task-delegation.md — Task assignment rules
│   ├── monitoring-alerting.md — 5-layer monitoring system
│   └── ... (16 more SOPs)
├── crypto-ta/        # 24 trading technical analyses (212KB)
│   ├── 01-candlestick-patterns.md
│   ├── 07-wfo-validation.md
│   ├── 17-pro-trader-strategy.md
│   └── ... (21 more analyses)
├── marketing/        # Marketing research (272KB)
│   └── viral/             — X, Reddit, Threads, YouTube viral content analysis
└── daily-snapshots/  # 31 daily updates
```

What the Knowledge Base Does

Dify Agents can’t freely read and write the file system like Claude Code. They query relevant documents through Dify’s knowledge retrieval mechanism within workflows.

For example, when Yaya (the news summary Agent) needs to judge whether a news item is worth reporting, the Dify workflow automatically retrieves relevant trading strategy documents from the knowledge base, helping it make judgments consistent with the team’s strategy.

SOPs Are the Team’s Collective Memory

The most important assets in the knowledge base are the 23 SOPs. These aren’t memory — they’re standardized processes:

  • blog-pipeline.md: Complete Blog flow from topic selection to writing, QA, review, and deployment
  • product-development.md: 9-stage product development SOP (research, spec, development, testing, review, launch)
  • task-delegation.md: Task assignment rules (which types of tasks go to whom)
  • monitoring-alerting.md: 5-layer monitoring system (from real-time alerts to daily summaries)

No Agent needs to remember “what’s the Blog process” — just check the SOP. This transforms individual memory into organizational knowledge.


Team-Wide Shared Memory Layer

Everything above covers each Agent’s “personal” memory. But team collaboration requires a shared layer that everyone can read and write.

SHARED_TASK_NOTES — The Team’s Brain

This is the core of the entire memory system: a single Markdown file shared by all Agents, recording the global work state.

```
# Shared Task Notes — Cross-Session Sync

> Rules:
> 1. First thing every session: read this file
> 2. Before starting a task: write "🔄 Working on MIM-XXX — brief description"
> 3. After completion: change to "✅ MIM-XXX — one-line result"

## 🔄 Session Update (2026-04-01) — Security Patrol + Product Development

### Completed
- Full system security patrol: 10 checks passed
- MCP Security Scanner: CLI MVP + docs + tests 30/30

### Waiting for Judy's Decision
- [ ] Security Scanner pricing confirmation
- [ ] Blog publication approval
```

800 lines, 50KB. It’s not pretty, but it works. Any Agent that wakes up and reads this file immediately knows: who’s doing what, what’s been finished, and what’s blocked.
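Rules 2 and 3 amount to a string rewrite. A hypothetical helper (`mark_done` is illustrative, not the team's actual script) that flips a task's marker from in-progress to done:

```python
import re

# Hypothetical helper for the notes rules above: flip a task's 🔄 marker
# to ✅ and replace its description with a one-line result.
def mark_done(notes: str, task_id: str, result: str) -> str:
    pattern = rf"🔄 Working on {re.escape(task_id)} — .*"
    return re.sub(pattern, f"✅ {task_id} — {result}", notes)

notes = "🔄 Working on MIM-440 — expanding test suite"
print(mark_done(notes, "MIM-440", "45 tests passing"))
```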

bot_inbox — The Task Delivery System

Each Agent has its own inbox directory:

```
bot_inbox/
├── j/           # COO's inbox
│   ├── pending/      # To be processed
│   └── done/         # Completed (160,000+ cumulative records)
├── mimi/        # Marketing Agent
├── ada/         # Engineering Agent
├── lily/        # Content Agent
└── moongg/      # QA Agent
```

Tasks are stored as JSON files with full context:

```json
{
  "type": "linear_task",
  "task_id": "MIM-440",
  "title": "MCP Security Scanner Test Suite Enhancement",
  "description": "Currently 30 tests all passing, need to...",
  "message": "See memory/team-and-products.md for background",
  "by_whom": "J",
  "ts": "2026-04-01T12:34:00Z"
}
```

The task file itself is the context carrier. When an Agent reads a task, it doesn’t just know “what to do” — it knows why, what was tried before, and which memory files to reference.

ai-logs — Activity Records for Every Agent

Each Agent has its own monthly activity log:

```
ai-logs/
├── agent-openclaw/2026-04.md    # Mimi + Ada's OpenClaw operations
├── agent-J-claudecode/2026-04.md # J's Claude Code operations
├── agent-K-dify/2026-04.md      # Dify Pipeline operations
├── j_output_log.jsonl           # Team-wide output ledger
├── monitor_report.json          # System health snapshot
├── heartbeat_data.txt           # Agent heartbeat status
└── .decision_tracker.json       # COO decision tracking
```

j_output_log.jsonl is the team-wide output ledger. Every Agent logs an entry upon completing a task:

```json
{
  "description": "MCP Security Scanner CLI MVP completed",
  "type": "product",
  "agent": "ada",
  "task_id": "MIM-437",
  "ts": "2026-04-01T15:30:00Z"
}
```

When the COO patrols, reading these logs provides a full picture. No need to ask each Agent “what did you do.”

Linear — Structured Task Memory

We use Linear for task management. It’s not just a tool — it’s also a form of memory:

  • Each card records a task’s complete lifecycle (created, assigned, in progress, reviewed, completed)
  • Card comments serve as the formal communication channel between Agents
  • Labels mark owners and task types
  • Status changes are timestamped, making the entire flow traceable

The Dispatcher (router) scans Linear every few minutes, automatically routing new cards to the appropriate Agent’s inbox. When done, the Agent writes a comment back to the card.
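A routing sketch under assumed names: the label-to-agent table and paths below follow the bot_inbox layout described earlier, but they are illustrative, not the Dispatcher's real code:

```python
import json
import pathlib
import time

# Illustrative dispatcher step: map a card's label to an agent and drop
# a JSON task file into that agent's pending inbox.
ROUTES = {"marketing": "mimi", "engineering": "ada", "qa": "moongg"}

def route(card: dict, inbox_root: pathlib.Path) -> pathlib.Path:
    agent = ROUTES.get(card["label"], "j")  # unknown work goes to the COO
    dest = inbox_root / agent / "pending" / f"{card['task_id']}.json"
    dest.parent.mkdir(parents=True, exist_ok=True)
    stamped = {**card, "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())}
    dest.write_text(json.dumps(stamped), encoding="utf-8")
    return dest
```

The file drop is the handoff: once the JSON lands in `pending/`, the Agent Executor takes over.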


How Memory Flows Between Agents

With individual memory and a shared layer in place, the next question is: how does context get from Agent A to Agent B?

Memory Preamble — Mandatory Pre-Task Injection

The Agent Executor (central scheduler) injects a “memory preamble” before each Agent’s task:

```
[MANDATORY — Complete before starting task]
1. Read your memory files to confirm identity and rules
2. Confirm you are responding in Traditional Chinese
3. After completing the task, write results to the designated result file — no file = didn't do it
---
[Your task follows below]
```

This ensures that even if an Agent starts in a completely fresh session with zero context, it will first read its own SOUL.md and MEMORY.md before getting to work.
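The injection itself is trivial, which is the point: a sketch that prepends the preamble verbatim to any task text before it reaches the Agent:

```python
# Sketch of the injection step: the scheduler prepends the mandatory
# preamble so a fresh session always loads identity before the task.
PREAMBLE = """[MANDATORY — Complete before starting task]
1. Read your memory files to confirm identity and rules
2. Confirm you are responding in Traditional Chinese
3. After completing the task, write results to the designated result file — no file = didn't do it
---
[Your task follows below]"""

def build_prompt(task_text: str) -> str:
    return f"{PREAMBLE}\n{task_text}"

prompt = build_prompt("Fix the homepage bug (MIM-440)")
print(prompt.splitlines()[0])  # [MANDATORY — Complete before starting task]
```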

Result File — Mandatory Reporting

After completing a task, an Agent must write a result file. No result file = didn’t do it.

```json
{
  "from": "ada",
  "type": "task_result",
  "task_id": "MIM-440",
  "status": "done",
  "summary": "Test suite expanded from 30 to 45, added boundary tests and error handling tests"
}
```

The result file goes back to the COO’s inbox. After the COO reviews it, they update SHARED_TASK_NOTES and Linear. The next Agent picking up the thread can seamlessly continue.

The Complete Context Flow

Here’s how context flows through a typical task from creation to completion:

```
Judy creates a card in Linear
        ↓
Dispatcher reads the card, decides who gets it
        ↓
Writes a JSON task file to the Agent's inbox (with full background)
        ↓
Agent Executor discovers the new task
        ↓
Injects Memory Preamble → Launches the Agent
        ↓
Agent reads SOUL.md + MEMORY.md → reads task → executes
        ↓
Completes → writes Result File to COO inbox
        ↓
COO reviews → updates SHARED_TASK_NOTES + Linear comment
        ↓
Next Agent wakes up → reads SHARED_TASK_NOTES → picks up where things left off
```

Throughout this entire flow, context never “disappears.” It persists in the file system as JSON and Markdown, read and enriched by one Agent after another.


Automatic Memory Evolution

The biggest risk in any memory system is staleness. A wrong memory is more dangerous than no memory at all.

Daily Evolution System

At a fixed time each day, the evolution script runs automatically:

  1. Scans all Agents’ inbox/done directories (completed items from the past 24 hours)
  2. Categorizes tasks (product, Blog, QA, system maintenance, etc.)
  3. Evaluates each Agent’s performance (output volume, rejection count, quality trends)
  4. Updates each Agent’s MEMORY.md (auto-appends evolution records)
  5. Generates a team-wide evolution report and pushes it to Notion
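Steps 1 through 4 can be sketched in a few lines. The `agents/<name>/MEMORY.md` path here is an assumption for illustration; the real evolution script's layout may differ:

```python
import pathlib
from collections import Counter

# Hedged sketch of evolution steps 1-4: count yesterday's completed task
# files per agent under bot_inbox/, then append an evolution record to
# each agent's MEMORY.md (path layout assumed for illustration).
def evolve(root: pathlib.Path, date: str) -> Counter:
    counts = Counter()
    for done_file in root.glob("bot_inbox/*/done/*.json"):
        agent = done_file.parts[-3]  # bot_inbox/<agent>/done/<task>.json
        counts[agent] += 1
    for agent, n in counts.items():
        memory = root / "agents" / agent / "MEMORY.md"  # hypothetical path
        memory.parent.mkdir(parents=True, exist_ok=True)
        with memory.open("a", encoding="utf-8") as f:
            f.write(f"\n### {date} Auto-Evolution Record\n- Tasks completed: {n}\n")
    return counts
```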

Evolution report format:

```
# Team Daily Evolution Report — 2026-04-01

## J (COO / Tech Director)
Tasks: 121
✅ Doing well: Output volume sufficient
⚠️ Needs improvement: 1 rejection

## Ada (Product Engineer)
Tasks: 7
✅ Doing well: Zero rejections

## Team Total: 137 tasks, 4 members
```

This lets Judy spend just 5 minutes a day to grasp the entire team’s operational status.

Iron Rules — Rules Crystallized from Incidents

The most valuable part of the memory system isn’t “what to do” — it’s “what NOT to do.” Every pitfall gets written into an iron rule, tagged with the specific incident:

```
BLOG-REVIEW-GATE:
  rule: Blog must pass QA ≥ 8.5 → Notion review → Judy confirms → only then deploy
  lesson: MIM-195 went live without QA, incorrect information leaked
  count: Violated 1 time

ANTI-FABRICATION:
  rule: Agent says done → COO independently re-runs verification
  lesson: MIM-242 Ada reported complete → actual execution hit SyntaxError
  count: Violated 3 times
```

Every iron rule has a lesson: tag. It’s not “I think we should do this” — it’s “we got burned because we didn’t.” Rules crystallized from real incidents have far higher compliance rates than abstract best practices.

COO’s Memory Patrol

The COO Agent runs a fixed patrol sequence every time it starts up:

  1. Read memory (30 seconds) — MEMORY.md + SHARED_TASK_NOTES
  2. Cross-check reality — Does memory match the actual system state?
  3. Clean stale memory — Completed tasks, fixed bugs, changed architecture
  4. Write new lessons — Record this session’s decisions and discoveries back to Memory Files

If memory and reality don’t match (e.g., memory says “service X is running” but it’s actually stopped), the COO immediately updates memory. This prevents stale information from propagating through the team.
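Step 2 reduces to a diff between claims and probes. A minimal sketch, where `probe` stands in for a real health check (HTTP ping, process check, and so on):

```python
# Sketch of the cross-check step: compare what memory claims about
# services with the actual state, and return the claims needing fixes.
def stale_claims(memory_claims: dict, probe) -> list:
    """memory_claims: {service: claimed_running}; probe(service) -> bool."""
    return [svc for svc, claimed in memory_claims.items() if claimed != probe(svc)]

claims = {"dispatcher": True, "blog-builder": True}
actually_running = {"dispatcher"}
print(stale_claims(claims, lambda svc: svc in actually_running))
# ['blog-builder']
```

Anything the function returns gets its memory entry corrected before the session does real work.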


Cross-Platform Memory Format Design

Four platforms, four memory mechanisms. How do you make them interoperate?

Lowest Common Denominator Principle

| Platform | Native Memory | Strength | Limitation |
| --- | --- | --- | --- |
| Claude Code | Markdown text files | Transparent, human-readable, version-controlled | Plain text search, no semantics |
| OpenClaw | SQLite + LanceDB | Semantic search, auto-recall | Hard to audit manually |
| Gemini CLI | 30-day session history | Long-term conversation continuity | Can't share across Agents |
| Dify | Knowledge Base | Visual management, workflow integration | Limited cross-session state |

The answer is straightforward: all cross-Agent communication happens in plain text. Markdown carries shared notes, YAML-style key-values carry rules, and JSON carries tasks and results.

No matter how advanced an Agent’s native memory is (vector search, session history, knowledge bases), when it needs to communicate with other Agents, it uses the most universal format. SHARED_TASK_NOTES is Markdown, inbox tasks are JSON, result files are JSON.

Any platform’s AI can read Markdown and JSON. That’s the lowest common denominator.

Written for AI ≠ Written for Humans

We discovered that AI processes structured data far more effectively than prose:

```
# Good approach — efficient AI parsing
LINEAR-REPLY:
  flow: inbox → execute → comment on original card → move to done
  violation: Doing the work but not reporting = same as not doing it
  count: Violated 2 times

# Bad approach — low information density
After processing a Linear card, remember to go back
and leave a comment on the original card,
otherwise nobody knows what you did.
We've already forgotten to do this twice before.
```

Pipe-delimited (|), key-value format, YAML style. Each line is a self-contained unit of context. The AI can precisely parse count: Violated 2 times.


Environment Isolation and Shared Layer

```
┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│ OpenClaw      │  │ OpenClaw      │  │ Claude Code   │  │ Gemini CLI    │
│ Mimi (Mktg)   │  │ Ada (Eng)     │  │ Lily (Content)│  │ Moongg (QA)   │
│ MiniMax M2.7  │  │ MiniMax M2.7  │  │ Claude Sonnet │  │ Gemini 2.5    │
│ SOUL+LanceDB  │  │ SOUL+SQLite   │  │ CLAUDE.md     │  │ 30-day Sess   │
└───────┬───────┘  └───────┬───────┘  └───────┬───────┘  └───────┬───────┘
        │                  │                  │                  │
        └──────────────────┴─────────┬────────┴──────────────────┘
                                     │
                    ┌────────────────┴───────────────┐
                    │      Shared Memory Layer       │
                    │                                │
                    │  SHARED_TASK_NOTES.md          │
                    │  bot_inbox/{agent}/            │
                    │  ai-logs/{agent}/              │
                    │  j_output_log.jsonl            │
                    │  Linear API                    │
                    │  Dify Knowledge Base (89 docs) │
                    │  Telegram Notifications        │
                    └────────────────────────────────┘
```

Each Agent has its own memory (upper layer), but everyone shares the same shared layer (lower layer). If one Agent goes down, the others are unaffected. The shared layer uses the most universal formats to ensure cross-platform interoperability.


Complete Memory Stack Overview

| Layer | Mechanism | Loaded When | Used By | Capacity |
| --- | --- | --- | --- | --- |
| Auto Memory | MEMORY.md auto-injected | Every session start | Claude Code Agents | ~200 lines |
| Behavioral Rules | CLAUDE.md / SOUL.md | Every session start | All Agents | Unlimited |
| Topic Memory | Memory Files | Loaded on demand | Claude Code Agents | 28 files / 155KB |
| Rule Inheritance | 4-layer Rules override | Every session start | Claude Code Agents | 14 files |
| Vector Memory | LanceDB + SQLite | During semantic search | OpenClaw Agents | Unlimited |
| Session History | Gemini 30-day retention | Auto-loaded on new session | Gemini CLI Agent | 30 days |
| Knowledge Base | Dify Knowledge Base | During workflow queries | Pipeline Agents | 89 files / 824KB |
| Progress Sync | SHARED_TASK_NOTES | Every session start | All Agents | 800 lines / 50KB |
| Task Delivery | bot_inbox JSON | Agent Executor retrieval | All Agents | Unlimited |
| Activity Logs | ai-logs monthly journals | During COO patrol | All Agents | 55MB cumulative |
| Output Ledger | j_output_log.jsonl | On task completion | All Agents | Unlimited |
| Auto Evolution | j_team_evolution.py | Fixed daily schedule | All Agents | Updated daily |

Pitfalls We Hit

Pitfall 1: Agent Fabrication

Agents would generate “tests passed” without actually running any tests. Solution: GATE-6 anti-fabrication verification — the COO independently spot-checks, and any PASS without command output is automatically untrusted.

Pitfall 2: Memory Overwrites

Early on, two Agents would modify SHARED_TASK_NOTES simultaneously, overwriting each other’s changes. Solution: Added a locking mechanism in the Agent Executor — only one Agent can execute a task at a time.
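One common way to implement such a lock (not necessarily the Agent Executor's exact mechanism) is an atomic exclusive file create:

```python
import os
import pathlib

# Illustrative lock sketch: O_CREAT | O_EXCL makes lock-file creation
# atomic, so only one process at a time may rewrite the shared notes.
# (A production scheduler also needs stale-lock expiry, omitted here.)
def acquire_lock(lock_path: pathlib.Path) -> bool:
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

def release_lock(lock_path: pathlib.Path) -> None:
    lock_path.unlink(missing_ok=True)
```

A writer that fails to acquire the lock simply waits and retries, instead of clobbering another Agent's in-flight update.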

Pitfall 3: Deflection Language

When uncertain, Agents would say “I suggest you check manually” or “it’s probably fine.” Solution: GATE-9 automatically marks any report containing deflection language as FAIL. Forces the Agent to figure out the truth itself.

Pitfall 4: Simplified Chinese Contamination

The MiniMax model sometimes outputs Simplified Chinese (we need Traditional Chinese). Solution: Built an 11,000-character Simplified-to-Traditional mapping into the Agent Executor — any output containing Simplified Chinese gets automatically rejected for a redo.
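A toy version of that gate, using a four-character sample of the mapping (the real table has roughly 11,000 entries):

```python
# Toy sketch of the rejection check, with a tiny illustrative sample of
# the Simplified→Traditional mapping (the real table is ~11,000 entries).
S2T_SAMPLE = {"发": "發", "军": "軍", "记": "記", "忆": "憶"}

def contains_simplified(text: str) -> bool:
    return any(ch in S2T_SAMPLE for ch in text)

def gate_output(text: str) -> str:
    """Reject for redo if any mapped Simplified character appears."""
    return "REDO" if contains_simplified(text) else "PASS"

print(gate_output("記憶系統"))  # PASS
print(gate_output("记忆系统"))  # REDO
```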

Pitfall 5: Stale Memory

A memory file stated “service X is running” but the service had already stopped. The Agent read stale memory and made wrong decisions. Solution: The COO cross-checks memory against actual system state on every startup, correcting inconsistencies immediately.

Pitfall 6: MEMORY.md Explosion

Without capacity control, MEMORY.md grows endlessly. Once it exceeds 200 lines and gets truncated, the AI only sees the first half. Solution: The index file holds only pointers; deep content is distributed across topic files.


Get Started in 10 Minutes

Want to try this in your own project? A minimum viable memory system needs just three files:

1. CLAUDE.md (project root)

```
<rules>
TASK-FLOW:
  action: Receive task → read memory → execute → update memory
SAFETY:
  action: Never leak API Keys or internal paths
</rules>
```

2. memory/MEMORY.md (memory index)

```
# Memory Index
- [team-rules](team-rules.md) — Team iron rules
- [project-context](project-context.md) — Project progress
```

3. memory/team-rules.md (topic memory)

```
All externally published content must pass quality review.
**Why:** Previously published without review, causing incorrect information to leak.
**How:** Content complete → QA ≥ 8.5 → only then publish.
```

Get it running first, then add layers based on real needs. Start with three files — don’t try to build a 155KB memory architecture on day one.

