Architecture

The mem0 Memory Service acts as the central memory layer for all OpenClaw agents. It receives diary data through a pipeline (real-time diary write via openclaw-plugin → digest → dream), distills it into semantic memories using AWS Bedrock, and serves relevant context back to agents on demand.

Deployment Architecture

- Port mapping: 0.0.0.0:8230 → container:8230. The cli.py on the host connects via localhost:8230.
- Volume mounts:
  - `${OPENCLAW_BASE}:/openclaw` (read-write): the pipeline reads host diary files written by openclaw-plugin
  - `./data:/app/data`: the pipeline writes offset files and logs
  - `${AWS_CONFIG_DIR}:/root/.aws` (read-only, optional when using an IAM Role): AWS credentials
- AWS credentials: On EC2, both containers use the instance IAM Role via IMDS automatically. Outside EC2, set AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY in .env.

Component Responsibilities

| Component | Role |
| --- | --- |
| openclaw-plugin (`agent_end` hook) | Fires after each agent turn. Writes the conversation to the agent's daily diary file (`~/.openclaw/workspace-{agentId}/memory/YYYY-MM-DD.md`) in real time via `writeDiaryEntry`. Filters noise (greetings, short exchanges) via `isNoise` and `cleanContent`. Routes diary entries by `agentId` to each agent's workspace. Does not write to mem0 directly; mem0 ingestion is handled entirely by auto_digest. |
| auto_digest.py --today | Runs every 15 minutes. Reads only the new bytes added since the last run (tracked via `auto_digest_offset.json`). Sends content to mem0 in section-aligned batches (up to ~50KB each, aligned to `##` diary section boundaries) with `infer=True`, so mem0 runs internal fact extraction to produce concise memories. The offset is persisted after each successful batch, enabling crash-safe resume. |
| memory_sync.py | Runs daily at UTC 01:00. Syncs each agent's `MEMORY.md` (curated knowledge) directly to mem0 long-term memory. Hash-based dedup skips unchanged files, so there is zero LLM cost if nothing changed. |
| auto_dream.py / AutoDream | Runs daily at UTC 02:00. Step 1: reads yesterday's complete diary → `mem0.add(infer=True, no run_id)` → long-term memory. Step 2: for each 7-day-old short-term memory, calls `mem0.add(infer=True, no run_id)`; the mem0 LLM compares it against existing long-term memories and returns a decision: `ADD` (new knowledge, write), `UPDATE` (merge with existing), `DELETE` (redundant, skip write), or `NONE` (already covered). Regardless of the decision, the original short-term entry is always deleted after processing. |
| mem0 Memory Service | Core service. Uses AWS Bedrock LLM for memory distillation/deduplication and Bedrock Embedding for vectorization. |
| Vector Store | Persists memory vectors. Supports S3 Vectors, OpenSearch, or pgvector as the backend. Score normalization is applied at the service layer; see Score Normalization Layer below. |
| SKILL.md → Retrieval | On new agent sessions, reads SKILL.md, queries mem0 for relevant memories, and injects them as context. |

Score Normalization Layer

Vector stores return scores with inconsistent semantics — OpenSearch returns similarity (higher = better), while pgvector and S3 Vectors return cosine distance (lower = better).

The service applies a normalization layer in server.py (_normalize_scores()) that converts all scores to a unified similarity scale [0, 1] before min_score filtering, time-decay blending, or returning results to callers. This abstraction ensures that upstream logic (ranking, filtering, audit logging) is vector-store-agnostic.
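A minimal sketch of what such a normalization step might look like follows. The actual `_normalize_scores()` in server.py is not shown in this document, so the function name, the backend labels, and in particular the `1 - distance` mapping with clamping are assumptions chosen for illustration.

```python
from typing import Dict, List

# Assumption: backends that report cosine distance (lower = better).
DISTANCE_BACKENDS = {"pgvector", "s3vectors"}


def normalize_scores(results: List[Dict], backend: str) -> List[Dict]:
    """Return copies of results with 'score' mapped to a similarity in [0, 1]."""
    normalized = []
    for item in results:
        raw = float(item["score"])
        if backend in DISTANCE_BACKENDS:
            # Assumed mapping: cosine distance d becomes similarity 1 - d.
            similarity = 1.0 - raw
        else:
            # Similarity-style backends (e.g. OpenSearch) pass through.
            similarity = raw
        # Clamp so downstream thresholds always see the same scale.
        similarity = max(0.0, min(1.0, similarity))
        normalized.append({**item, "score": similarity})
    return normalized


def filter_by_min_score(results: List[Dict], min_score: float) -> List[Dict]:
    """Apply min_score only after normalization, as the text describes."""
    return [r for r in results if r["score"] >= min_score]
```

Because filtering runs on the normalized values, a single `min_score` threshold keeps the same meaning whichever backend is configured.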

Pipeline Timeline (UTC)

Real-time     openclaw-plugin     — agent_end hook → diary files  (no mem0 write)
Every 15 min  auto_digest --today — diary new bytes → mem0 short-term  (infer=True, fact extraction)
01:00         memory_sync         — MEMORY.md → mem0 long-term  (curated knowledge, instant)
02:00         auto_dream          — Step1: yesterday diary → long-term (infer=True)
                                    Step2: 7-day-old short-term → re-add (infer=True) + delete
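
The date arithmetic behind the two nightly auto_dream steps can be sketched roughly as below. The helper names are hypothetical, and reading "7-day-old" as "written at least 7 days ago" is an assumption; the real script may match the exact day instead.

```python
from datetime import date, timedelta

SHORT_TERM_DAYS = 7  # short-term lifetime stated in the text


def yesterday_diary_name(today: date) -> str:
    """File name of the diary that Step 1 would read (YYYY-MM-DD.md)."""
    return (today - timedelta(days=1)).isoformat() + ".md"


def is_due_for_consolidation(run_id: str, today: date) -> bool:
    """True when a short-term entry's date-string run_id is old enough for Step 2.

    Assumption: 'old enough' means at least SHORT_TERM_DAYS days old.
    """
    written = date.fromisoformat(run_id)
    return (today - written).days >= SHORT_TERM_DAYS
```

For a run on 2026-04-03, Step 1 would read `2026-04-02.md`, and entries with `run_id` 2026-03-27 or earlier would qualify for Step 2.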

Memory Tiering: Who Decides Long vs. Short-Term?

mem0 itself has no concept of short-term or long-term — it stores everything permanently by default. The distinction is entirely controlled by whether run_id is present when writing.

| | Short-term | Long-term |
| --- | --- | --- |
| `run_id` | `YYYY-MM-DD` (date string) | absent |
| Written by | `auto_digest.py --today` (automated) | Agent explicitly, `memory_sync.py`, or `auto_dream.py` / AutoDream (consolidated via infer=True) |
| Lifetime | 7 days → consolidated by auto_dream | Permanent |
| Typical content | Daily discussions, task progress, temp decisions | Tech decisions, lessons learned, user preferences |

Three paths to long-term memory

Path 1 — memory_sync.py (daily UTC 01:00, from MEMORY.md)

Each agent's MEMORY.md is the highest-quality memory source — curated directly by the agent during heartbeats. memory_sync.py syncs it to mem0 long-term memory every day at UTC 01:00, with hash-based dedup to avoid redundant LLM calls.

This is the fastest path: important decisions and lessons reach long-term memory the same day, without waiting for the 7-day archive cycle.
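
The hash-based dedup for memory_sync.py could work along these lines. This is a sketch only: the choice of SHA-256, the JSON state file, and the function names are assumptions, since the script's internals are not shown here.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents (the hash algorithm is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def _load_state(state_file: Path) -> dict:
    return json.loads(state_file.read_text()) if state_file.exists() else {}


def needs_sync(memory_file: Path, state_file: Path) -> bool:
    """True when MEMORY.md changed since the last recorded sync."""
    return _load_state(state_file).get(str(memory_file)) != file_digest(memory_file)


def mark_synced(memory_file: Path, state_file: Path) -> None:
    """Record the current hash; call only after the mem0 write succeeded."""
    state = _load_state(state_file)
    state[str(memory_file)] = file_digest(memory_file)
    state_file.write_text(json.dumps(state, indent=2))
```

Checking `needs_sync` before any mem0 call is what keeps an unchanged file free of LLM cost: the comparison is purely local.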

Path 2 — auto_dream.py / AutoDream (daily UTC 02:00)

Runs two steps each night:

Step 1: Reads yesterday's complete diary and writes it to mem0 with infer=True (no run_id) — directly into long-term memory with full-day context for high-quality extraction.
Step 2: For each 7-day-old short-term memory, calls mem0.add(infer=True, no run_id). mem0's LLM compares the memory against existing long-term memories and returns one of four decisions:
- ADD — new knowledge → written as new long-term entry
- UPDATE — overlaps with existing → merged/updated
- DELETE — redundant or contradicted → write skipped
- NONE — already fully covered → write skipped
Regardless of the decision, the original short-term entry is always deleted after processing.

This leverages mem0's native intelligence instead of hand-written semantic search, eliminating thousands of redundant Bedrock API calls per run.
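
The Step 2 loop can be sketched as follows. The `add` and `delete` callables are hypothetical wrappers around the memory service, and having `add` return a single decision string is a simplifying assumption for illustration, not mem0's actual return shape.

```python
from typing import Callable, Dict, List


def consolidate(
    short_term: List[Dict],
    add: Callable[..., str],
    delete: Callable[[str], None],
) -> Dict[str, int]:
    """Re-add each expired short-term entry without run_id, then delete it.

    `add` is assumed to return one of ADD / UPDATE / DELETE / NONE.
    """
    counts = {"ADD": 0, "UPDATE": 0, "DELETE": 0, "NONE": 0}
    for entry in short_term:
        # No run_id is passed, so the write targets long-term memory.
        decision = add(entry["memory"], infer=True)
        counts[decision] += 1
        # The original short-term entry is removed whatever the decision was.
        delete(entry["id"])
    return counts
```

In this sketch the delete runs only after `add` returns, so an entry whose re-add raises an exception stays in short-term memory and can be retried on the next run.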

Path 3 — Agent explicit write (on-demand)

Agents write directly to long-term memory by omitting run_id:

```bash
python3 cli.py add --user boss --agent agent1 \
  --text "Decided to use S3 Vectors as the primary vector store" \
  --metadata '{"category":"decision"}'
```

The `run_id` mechanism

run_id is mem0's native per-run isolation key. We repurpose it as a date-scoped namespace:

run_id = "2026-03-27"   →  short-term (today's entries)
run_id = absent          →  long-term  (permanent)
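
In code, the tier decision reduces to whether a `run_id` key is included in the write. The helper below is a hypothetical sketch; the keyword names (`user_id`, `agent_id`, `run_id`, `infer`) are assumed to match what the service passes to mem0.

```python
from datetime import date
from typing import Any, Dict, Optional


def build_add_kwargs(
    user_id: str,
    agent_id: str,
    short_term_day: Optional[date] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a memory write.

    Passing a day yields a short-term write (run_id = YYYY-MM-DD);
    omitting it yields a long-term write (no run_id key at all).
    """
    kwargs: Dict[str, Any] = {"user_id": user_id, "agent_id": agent_id, "infer": True}
    if short_term_day is not None:
        kwargs["run_id"] = short_term_day.isoformat()
    return kwargs
```

For example, `build_add_kwargs("boss", "agent1", date(2026, 3, 27))` carries `run_id="2026-03-27"`, while `build_add_kwargs("boss", "agent1")` has no `run_id` and therefore describes a long-term write.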

Deduplication boundary: `run_id` scoping

This is the core design insight behind the two-tier system.

When mem0 receives a write with infer=True, it runs a semantic dedup search to decide ADD / UPDATE / DELETE / NONE. The search scope is bounded by run_id:

| Write | Dedup scope | Effect |
| --- | --- | --- |
| `auto_digest --today` (with `run_id=YYYY-MM-DD`) | Only same-day entries | Today's short-term memories are stored with fact extraction (infer=True); mem0 deduplicates at write time |
| `auto_dream` consolidate (no `run_id`) | All long-term entries (no run_id) | 7-day-old short-term memories globally dedup against entire long-term history |

This boundary has two important consequences:

1. Short-term writes are safe and cheap. auto_digest runs every 15 minutes. Each write carries that day's run_id, so mem0's fact extraction and dedup search covers only same-day entries rather than the whole store, which keeps each write fast and inexpensive.

2. Global dedup happens exactly once, at promotion time. When auto_dream promotes a short-term memory to long-term (re-add with no run_id), mem0 searches across the entire long-term store. This is the moment where redundant, updated, or already-covered knowledge gets merged. The result: long-term memory stays compact and non-redundant, even after months of daily operation.
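
The scoping rule can be pictured with a toy model. This is not mem0's internal code: it only mirrors the behavior described in this section, with a plain list standing in for the vector store and a hypothetical `dedup_candidates` helper.

```python
from typing import Dict, List, Optional


def dedup_candidates(store: List[Dict], run_id: Optional[str]) -> List[Dict]:
    """Entries a write would be compared against, per the scoping described above.

    With a run_id, only entries sharing that run_id are candidates.
    Without one, only long-term entries (no run_id) are candidates.
    """
    return [m for m in store if m.get("run_id") == run_id]
```

A short-term write for 2026-03-27 therefore never pays for a comparison against months of long-term history; that comparison is deferred to the single promotion write.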

In practice, we observed 1,800 short-term entries consolidating down to ~78 long-term entries after the first few auto_dream cycles — a ~23× compression ratio, retaining the distilled knowledge while eliminating noise.

Design Philosophy

Diary-to-mem0 pipeline

auto_digest.py --today (every 15 min, incremental)

Runs every 15 minutes, reading only new diary content since the last run. Sends section-aligned batches (up to ~50KB each, aligned to ## diary section boundaries) to mem0 with infer=True and a dedicated DIGEST_EXTRACTION_PROMPT — a custom prompt that preserves technical identifiers (project names, cluster IDs, service names, port numbers, paths), performance data, work progress, key decisions, and pitfalls. Extraction threshold is 2000 bytes. Offset is saved after each successful batch — if the process is interrupted, the next run picks up where it left off.
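
A rough sketch of the offset-tracked, section-aligned batching described above follows. The function names, the offset-file layout, and the `send` callback are assumptions; the 2000-byte extraction threshold and the custom prompt are not modeled.

```python
import json
from pathlib import Path
from typing import Callable, List

MAX_BATCH_BYTES = 50_000  # "up to ~50KB" per the text


def split_sections(data: bytes, limit: int = MAX_BATCH_BYTES) -> List[bytes]:
    """Group '## ' sections into batches that stay under limit where possible."""
    sections: List[bytes] = []
    current = b""
    for line in data.splitlines(keepends=True):
        if line.startswith(b"## ") and current:
            sections.append(current)
            current = b""
        current += line
    if current:
        sections.append(current)

    batches: List[bytes] = []
    batch = b""
    for section in sections:
        # Never split inside a section; an oversized section becomes its own batch.
        if batch and len(batch) + len(section) > limit:
            batches.append(batch)
            batch = b""
        batch += section
    if batch:
        batches.append(batch)
    return batches


def digest_new_bytes(diary: Path, offset_file: Path, send: Callable[[bytes], None]) -> int:
    """Send bytes added since the last run; persist the offset after each batch."""
    offsets = json.loads(offset_file.read_text()) if offset_file.exists() else {}
    start = offsets.get(str(diary), 0)
    with diary.open("rb") as fh:
        fh.seek(start)
        new_data = fh.read()
    for batch in split_sections(new_data):
        send(batch)  # e.g. a mem0 write with infer=True and today's run_id
        start += len(batch)
        offsets[str(diary)] = start
        offset_file.write_text(json.dumps(offsets))
    return start
```

Writing the offset only after `send` returns means an interrupted run repeats at most the batch that was in flight, which is the crash-safe resume behavior the text describes.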

This provides near-real-time cross-session memory: a conversation becomes retrievable from other sessions of the same agent within about 15 minutes.

auto_dream.py Step 1 (UTC 02:00, yesterday diary → long-term)

Runs once per day. Reads yesterday's complete diary and writes it to mem0 with infer=True (no run_id), directly as long-term memory. Because the full day's context is available in a single pass, mem0 produces higher-quality, deduplicated long-term memories each night.

Why the openclaw-plugin writes diary files, not mem0

The openclaw-plugin's agent_end hook writes conversation content to diary files rather than directly to mem0. This is a deliberate separation of concerns:

Plugin → diary files only (fast, no external API calls, no LLM cost)
auto_digest --today → mem0 writes (rate-controlled, section-aligned 50KB batches with sleep between)

Previously, session_snapshot.py polled OpenClaw's session API every 5 minutes and wrote to diary files. The new agent_end hook approach is superior: it fires in real-time after each conversation turn, eliminating the 5-minute polling delay and the need for a separate polling process. Diary entries are routed by agentId to each agent's workspace (~/.openclaw/workspace-{agentId}/memory/YYYY-MM-DD.md).
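
The per-agent routing amounts to deriving a file path from `agentId` and the date. The sketch below is written in Python for consistency with the pipeline scripts; the plugin's own implementation is not shown in this document, and the entry layout (a `##` heading followed by the body) is an assumption based on the section boundaries mentioned for auto_digest.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def diary_path(agent_id: str, day: date, home: Optional[Path] = None) -> Path:
    """~/.openclaw/workspace-{agentId}/memory/YYYY-MM-DD.md for one agent and day."""
    base = home if home is not None else Path.home()
    return base / ".openclaw" / f"workspace-{agent_id}" / "memory" / f"{day.isoformat()}.md"


def append_entry(path: Path, heading: str, body: str) -> None:
    """Append one diary section; a local file write with no external calls."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        # Assumed layout: each entry starts a new '## ' section.
        fh.write(f"## {heading}\n{body}\n\n")
```

Appending rather than rewriting keeps earlier bytes stable, which is what allows a byte offset to identify the content that is new since the previous digest run.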

Why MEMORY.md sync is a separate path

MEMORY.md is maintained by agents themselves during heartbeats — it's the distilled, curated essence of what the agent has learned. This is qualitatively different from diary-extracted short-term memories.

Routing MEMORY.md directly to long-term memory (bypassing the 7-day short-term → archive cycle) ensures that explicitly curated knowledge is available to subsequent sessions as soon as the next daily sync has run, rather than a week later.
