
MAXIM

Memory Systems

How Maxim Remembers, Predicts, and Learns

Memory in biological systems isn't a filing cabinet. It's not even a database. It's a dynamic, reconstructive process where anatomically distinct brain regions collaborate to store, index, and retrieve experiences. The hippocampus, SCN, nucleus accumbens, and entorhinal cortex are separate structures scattered across the brain—hypothalamus, ventral striatum, medial temporal lobe—connected by neural pathways that let them work as complementary partners. Maxim mirrors this: four independent subsystems, each in its own package, coordinated by the MemoryHub.

Three Memory Layers

🧬 Biological Inspiration

The brain separates memory into episodic (hippocampus — personal experiences), semantic (anterior temporal lobe — general knowledge), and procedural (cerebellum — motor skills). Damage to the ATL causes semantic dementia: patients can describe their wedding day (episodic) but can't explain what a "wedding" is (semantic). The two systems are anatomically distinct but deeply interconnected.

Maxim implements three memory layers, each handling a different kind of knowledge:

The MemoryLayer Protocol

All three layers implement the same abstract protocol, enabling the MemoryHub and CrossLayerGraph to work with any layer uniformly:

MemoryLayer ABC

    class MemoryLayer(ABC):
        layer_name → str                 # "hippocampus", "atl", "angular_gyrus"
        store(record) → str              # Store record, return ID
        get(record_id) → Record          # Retrieve by ID
        remove(record_id)                # Delete + cleanup edges
        recall(limit, **filters)         # Filtered retrieval
        recall_associated(seed_ids)      # Spreading activation recall
        graph → DependencyGraph          # Internal associative graph
        save(path) / load(path)          # Persistence
        consolidate(**kw)                # Compress, decay, prune
        register_capture_callback(cb)    # Notify on new record
        register_deletion_callback(cb)   # Notify on removal
        stats() / __len__ / __iter__     # Inspection

All three layers share a common base record type (MemoryRecord) with fields for ID, timestamps, access tracking, and long-term status. Each layer extends this with domain-specific fields.

Layer Record Type Key Fields Compressed Form
Hippocampus EpisodicMemory perception, context, decision, action, outcome, run_id CompressedMemory
ATL SemanticMemory name, definition, category, properties, provenance, confidence CompressedSemantic
Angular Gyrus MathRecord name, category, domain, verbal, code, confidence CompressedMathRecord

The Hippocampus: Episodic Memory

🧬 Biological Inspiration

The hippocampus in mammals is crucial for forming new episodic memories, those rich, contextual records of "what happened, where, and when." Damage to it famously prevents forming new long-term memories while leaving older ones intact.

Maxim's Hippocampus stores EpisodicMemory objects, each capturing a complete cycle:

Flow

    Perception → Decision → Action → Outcome

Not every moment gets recorded. The system uses selective capture to avoid memory bloat:

  • User interactions - Always recorded (humans are important)
  • High novelty - New situations worth remembering (threshold: 0.7)
  • High salience - Emotionally or contextually significant events
  • Goal changes - Transitions in what the robot is trying to do
  • Failures - Mistakes are valuable teachers
  • Periodic checkpoints - Regular snapshots for continuity

Memories are indexed by multiple keys (goal, tool, object, person, success, mode) enabling O(1) retrieval. Need all memories involving "coffee mug"? Instant lookup.
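This kind of multi-key indexing can be sketched with inverted indices (illustrative only — `EpisodicIndex` and the field names are hypothetical, not Maxim's real schema):

```python
from collections import defaultdict

class EpisodicIndex:
    """Inverted indices: each key field maps values to sets of memory IDs."""

    def __init__(self):
        self._by: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))

    def add(self, memory_id: str, keys: dict[str, str]) -> None:
        # Index one memory under every (field, value) pair, e.g. ("object", "coffee mug")
        for field, value in keys.items():
            self._by[field][value].add(memory_id)

    def lookup(self, field: str, value: str) -> set[str]:
        # O(1) average case: two dict hops, no scan over stored memories
        return set(self._by[field].get(value, set()))

index = EpisodicIndex()
index.add("ep1", {"object": "coffee mug", "goal": "make coffee", "success": "true"})
index.add("ep2", {"object": "coffee mug", "goal": "clean table", "success": "true"})
index.add("ep3", {"object": "plate", "goal": "clean table", "success": "false"})
```

Retrieval cost is independent of how many memories are stored, which is what makes the "instant lookup" claim above plausible at scale.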

The Associative Graph: Recall-Triggered Connections

🧬 Biological Inspiration

Each memory in the brain is physically encoded as an engram—a sparse ensemble of hippocampal neurons whose synaptic wiring pattern is the memory. When a new engram forms, it reactivates overlapping neurons from existing engrams, creating physical bridges between memories. Recalling one engram later propagates activation through these shared neurons and "lights up" linked ones—this is why the smell of coffee can trigger a memory of your grandmother's kitchen.

Maxim's Hippocampus builds an associative graph where memories are nodes and recall-triggered connections are edges. The key mechanism: when a new memory is captured, the system automatically recalls similar existing memories and forms bidirectional edges between them—mirroring how biological engram co-allocation creates associative links during encoding.

From Engrams to Edges

Pioneering work by Josselyn and Bhatt showed that individual engram cells can be tagged, reactivated with optogenetics, and even artificially linked—activating two engrams simultaneously creates a new association, exactly as if the animal had experienced both events together. The strength of biological engram links depends on how many neurons two ensembles share, which is governed by neuronal excitability at encoding time, perceptual similarity, and top-down goal states. Maxim's edge weight formula maps directly onto these mechanisms:

Biological Mechanism Maxim Analogue Weight
Pattern overlap — engrams for similar percepts recruit overlapping neural populations Shared detected objects and people between two memories 60%
Goal-state modulation — prefrontal cortex biases which neurons are excitable, linking goal-relevant memories preferentially Matching or overlapping active goals (word-level Jaccard) 25%
Co-allocation — neuronal excitability lingers after encoding, so memories formed close in time share more neurons Closer in time = stronger link (decays over hours) 15%
Subthreshold propagation — recalling an engram sends graded activation through shared neurons to linked engrams BFS spreading activation with exponential decay (0.5 per hop, max 3 hops)
Synaptic homeostasis — during sleep, weak synaptic links are pruned while strong ones consolidate Sleep cycle prunes weakly connected memories; well-linked ones score higher in retention
Edge Formation During Capture

    1. New memory captured: "found cup on kitchen table"
    2. recall_similar() fires automatically
    3. Returns: "picked up cup yesterday", "cleaned table last week"
    4. Bidirectional edges formed:
         found_cup ←→ picked_up_cup  (weight: 0.81)
         found_cup ←→ cleaned_table  (weight: 0.75)
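The weight table above can be expressed as a single scoring function. This is a sketch under assumptions: `edge_weight` and the 6-hour decay constant `tau_hours` are illustrative; only the 60/25/15 split comes from the table.

```python
import math

def edge_weight(shared_entities: float, goal_jaccard: float,
                hours_apart: float, tau_hours: float = 6.0) -> float:
    """Combine the three components with the 60/25/15 split from the table.

    shared_entities: overlap of detected objects/people, in [0, 1]
    goal_jaccard:    word-level Jaccard similarity of active goals, in [0, 1]
    hours_apart:     time between the two memories (recency decays over hours)
    tau_hours:       assumed decay constant, not a documented value
    """
    recency = math.exp(-hours_apart / tau_hours)
    return 0.60 * shared_entities + 0.25 * goal_jaccard + 0.15 * recency
```

Two memories sharing all entities and goals, captured back-to-back, score 1.0; as the time gap grows, only the 15% recency share fades.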

Spreading Activation: Multi-Hop Recall

The real power emerges during recall. Just as recalling a biological engram sends subthreshold activation through shared neurons to linked ensembles, Maxim uses spreading activation to propagate signals through the graph. Activation flows from seed memories through edges, decaying at each hop:

Spreading Activation Example

    Query: "make coffee"
      │
      ▼ direct recall (activation: 1.0)
    ┌──────────────────┐
    │ "made coffee at  │
    │  9am yesterday"  │
    └────────┬─────────┘
             │ edge (weight: 0.81)
             ▼ activation: 1.0 × 0.5 × 0.81 = 0.41
    ┌──────────────────┐
    │ "found cup on    │──── This memory has NO "coffee"
    │  kitchen table"  │     in its index keys, but is
    └────────┬─────────┘     reachable through association
             │ edge (weight: 0.75)
             ▼ activation: 0.41 × 0.5 × 0.75 = 0.15
    ┌──────────────────┐
    │ "cleaned kitchen │──── Two hops away from "coffee"
    │  table last week"│     but still contextually relevant
    └──────────────────┘

This enables context-bridging recall. A memory about finding a cup (which was recalled when the coffee memory was formed) becomes reachable from a "make coffee" query, even though "cup" and "coffee" share no direct index keys. The graph bridges contexts that flat recall cannot.
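A minimal sketch of this recall, assuming a plain adjacency-list graph (function and variable names are hypothetical); it reproduces the decay arithmetic from the worked example:

```python
from collections import deque

def spread_activation(graph: dict[str, list[tuple[str, float]]],
                      seeds: dict[str, float],
                      hop_decay: float = 0.5,
                      max_hops: int = 3,
                      threshold: float = 0.05) -> dict[str, float]:
    """BFS spreading activation: each hop multiplies by hop_decay × edge weight.
    Stops at max_hops or when activation falls below threshold."""
    activation = dict(seeds)
    frontier = deque((node, act, 0) for node, act in seeds.items())
    while frontier:
        node, act, hops = frontier.popleft()
        if hops >= max_hops:
            continue
        for neighbor, weight in graph.get(node, []):
            new_act = act * hop_decay * weight
            if new_act < threshold:
                continue  # too weak to keep propagating
            if new_act > activation.get(neighbor, 0.0):
                activation[neighbor] = new_act
                frontier.append((neighbor, new_act, hops + 1))
    return activation

# The coffee → cup → table chain from the example
graph = {
    "made_coffee": [("found_cup", 0.81)],
    "found_cup": [("made_coffee", 0.81), ("cleaned_table", 0.75)],
    "cleaned_table": [("found_cup", 0.75)],
}
acts = spread_activation(graph, {"made_coffee": 1.0})
```

Keeping only the strongest activation per node makes the traversal terminate even on cyclic graphs, since re-visits with weaker activation are dropped.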

Bridge Integration

Every memory bridge in the system is enriched with associative recall, expanding its knowledge beyond direct filter matches:

  • Planning Bridge — Finds successful plan templates not just for the exact goal, but for associatively related goals. "Navigate to kitchen" recalls templates from "fetch cup from kitchen."
  • Salience Bridge — Builds richer interaction history by pulling in associated memories, boosting salience for objects that appeared in related successful contexts.
  • Spatial Bridge — Expands location priors with associated memories, learning that "cups are often on the kitchen table" even from memories primarily about cooking.
  • Fear Bridge — Contextualizes risk assessments with associative history—if similar past contexts led to failures, risk thresholds adjust accordingly.
  • Escalation Bridge — Informs when to ask for human help by checking whether associated memories show high failure rates in similar contexts.

Lifecycle

The graph is fully integrated with the memory lifecycle:

  • Capture: Edges formed automatically (configurable: association_limit=5, association_threshold=0.5)
  • Persistence: Graph serialized alongside memories (v3.0 format, backward-compatible)
  • Sleep: When memories are pruned or compressed, their graph edges are cleaned up automatically. Well-connected memories score higher in retention.
  • Consistency: Graph nodes are validated against the memory store and orphans are repaired

The SCN: Temporal Rhythm Indexing

🧬 Biological Inspiration

The Suprachiasmatic Nucleus (SCN) sits in the hypothalamus, not the hippocampus—it's a separate brain structure that serves as the brain's master clock. This tiny cluster of ~20,000 neurons orchestrates circadian rhythms across the entire body. It communicates with the hippocampus and other memory systems via neural pathways, enabling temporal context for memory formation and recall. It's why jet lag hurts and why you get hungry at the same time each day.

Maxim's SCN provides temporal indexing across multiple timescales:

Timescale Bins Use Case
Hourly 24 bins "What usually happens at 9 AM?"
Daily 7 bins "What's different on Mondays?"
Weekly 4 bins "First week of month patterns"
Monthly 12 bins "Seasonal patterns"

This enables queries like "What typically happens around this time on weekday mornings?" with minimal computational cost. The memory footprint is remarkably small: 10,000 memories require only ~500KB of index storage.
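The four bin indices can be derived from a timestamp in a few lines. This is a sketch; the bin conventions (Monday=0, week-of-month clamped into 4 bins) are illustrative assumptions, not documented behavior.

```python
from datetime import datetime

def scn_bins(ts: datetime) -> dict[str, int]:
    """Map a timestamp to the four SCN bin indices from the table above."""
    return {
        "hourly": ts.hour,                    # 24 bins: 0-23
        "daily": ts.weekday(),                # 7 bins: Monday=0 .. Sunday=6
        "weekly": min((ts.day - 1) // 7, 3),  # 4 bins: week-of-month (days 29-31 clamp to bin 3)
        "monthly": ts.month - 1,              # 12 bins: 0-11
    }

bins = scn_bins(datetime(2024, 3, 4, 9, 15))  # a Monday morning in March
```

Storing a memory ID in each of its four bins is what keeps the index so small: the cost per memory is four set insertions, regardless of history length.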

Coupled Oscillator Network

The biological SCN isn't just a clock—it's a network of ~20,000 coupled oscillators synchronized through mutual coupling. "Monday mornings" isn't the intersection of a Monday bin and a 9am bin. It's an emergent rhythm from learned coupling between circadian and weekly oscillators.

Maxim's SCN embeds an optional Kuramoto-inspired coupled oscillator network alongside the existing bin indices. Four oscillators—circadian (24h), weekly (7d), monthly (30d), and annual (365d)—evolve according to coupled phase dynamics:

dθi/dt = ωi + (K/N) Σj W[i][j] · sin(θj − θi)

The coupling matrix W learns via Hebbian learning: when two oscillators are co-active (similar phases), their coupling strengthens. This is how "Monday mornings" emerge—repeated observations where circadian and weekly oscillators activate together strengthen their link. Over time, the system develops temporal expectations that go beyond simple bin lookups.
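A minimal Euler-integration sketch of these dynamics, with the Kuramoto order parameter mentioned below (function names are hypothetical; the real SCN oscillator presumably also handles Hebbian coupling updates and persistence):

```python
import math

def kuramoto_step(theta: list[float], omega: list[float],
                  W: list[list[float]], K: float, dt: float) -> list[float]:
    """One Euler step of dθi/dt = ωi + (K/N) Σj W[i][j]·sin(θj − θi)."""
    n = len(theta)
    out = []
    for i in range(n):
        coupling = sum(W[i][j] * math.sin(theta[j] - theta[i]) for j in range(n))
        out.append((theta[i] + dt * (omega[i] + (K / n) * coupling)) % (2 * math.pi))
    return out

def order_parameter(theta: list[float]) -> float:
    """Kuramoto order parameter r ∈ [0,1]: 1 = fully synchronized phases."""
    n = len(theta)
    re = sum(math.cos(t) for t in theta) / n
    im = sum(math.sin(t) for t in theta) / n
    return math.hypot(re, im)

# Two oscillators with equal natural frequency and positive coupling:
# repeated steps pull their phases together, driving r toward 1.
theta = [0.0, 1.0]
for _ in range(200):
    theta = kuramoto_step(theta, [1.0, 1.0], [[0.0, 1.0], [1.0, 0.0]], K=2.0, dt=0.05)
```

With a learned coupling matrix W in place of the toy all-ones coupling, the same step function forward-simulates phases for the prediction and anomaly-scoring capabilities listed below.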

Capability What It Does
Phase prediction Forward-simulate oscillator phases to predict when patterns will recur
Coherence detection Kuramoto order parameter r ∈ [0,1] measures how synchronized the system is
Anomaly scoring Circular distance between predicted and observed phases flags temporally unusual events
Coupling analysis Eigenvalues of the coupling matrix reveal dominant rhythm structures

The oscillator is fully optional—existing bin-based methods remain unchanged, and the oscillator only activates when enabled. This means zero breaking changes for existing consumers. Read the full oscillator math →

The Nucleus Accumbens: Reward Prediction

🧬 Biological Inspiration

The Nucleus Accumbens (NAc) sits in the ventral striatum—a deep brain structure distinct from the hippocampus, though heavily interconnected with it. The NAc is central to reward processing and motivation, receiving dopaminergic input from the ventral tegmental area and contextual signals from the hippocampus and prefrontal cortex. It learns to predict outcomes based on prior experience, essentially asking: "What happened last time I did this?"

Maxim's NAc learns causal links between events and outcomes:

Data structure

    Event (e.g., "internet_search") → Outcome (e.g., "success_with_results")
            ↓
    CausalLink:
      - event_signature:    hash of action + parameters
      - outcome_signature:  hash of result type
      - valence:            POSITIVE | NEUTRAL | NEGATIVE
      - strength:           0.0 - 1.0
      - confidence:         reliability of prediction
      - observation_count:  how many times seen
      - context_conditions: when this applies

The learning algorithm is Rescorla-Wagner, the same mathematical model used in behavioral psychology:

ΔV = α(λ - V)

Where α is learning rate, λ is the actual outcome, and V is the current prediction. This produces smooth, asymptotic learning without oscillation.

The key insight: before executing any action, Maxim can predict its likely outcome. This enables proactive rather than purely reactive decision-making.
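The Rescorla-Wagner update is one line; repeated application shows the smooth asymptotic behavior described above (α=0.3 is an illustrative learning rate):

```python
def rescorla_wagner(v: float, outcome: float, alpha: float = 0.3) -> float:
    """ΔV = α(λ − V): move the prediction V a fraction α toward the outcome λ."""
    return v + alpha * (outcome - v)

# Repeated successes (λ=1.0) drive V toward 1.0 asymptotically, never overshooting
v = 0.0
history = []
for _ in range(10):
    v = rescorla_wagner(v, 1.0)
    history.append(round(v, 3))
```

After n identical outcomes the prediction is 1 − (1 − α)ⁿ: each update closes a fixed fraction of the remaining gap, which is why learning is fast early and stable late.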

The Entorhinal Cortex: Similarity Matching

🧬 Biological Inspiration

The Entorhinal Cortex (EC) is a cortical region in the medial temporal lobe, anatomically adjacent to but distinct from the hippocampus. It serves as the primary gateway between the hippocampus and neocortex, supporting pattern completion and spatial navigation. Grid cells here create abstract representations that generalize across specific experiences. The EC and hippocampus work as complementary partners: the EC provides the similarity-matching "address system" while the hippocampus stores the actual episodic content.

Maxim's EC enables similarity queries: "Find memories similar to this situation." This is crucial because exact matches are rare. You want to find relevant past experience even when details differ.

The implementation uses Locality-Sensitive Hashing (LSH)—a technique that groups similar items into the same "bucket" for fast lookup—enabling approximate nearest-neighbor search in constant time:

Example

    Current situation: "find the red cup in the kitchen"
            ↓
    Situation Signature: [semantic, structural, temporal features]
            ↓
    LSH lookup: ~10ms regardless of memory size
            ↓
    Returns: "find the blue mug in the kitchen"      (last week)
             "find the water bottle in the kitchen"  (yesterday)

Optional neural embeddings (SentenceTransformer) provide richer semantic similarity for Phase 4 deployments.
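The bucketing idea behind LSH can be sketched with sign-of-dot-product signatures. In practice the hyperplanes are random Gaussian vectors and real systems use many hash tables; the fixed planes and toy feature vectors here are purely illustrative.

```python
def lsh_signature(features: list[float], planes: list[list[float]]) -> tuple[int, ...]:
    """One bit per hyperplane: which side of the plane the vector falls on.
    Vectors with high cosine similarity tend to get identical signatures."""
    return tuple(
        1 if sum(p * f for p, f in zip(plane, features)) >= 0 else 0
        for plane in planes
    )

# Normally random Gaussian hyperplanes; fixed here so the example is deterministic.
planes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, -1.0, 1.0]]

buckets: dict[tuple[int, ...], list[str]] = {}
for name, vec in {
    "find the red cup in the kitchen":  [0.9, 0.2, 0.4],
    "find the blue mug in the kitchen": [0.8, 0.3, 0.5],
    "charge the battery in the dock":   [-0.7, 0.9, -0.3],
}.items():
    buckets.setdefault(lsh_signature(vec, planes), []).append(name)
```

Lookup then hashes the query situation and scans only its bucket, which is why query time stays roughly constant as the memory store grows.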

Beyond Episodes: The ATL

🧬 Biological Inspiration

The anterior temporal lobe is the brain's "semantic hub" — where concepts are stored independently of the episodes that formed them. You know that dogs bark, have four legs, and are pets, without remembering the specific experiences that taught you this. The ATL integrates information from multiple sensory modalities into amodal concept representations.

Episodic memory records what happened. Semantic memory captures what things mean. Humans don't remember the exact moment they learned that fire is hot — they just know it. That knowledge was distilled from many episodes into a concept. Maxim implements this same progression: repeated experiences get promoted from episodic memories into stable semantic concepts.

Maxim's ATL stores SemanticMemory objects — concepts with names, definitions, properties, and typed relationships to other concepts. It mirrors the Hippocampus architecture (context indexing, associative graph, consolidation) but with slower decay and higher stability.

SemanticMemory Record

Data Structure

    SemanticMemory(MemoryRecord):
        name: str                 # "coffee mug", "navigate_tool_reliability"
        definition: str           # Natural language definition
        category: str             # "object", "person", "action", "causal_pattern", "operational_pattern"
        properties: dict          # {"color": "blue", "location": "kitchen"}
        provenance: enum          # How this concept was formed
        source_episode_ids: list  # Hippocampus episodes that contributed
        confidence: float         # min(0.99, 0.5 + 0.1 × √reinforcement_count)
        reinforcement_count: int  # How many episodes confirmed this
        embedding_text: str       # Text used for EC similarity matching
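The confidence formula in the record above is worth a quick worked check (`concept_confidence` is an illustrative helper name, not the real API):

```python
import math

def concept_confidence(reinforcement_count: int) -> float:
    """confidence = min(0.99, 0.5 + 0.1 × √reinforcement_count).
    Grows quickly for the first few confirmations, then saturates:
    each additional episode buys less certainty, capped at 0.99."""
    return min(0.99, 0.5 + 0.1 * math.sqrt(reinforcement_count))
```

A brand-new concept starts at 0.5; four confirming episodes lift it to 0.7, and no amount of reinforcement ever reaches full certainty.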

Concept Provenance

Every concept knows how it was formed, enabling the system to weight newer, less-verified concepts differently from well-established ones:

EPISODIC_CONSOLIDATION

Extracted from repeated episodes via NAc reward signals. The most common path — "I've seen this pattern enough times to call it knowledge."

AGENT_INFERENCE

Proposed by the StatisticianAgent from confirmed operational patterns. "The data shows this is a real trend, not noise."

DIRECT_INGESTION

From RAG / document ingestion. Knowledge imported directly without episodic formation.

HYBRID

Multiple sources contributed. Both episodic observation and statistical confirmation.

ATL-Specific Features

  • find_or_create — Deduplication by name similarity. Before creating "coffee_mug", checks if "coffee mug" already exists. Returns (id, was_created) tuple.
  • recall_similar — Semantic similarity search via EC embeddings. "Do I already know about this?" Used by the promoter to avoid concept duplication.
  • Context indexing — O(1) lookup by name, category, or property key/value. "All concepts with category=object" is instant.
  • Consolidation — Same pattern as Hippocampus but with slower decay (30-day max age vs 7-day). Concepts are more stable than episodes.

Typed Relationships

Concepts in isolation aren't knowledge — knowledge is the relationships between concepts. The ATL uses a Semantics manager that wraps the DependencyGraph with typed, directional edges.

Pre-Registered Relationship Types

Type Symmetric? Example
IS_A No "coffee_mug" IS_A "container"
HAS_PART No "robot_arm" HAS_PART "gripper"
CAUSES No "grasp_action" CAUSES "object_held"
TRENDS_WITH Yes "navigate_failures" TRENDS_WITH "goal_latency"
CORRELATES_WITH Yes Confirmed by AG cross-metric correlation
PHASE_LOCKED_TO No "success_rate" PHASE_LOCKED_TO "circadian" (metadata: phase, coupling)
PREDICTS No "morning_activity" PREDICTS "high_success"

The RelationshipRegistry also supports runtime extension — agents can propose new relationship types during operation. The registry tracks whether each type is built-in or agent-proposed, and persists across sessions.

Cross-Layer Graph

Each memory layer has its own internal associative graph. The CrossLayerGraph connects records across layers, enabling queries like "starting from this episode, what concepts and math patterns are related?"

Cross-Layer Architecture

     Hippocampus             ATL                   Angular Gyrus
     (episodes)              (concepts)            (math/patterns)
    ┌───────────┐           ┌───────────┐         ┌───────────┐
    │ episode_1 ├─ASSOCIATES┤           │         │ pattern_1 │
    │ episode_2 ├─ASSOCIATES┤ concept_A ├─IS_A────┤           │
    │ episode_3 │           │ concept_B ├─CAUSES──┤ pattern_2 │
    └─────┬─────┘           └─────┬─────┘         └─────┬─────┘
          │                       │                     │
          │   ╔═══════════════════╧═══════════════════╗ │
          ├───║            CrossLayerGraph            ║─┤
          │   ║                                       ║ │
          │   ║ DERIVED_FROM: ATL concept ← episodes  ║ │
          │   ║ INSTANCE_OF: episode → ATL concept    ║ │
          │   ║ STATISTICALLY_CONFIRMS: AG pattern → ATL ║
          │   ║ QUANTIFIES: AG record → ATL concept   ║ │
          │   ║ TEMPORALLY_CORRELATES: any ↔ any      ║ │
          │   ║ COMPUTED_FROM: AG ← source data       ║ │
          │   ║ INFORMS: any → any                    ║ │
          │   ╚═══════════════════════════════════════╝ │
          └─────────────────────────────────────────────┘

Cross-Layer Spreading Activation

The most powerful feature is cross-layer spreading activation. Starting from any record in any layer, activation spreads:

  1. Follow intra-layer edges via the layer's internal associative graph
  2. Follow cross-layer edges to records in other layers
  3. Recurse with exponential decay (default 0.5 per hop) until threshold or max depth
  4. Return activations grouped by layer — { "hippocampus": [(id, score)], "atl": [...], "angular_gyrus": [...] }

Example Traversal

Starting from a Hippocampus episode where tool_navigate failed:

  1. Intra-layer: Find similar failure episodes (Hippocampus associative graph)
  2. Cross-layer INSTANCE_OF → ATL concept "navigate_tool_reliability" (confidence=0.73)
  3. ATL IS_A → ATL concept "tool_capabilities"
  4. Cross-layer STATISTICALLY_CONFIRMS → AG PATTERN record (R²=0.73, slope=-0.02/day)
  5. Cross-layer TEMPORALLY_CORRELATES → AG record linking decline to circadian phase 0.8 (evening)

The Promotion Pipeline

🧬 Biological Inspiration

Memory consolidation during sleep transforms labile hippocampal traces into stable neocortical representations. The NAc (reward system) plays a gating role — rewarding experiences are preferentially consolidated. This is why emotionally significant events become lasting knowledge while mundane details fade.

The SemanticPromoter orchestrates the progression from episodic observations to stable semantic knowledge. It scans multiple PromotionSources for qualifying patterns, filters noise, and creates ATL concepts with full cross-layer traceability.

Promotion Pipeline

    PromotionSources            IPS Gate               SemanticPromoter
    ────────────────            ────────               ────────────────
    NAc (reward patterns)  ─┐
                            ├──→ IPS randomness  ──→   Deduplicate (ATL recall_similar)
    StatisticianAgent      ─┤    quality gate          Extract recurring elements
    (confirmed patterns)    │    (reject noise)        Create/reinforce concept
                            │                          Form relationships
    Future sources...      ─┘                          Create cross-layer edges
                                                             │
                                                             ↓
                        ATL SemanticMemory + CrossLayer edges + Relationship types

PromotionSource Protocol

Any system that detects patterns can be a promotion source. It just needs one method:

    get_promotion_candidates(min_confidence=0.6, min_observations=3)
        → list[PromotionCandidate]

    # Each candidate provides:
    PromotionCandidate:
        pattern_name: str              # "grasp → success", "stat:tool:navigate:success"
        category: str                  # "causal_pattern", "operational_pattern"
        confidence: float              # Source system's confidence
        source_memory_ids: list[str]   # Episodic IDs that contributed
        metadata: dict                 # Source-specific extras

NAc Path (Causal Patterns)

NAc tracks event→outcome links via Rescorla-Wagner learning. When a causal link reaches sufficient confidence and observation count, it becomes a promotion candidate.

Example: "grasp + coffee_mug → success" observed 5 times with confidence 0.8 → promote to ATL concept "grasping coffee mugs is reliable" with DERIVED_FROM edges to source episodes.

StatisticianAgent Path (Operational Patterns)

Confirmed patterns from the IPS→AG escalation pipeline become candidates. No episodic IDs needed — the AG MathRecord provides the evidence.

Example: "tool_navigate declining (R²=0.73)" → promote to ATL concept with provenance=AGENT_INFERENCE and STATISTICALLY_CONFIRMS cross-layer edge to the AG PATTERN record.

IPS Randomness Quality Gate

Not every NAc pattern deserves permanent semantic status. The IPS provides a lightweight noise filter before the heavyweight promotion machinery runs:

  • NAc candidates: IPS assesses the randomness of observation timing. If events occurred at random intervals (no temporal structure), they're likely coincidental — rejected.
  • StatisticianAgent candidates: Already passed IPS/AG assessment during the pattern detection FSM — just checks confidence ≥ 0.4 (no re-assessment needed).
  • Small samples (< 8 observations): Gate allows through conservatively — too few data points to assess randomness.
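One plausible way to implement such a gate — purely a sketch, not Maxim's actual test — is the coefficient of variation of inter-event intervals: Poisson-random timing gives CV near 1, while structured (e.g. daily) timing gives CV far below 1.

```python
import statistics

def passes_randomness_gate(event_times: list[float],
                           cv_threshold: float = 0.5,
                           min_samples: int = 8) -> bool:
    """Hypothetical IPS-style gate. Reject patterns whose observation timing
    looks random (high CV of inter-event gaps → likely coincidence); pass
    regular timing; let small samples through conservatively."""
    if len(event_times) < min_samples:
        return True  # too few points to judge randomness
    gaps = [b - a for a, b in zip(event_times, event_times[1:])]
    cv = statistics.stdev(gaps) / statistics.mean(gaps)
    return cv < cv_threshold

# Roughly daily events (hours): low CV → structured → passes
regular = [0.0, 24.0, 48.1, 71.9, 96.0, 120.2, 144.0, 167.8, 192.0]
# Erratic timing: high CV → likely coincidence → rejected
erratic = [0.0, 1.0, 30.0, 31.5, 90.0, 91.0, 150.0, 200.0, 201.0]
```

The threshold and sample-size cutoff are illustrative; the point is that a cheap statistic filters coincidences before the heavier promotion machinery runs.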

When Promotion Runs

Promotion is part of the mandatory consolidation cycle during MemoryHub.on_session_end():

Consolidation Ordering

    1. Hippocampus sleep  (SCN-aware temporal consolidation + auto-save)
    2. Promote            (NAc + StatAgent patterns → ATL concepts, IPS filters noise)
    3. Consolidate        (ATL + Angular Gyrus consolidation — compress, decay, prune)
    4. Persist            (save ATL + Angular Gyrus + cross-layer graph)

    Ordering is critical: promotion runs BEFORE consolidation so source
    episodes still exist when cross-layer edges form.

Knowledge in the Agent Loop

The point of all this is to make the agent smarter. Knowledge from ATL and Angular Gyrus flows into the LLM's reasoning context via StructuredContext.knowledge_context:

Merged Knowledge Context (what the LLM sees)

    # ATL concepts (semantic knowledge)
    - "coffee_mug" (object, confidence=0.87): A ceramic container typically
      found in kitchens. Relationships: IS_A container, PROPERTY_OF kitchen_area

    # AG patterns (statistical knowledge)
    - "stat:tool:navigate:success" (math:PATTERN, confidence=0.73): Navigate tool
      showing declining success rate (R²=0.73, slope=-0.02/day).

    # Ranked by relevance, capped at 8 entries
    # The LLM sees one unified "what do I know?" section

This merging strategy ensures the LLM gets both semantic knowledge ("coffee mugs are containers in kitchens") and statistical intelligence ("the navigate tool is declining") in a single ranked list, regardless of which memory layer the knowledge lives in.

Memory Consolidation: Sleep

Biological memory consolidation happens during sleep. Maxim implements an analogous process during idle time or session end:

  • Access-based retention: Frequently accessed memories are preserved
  • Graph-aware retention: Well-connected memories in the associative graph score higher for retention—mirroring how strongly linked engrams resist synaptic homeostasis during biological sleep
  • Compression: Old memories shrink from ~2.5KB to ~200 bytes, with their live edge count snapshotted for future retention scoring
  • Pruning: Memories not accessed in one week are removed, and their associative graph edges are cleaned up automatically
  • Protection: High-value memories (user interactions, successes) are never pruned
  • Temporal coverage: SCN-aware consolidation maintains representation across time periods
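The retention factors above might combine into a single score roughly like this (a hypothetical sketch; names, weights, and the 7-day decay are illustrative, not the shipped heuristic):

```python
import math

def retention_score(access_count: int, edge_count: int,
                    days_since_access: float, protected: bool) -> float:
    """Frequent access and dense associative-graph connectivity raise the
    score, staleness decays it on a one-week scale, and protected memories
    (user interactions, successes) can never fall below any prune threshold."""
    if protected:
        return float("inf")
    score = math.log1p(access_count) + 0.5 * math.log1p(edge_count)
    return score * math.exp(-days_since_access / 7.0)

def should_prune(score: float, threshold: float = 0.25) -> bool:
    return score < threshold
```

Using log-scaled counts keeps a handful of well-connected edges valuable without letting hub memories dominate retention outright.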

Memory System Integration

            ┌─────────────┐
            │   Percept   │
            └──────┬──────┘
                   │
    ┌──────────────┼──────────────┐
    ▼              ▼              ▼
┌───────┐  ┌─────────────┐  ┌────────┐
│  SCN  │  │ Hippocampus │  │   EC   │
│(when) │  │   (what)    │  │(like?) │
└───┬───┘  └──────┬──────┘  └───┬────┘
    │          ┌───┴───┐        │
    │          ▼       ▼        │
    │     ┌────────────────┐    │
    │     │ Associative    │    │
    │     │ Graph (linked?)│    │
    │     └───────┬────────┘    │
    │             │             │
    └──────┬──────┴──────┬──────┘
           ▼             ▼
      ┌────────────────────────┐
      │    NAc (predict)       │
      │    What outcome?       │
      └───────────┬────────────┘
                  ▼
           ┌───────────┐
           │  Decision │
           └───────────┘