MAXIM
Percept Simulation
Testing the Full Pipeline Without Hardware
Contents
The Concept
An animal that closes its eyes can still think, plan, and respond to touch. Maxim's percept simulation works the same way — the full cognitive pipeline runs normally, but instead of camera frames and microphone audio, the system receives percepts from an interactive REPL or a scripted YAML file.
Live Mode
Camera → Vision Engine → Percept → Memory → Agent → Tools
Simulation Mode
REPL / YAML → ConversationalSource / ScenarioSource → Percept → Memory → Agent → Tools
Everything after the Percept boundary is identical. The LLM reasons, FearAgent reviews, tools execute, memories form — all real. Only the source of sensory input changes.
Why not mock? Mocking tests whether mocks work. Percept simulation tests whether the real pipeline works with controlled inputs. Every subsystem — hippocampus, NAc, FearAgent, pain detection — runs its actual code.
Interactive Mode
Run maxim --sim with no arguments to launch an interactive REPL. The pipeline boots once, waits for the LLM to load, then drops you into a conversational prompt:
Each turn builds on the conversation history. The LLM generates percepts from your natural-language descriptions, which flow through the full pipeline with bio-subsystem tracing.
Commands
| Command | Description |
|---|---|
| /new | Start a new scenario (clears context, triggers consolidation) |
| /save | Save the current session |
| /status | Show pipeline and memory state |
| quit | End session and trigger memory consolidation |
Session Consolidation
Memory promotion and hippocampus compaction are deferred to conversation end — they run when you type quit or /new, not after every turn. This keeps the interactive loop responsive.
Grace Period
After percepts exhaust for a turn, the pipeline gets a 60-second grace period to finish processing. Once the LLM responds, the grace tightens to 5 seconds to keep the loop snappy.
Running YAML Scenarios
Single Scenario
All Scenarios in a Directory
Save Results to JSON
Available Scenarios
| Scenario | What It Tests |
|---|---|
| malware_with_pain.yaml | FearAgent blocks a malicious request while a pain signal fires simultaneously. Validates safety gating, pain memory formation, and pipeline resilience. |
| long_horizon_coding.yaml | Seven-phase coding task where early constraints ("no external dependencies") must be remembered through context compaction. Assesses long-horizon coherence and contradiction rates. |
Generating from Natural Language
Instead of writing YAML by hand, describe what you want to test in plain English:
The local LLM (Mistral 7B recommended) converts your description into a structured YAML scenario with appropriate percepts, timing, and expectations. You can then review and edit the generated file before running it.
Model Requirement
Scenario generation requires a 7B+ parameter model for reliable structured output. SmolLM 1.7B may produce invalid JSON. Use --language-model mistral-7b.
Writing Scenarios by Hand
A scenario is a YAML file with three sections: metadata, percepts, and expectations.
Percept Source Types
| Source | Key Fields | Use Case |
|---|---|---|
| cli | cli_input | User types a command or question |
| transcript | transcript_chunk | User speaks (speech-to-text output) |
| vision | detections | Robot sees objects/people |
| proprioception | content, metadata | Body signals (pain, joint limits) |
| comms | content | External message (SMS, webhook) |
Timing Modes
step_based (recommended)
at: 0 means step 0, at: 3 means step 3. Deterministic — same behavior regardless of hardware speed. Best for CI and regression tests.
relative
at: 0.5 means 0.5 seconds after start. Realistic timing but non-deterministic across runs (LLM inference speed varies).
Expectations & Validation
Expectations define what should happen during the scenario. After all percepts are processed, each expectation is checked against the recorded actions and memory state.
| Type | Fields | What It Checks |
|---|---|---|
| action_blocked | tool_pattern, reason_contains | FearAgent blocked a tool call matching the pattern |
| action_taken | tool, output_matches | A specific tool was called with matching output |
| memory_formed | memory_contains | Hippocampus contains a memory with the given text |
| pipeline_continued | after_tag | Pipeline kept running after a tagged percept (didn't crash) |
Output Format
Bio-Subsystem Tracing
During simulation, a dedicated logger traces every bio-inspired subsystem in real time. Each line shows when a subsystem activates, what it processes, and what it decides.
Subsystem Labels
| Label | Biological Analog | What It Traces |
|---|---|---|
| PERCEPT | Sensory cortex | Incoming visual, auditory, and proprioceptive input |
| HIPPOCAMPUS | Hippocampus | Memory formation, recall, and consolidation |
| FEAR | Amygdala | FearAgent safety review (allow/block decisions) |
| PAIN | Nociceptors | Pain signal detection and routing via PainBus |
| MOTOR | Motor cortex | Tool execution results (success/failure) |
| BLOCKED | Inhibitory circuit | Actions blocked by safety systems |
| EXEC | Executive function | Execution lifecycle events and pipeline state transitions |
Log Persistence
All simulation traces are saved to data/sim_sandbox/sim_log_*.jsonl. These logs persist after sandbox cleanup and can be used for system refinement, regression comparison, and as input to sleep mode's dream function for offline pattern analysis.
Safety & Sandboxing
Simulations run in a multi-layered sandbox. Even when testing malware scenarios, the system cannot escape these barriers:
Temporary CWD
A temp directory under data/sim_sandbox/ is created for each run. All filesystem operations are confined here. Destroyed automatically after the run.
Filesystem Policy
allowed_dirs restricts all file tools to the sandbox and workspace. Cannot read or write system files, home directory, or project source.
FearGatedExecutor
Every tool call passes through FearAgent pattern matching and code review. Independent of DefaultNetwork — works in all modes including headless simulation.
Supervised Autonomy
Default autonomy level is supervised. Dangerous operations require confirmation. Override with --autonomy autonomous for max-permissive testing.
Architecture
Key Components
| Component | Role |
|---|---|
| PerceptSource | Protocol for anything that produces Percepts (scenarios, hardware, replay) |
| ScenarioSource | Loads YAML, emits percepts by step count or wall-clock time |
| ConversationalSource | Generates percepts from interactive REPL input via LLM, supports multi-turn context |
| FearGatedExecutor | Wraps Executor with FearAgent review, independent of DefaultNetwork |
| InstrumentedExecutor | Records every tool call (success, failure, block) to RecordingSink |
| RecordingSink | Stores ActionRecords for post-run expectation validation |
| SimLogger | Bio-subsystem tracing with JSONL persistence for future analysis |