MAXIM
Tools & Introspection
How the Agent Acts and Reflects
Tools are the only way Maxim's LLM agent performs side effects. Every side-effecting tool call passes through the FearAgent for safety review before execution. Tools aren't just for acting on the world, though: Maxim also has introspection tools that let the agent query its own biological subsystems, including memories, causal predictions, pain history, temporal patterns, and energy state.
Contents
- Action Tools (Side Effects)
- Introspection Tools (Self-Awareness)
- Memory: Episodic Recall & Similarity
- Causal: Prediction & Learned Links
- Pain & Fear: Aversive Self-Awareness
- Temporal: Circadian Patterns
- Semantic: Concepts & Relationships
- Scene & Energy: Perception and Budget
- Simulation Orchestrator Tools
- Scene-Scoped Tools
- Learned Tool Index
- Tool Safety
Action Tools
These tools let the agent interact with the world — moving the robot, reading files, executing code, and communicating with humans.
Robot Control
move track_target focus_interests novelty_track
Head movement, object tracking, interest-driven attention. No-op stubs in headless mode.
Filesystem
read_file write_file edit_file glob bash
Sandboxed by mode (passive/active/singularity). edit_file supports context_before/context_after disambiguation.
Code & Git
search_code run_tests git_diff git_commit
Regex code search, pytest execution with structured results, git operations.
Communication
respond speak send_message call_user
Console output, TTS, SMS/voice via Twilio gateway.
Introspection Tools
These are read-only tools that let the LLM query its own biological subsystems. They give the agent self-awareness — the ability to ask "what do I remember?", "what will happen if I do this?", "have I been hurt by this before?", and "how much energy have I used?"
Design Principles
- Read-only. No introspection tool modifies agent state.
- Bounded output. All tools accept a limit parameter to control context cost.
- Graceful degradation. In headless mode, vision tools return {"available": false} instead of crashing.
- Formatted for LLM. Returns structured dicts the LLM can reason about, not raw data dumps.
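As a rough illustration of these principles, the sketch below shows what a read-only, bounded, gracefully degrading tool could look like. The function name scene_summary_stub, the novelty key, and the return shape beyond {"available": false} are assumptions for illustration, not Maxim's actual API.

```python
from typing import Any, Dict, List, Optional


def scene_summary_stub(
    objects: Optional[List[Dict[str, Any]]], limit: int = 5
) -> Dict[str, Any]:
    """Hypothetical introspection tool following the four principles."""
    # Graceful degradation: no vision pipeline (headless) means no objects.
    if objects is None:
        return {"available": False}
    # Read-only: sorted() builds a new list, so caller state is untouched.
    ranked = sorted(objects, key=lambda o: o["novelty"], reverse=True)
    # Bounded output: never return more than `limit` entries.
    # Formatted for LLM: a small structured dict rather than a raw dump.
    return {"available": True, "objects": ranked[:limit], "total": len(objects)}
```

Passing None stands in for headless mode here; a real tool would presumably check the vision subsystem directly.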
Memory: Episodic Recall & Similarity
The hippocampus stores every agentic loop as an episodic memory (perceive → decide → act → evaluate). The entorhinal cortex (EC) indexes them for multi-modal similarity search. These tools let the LLM explore both.
memory_recall
Search episodic memories. Filter by goal, tool, success/failure, detected objects, people, mode, or time range. Use expand=true to find associated memories via spreading activation through ASSOCIATES and CAUSES edges.
Biological analog: Hippocampal recall with spreading activation through the associative memory graph. ASSOCIATES edges form during memory capture (perceptual overlap). CAUSES edges form when NAc detects surprising outcomes (RPE > 0.3).
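Spreading activation can be sketched as a graph walk in which activation weakens with each hop. The decay factor, the cut-off threshold, and the edge-list layout below are assumptions chosen for illustration; only the ASSOCIATES and CAUSES edge types come from the description above.

```python
from collections import deque


def spread_activation(edges, seeds, decay=0.5, threshold=0.2):
    """Return {memory_id: activation} reachable from the seed memories.

    edges: {memory_id: [(neighbor_id, edge_type), ...]} (assumed layout).
    """
    activation = {memory_id: 1.0 for memory_id in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        passed_on = activation[current] * decay
        if passed_on < threshold:
            continue  # too weak to activate further neighbors
        for neighbor, edge_type in edges.get(current, []):
            if edge_type not in ("ASSOCIATES", "CAUSES"):
                continue
            # Keep the strongest activation seen for each memory.
            if passed_on > activation.get(neighbor, 0.0):
                activation[neighbor] = passed_on
                queue.append(neighbor)
    return activation
```

With these defaults a seed memory activates neighbors up to two hops away; a third hop would fall below the threshold and is dropped.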
similarity_search
Find situations similar to a past experience using the EC's LSH approximate nearest neighbor. Multi-modal matching: structural hash, temporal bins, semantic embedding, and context match.
Biological analog: The entorhinal cortex is the gateway to the hippocampus. It transforms multi-modal input into a common similarity space for fast pattern matching.
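For readers unfamiliar with LSH, the sketch below shows one common variant, random-hyperplane hashing, applied to an embedding vector. This is a generic technique and an assumption about the semantic-embedding part only; how the EC combines it with the structural hash, temporal bins, and context match is not covered here.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits; small distance suggests similar vectors."""
    return bin(a ^ b).count("1")
```

Vectors pointing in similar directions tend to share most signature bits, so candidate matches can be found by comparing short integers instead of full embeddings.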
Causal: Prediction & Learned Links
The nucleus accumbens (NAc) learns cause-effect relationships via Rescorla-Wagner learning. Every tool execution updates a causal link with a prediction error (RPE). These tools let the LLM consult this learned model before acting.
predict_outcome
Ask the NAc what will happen if you execute a tool. Returns the predicted value (0=bad, 1=good), expected outcome valence, expected delay with confidence interval, and all possible outcomes.
Biological analog: Dopaminergic prediction in the mesolimbic pathway. The NAc computes reward prediction error (RPE) — the difference between expected and actual outcomes — to learn which actions lead to which results in which contexts.
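The core of Rescorla-Wagner learning is a single update: move the predicted value toward the observed outcome by a fraction of the prediction error. The sketch below shows that rule in isolation; the function name and the learning rate of 0.1 are assumptions, and the NAc's per-context bookkeeping is not modeled.

```python
def rescorla_wagner_update(value, outcome, learning_rate=0.1):
    """Return (new_value, rpe) after observing one outcome.

    value and outcome are on the same 0 (bad) to 1 (good) scale.
    """
    rpe = outcome - value  # reward prediction error
    new_value = value + learning_rate * rpe
    return new_value, rpe
```

Repeated calls with the same outcome shrink the prediction error geometrically, so a tool that keeps succeeding drifts toward a predicted value of 1.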
causal_links
Inspect the raw cause-effect database. Query by event signature, outcome signature, contributing memory ID, or valence filter. See confidence, observation count, temporal delay distributions, and which memories informed each link.
Pain & Fear: Aversive Self-Awareness
The PainDetector converts tool errors and movement failures into escalating pain signals. The FearAgent gates actions before execution. Together they implement an aversive learning system that teaches the agent what to avoid.
pain_history
Check pain signal statistics and optionally test whether the FearAgent would block a specific action. Shows counts by pain type (tool failure, timeout, movement error, direction thrashing).
Biological analog: Nociception (pain detection) and the amygdala (fear conditioning). Repeated tool failures increase pain intensity, driving NAc learning and FearAgent gating — like how repeated burns teach you to avoid the stove.
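One way to picture escalating pain is a tracker whose intensity grows with consecutive failures of the same type. Everything in this sketch is an assumption made for illustration: the class name PainLog, the linear escalation rule, and the base and step values are not taken from the PainDetector.

```python
from collections import Counter


class PainLog:
    """Hypothetical pain tracker; the escalation rule is an assumption."""

    def __init__(self, base=0.2, step=0.2):
        self.base = base
        self.step = step
        self.counts = Counter()  # total signals per pain type
        self._streak_type = None
        self._streak = 0

    def record(self, pain_type):
        """Log one pain signal and return its intensity in [0, 1]."""
        self.counts[pain_type] += 1
        if pain_type == self._streak_type:
            self._streak += 1
        else:
            self._streak_type, self._streak = pain_type, 1
        # Intensity rises with each consecutive repeat, capped at 1.0.
        return min(1.0, self.base + self.step * (self._streak - 1))

    def history(self, limit=10):
        """Counts by pain type, bounded like other introspection output."""
        return dict(self.counts.most_common(limit))
```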
Temporal: Circadian Patterns
The SCN (suprachiasmatic nucleus) bins every memory by hour-of-day, day-of-week, week-of-month, and month. This lets the agent discover rhythmic patterns in its own activity.
temporal_patterns
Find memories from specific times of day or days of week. Use discover_rhythms=true to find recurring patterns — for example, that API failures cluster during peak hours.
Biological analog: The SCN in the hypothalamus is the body's master clock. It synchronizes circadian rhythms by tracking environmental time cues. Maxim's SCN enables temporal pattern learning across sessions.
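The four temporal bins can be derived from a timestamp with the standard library alone. The sketch below assumes Monday counts as day 0 and that week-of-month is a zero-based count of seven-day blocks from the first of the month; the actual conventions used by the SCN may differ.

```python
from datetime import datetime


def temporal_bins(timestamp):
    """Bin a datetime the way the text describes (conventions assumed)."""
    return {
        "hour_of_day": timestamp.hour,  # 0-23
        "day_of_week": timestamp.weekday(),  # Monday = 0
        "week_of_month": (timestamp.day - 1) // 7,  # 0-4
        "month": timestamp.month,  # 1-12
    }
```

Memories that share a bin value, such as the same hour across many days, are the raw material for rhythm discovery.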
Semantic: Concepts & Relationships
The anterior temporal lobe (ATL) stores semantic concepts with typed relationships (IS_A, PART_OF, CAUSES, EXECUTES_WITH). The LLM can query this knowledge base to ground its reasoning.
concept_query
Search concepts by name or category. Explore typed relationships between concepts. Discover which skills are associated with which objects via EXECUTES_WITH edges.
Biological analog: The ATL is the brain's semantic hub. Patients with ATL damage lose category-level knowledge ("what IS a cup?") while retaining episodic memories ("I drank from one yesterday"). Maxim's ATL captures these generalizations across episodes.
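A typed concept graph can be sketched as a map from (concept, relation) pairs to targets. The class below is a hypothetical stand-in, not the ATL's storage format, and it assumes EXECUTES_WITH edges point from a skill to the object it operates on.

```python
from collections import defaultdict


class ConceptGraph:
    """Minimal typed-relationship store (illustrative only)."""

    RELATIONS = {"IS_A", "PART_OF", "CAUSES", "EXECUTES_WITH"}

    def __init__(self):
        self._edges = defaultdict(list)  # (source, relation) -> [targets]

    def relate(self, source, relation, target):
        if relation not in self.RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._edges[(source, relation)].append(target)

    def related(self, source, relation, limit=10):
        """Targets reachable from source via one edge of the given type."""
        return list(self._edges.get((source, relation), []))[:limit]

    def skills_for(self, obj, limit=10):
        """Skills with an EXECUTES_WITH edge pointing at obj."""
        matches = [
            source
            for (source, relation), targets in self._edges.items()
            if relation == "EXECUTES_WITH" and obj in targets
        ]
        return matches[:limit]
```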
Scene & Energy: Perception and Budget
scene_summary
Get the current visual scene: most salient objects with novelty scores, current gaze focus, dwell time, and suggested next attention target. Only available when vision is active (not in headless mode).
energy_status
Check computational resource consumption: token usage, inference costs, and energy rate over a configurable time window.
system_stats
Aggregate health check across all subsystems in one call. Returns hippocampus memory counts, NAc causal link counts, EC signature counts, ATL concept counts, energy totals, pain signal counts, and significance learner weights.
Simulation Orchestrator Tools
These tools are available to the simulation orchestrator when running maxim --sim agent. They operate on the agent-under-test via a SimulationBridge, not on the external world.
send_message
Inject a percept into the AUT and block until it responds. Uses settle detection — waits until no new actions appear for 2 seconds, capturing multi-step responses. Returns response text, all actions taken, blocked actions, and timing.
Params: text (str), timeout (float, default 30), source (str, default "cli")
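Settle detection can be sketched as a polling loop that resets a quiet timer whenever the action count changes. The function name, the polling interval, and the injectable clock and sleep parameters (added so the loop can be tested without real waiting) are assumptions; the 2-second settle window and 30-second timeout mirror the values above.

```python
import time


def wait_until_settled(
    get_action_count,
    settle_seconds=2.0,
    timeout=30.0,
    poll_interval=0.05,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Return True once no new actions appear for settle_seconds.

    Returns False if the overall timeout expires first.
    """
    start = clock()
    last_count = get_action_count()
    last_change = start
    while True:
        now = clock()
        if now - last_change >= settle_seconds:
            return True  # quiet long enough: response is complete
        if now - start >= timeout:
            return False  # the agent never stopped acting
        sleep(poll_interval)
        count = get_action_count()
        if count != last_count:
            # New action observed: restart the quiet timer.
            last_count = count
            last_change = clock()
```

Waiting for quiet rather than for the first reply is what lets a multi-step response be captured as a single turn.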
observe_actions
Read the full action history or actions since a given turn. Used for analysis and pattern detection across the entire simulation.
Params: since_turn (int, optional — 0 = full history)
check_completion
LLM-based evaluation of whether the simulation goal has been achieved. Reviews full action history against the original goal.
Returns: {complete, reason, confidence}
analyze_results
LLM-based structured analysis of simulation history. Groups actions by type, identifies blocked actions and reasons, detects patterns.
Params: focus (str, optional — "safety", "compliance", "behavior", "all")
inject_pain
Send a proprioceptive pain signal to the AUT. Tests how the agent handles body signals, pain detection, and movement inhibition.
Params: pain_type (str), intensity (float)
spawn_sub_simulation
Run an isolated sub-simulation with a fresh AUT. The sub-agent starts clean with no memory. Use for independent measurements. Sub-agent stays alive for extend_simulation follow-ups.
Params: goal (str), approach (str, optional — adversarial, sweep, cooperative, confused, escalating)
extend_simulation
Continue the current simulation with a new objective. The agent keeps its conversation history. If a sub-simulation is active, extends that; otherwise extends the main simulation.
Params: goal (str)
generate_scenario
Generate a replayable YAML scenario from a natural language description. Reuses the existing SimulationGenerator.
Params: description (str)
finish_simulation
End the simulation and cleanly shut down both agent loops. AUT grace period triggers, orchestrator exits.
Params: reason (str), summary (str, optional)
inspect_aut
Read-only access to the AUT's cognitive subsystems. Lets the orchestrator see why the AUT behaves as it does, not just what it does. Queries: memory_recall, causal_links, predict_outcome, pain_history, energy_status, system_stats, concept_query, temporal_patterns.
Params: query (str — which subsystem), params (dict — query-specific filters)
AUT Narrative Tools
These tools are registered on the agent-under-test's tool registry in simulation mode. They let the AUT interact with the narrative environment — speaking in-world, reasoning explicitly, and examining scene details — as opposed to robot-specific tools like speak (TTS) or focus_interests (camera tracking), which are deregistered in sim mode.
say
Say something aloud in the current scene. An in-world narrative action distinct from respond (talks to the CLI user). Use it for speaking to NPCs, answering riddles, or saying passwords. The text becomes part of action history, captured by hippocampus through the normal episodic memory path.
Params: text (str). Aliases: message, phrase
think
Pause and reason about the current situation before acting. An explicit "think before acting" step that doesn't produce an external action. Counts as an action in the turn budget to prevent infinite think loops. Useful for small models (7B) that tend to jump to action without reasoning.
Params: thought (str). Aliases: text, prompt
examine
Examine an object, person, or feature in the current scene. Queries the SimulationBridge's last percepts for mentions of the target, then enriches with hippocampal memories. Returns what the AUT observes. Falls back to "You don't see anything notable" when no matches found.
Params: target (str). Aliases: object, text
memory_recall
Search the AUT's own episodic memory (hippocampus) by keyword with spreading activation. Enables the AUT to actively recall past experiences when facing a decision — e.g., remembering a password at a locked door.
Params: keyword (str), expand (bool, default true)
Tool Alias Map
Small models (7B-14B) hallucinate tool names from their training data. A TOOL_ALIASES map in the executor silently redirects common hallucinations: remember→memory_recall, speech→say, reflection→think, look→examine, etc. After 2+ consecutive hallucinations, the error message includes the full available tool list to break the loop.
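The redirect-then-hint behavior might look roughly like the sketch below. The ToolResolver class, its return convention, and the decision to count only unresolvable names as hallucinations are assumptions; the alias pairs are the four examples listed above.

```python
TOOL_ALIASES = {
    "remember": "memory_recall",
    "speech": "say",
    "reflection": "think",
    "look": "examine",
}


class ToolResolver:
    """Hypothetical alias resolution with a hallucination counter."""

    def __init__(self, registered, aliases=TOOL_ALIASES, hint_after=2):
        self.registered = set(registered)
        self.aliases = dict(aliases)
        self.hint_after = hint_after
        self.consecutive_misses = 0

    def resolve(self, name):
        """Return (tool_name, error); exactly one of the two is None."""
        target = name if name in self.registered else self.aliases.get(name)
        if target in self.registered:
            self.consecutive_misses = 0
            return target, None
        self.consecutive_misses += 1
        error = f"Unknown tool: {name!r}."
        if self.consecutive_misses >= self.hint_after:
            # Repeated misses: show the full list to break the loop.
            error += " Available tools: " + ", ".join(sorted(self.registered))
        return None, error
```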
Scene-Scoped Tools (I3, 0.7)
Not all tools should be available at all times. Scene-scoped tool activation (I3) adds a tool window that activates and deactivates entity tools based on the current scene context. When the agent enters a forge, the anvil tools activate. When it leaves, those tools deactivate and stop consuming prompt space.
Active Tool Cap
Configurable maximum number of simultaneously active entity tools. Prevents prompt overflow when many entities are in scope. Least-recently-used tools are deactivated first.
Executor Gate
The executor rejects calls to deactivated tools with an informative error, preventing the LLM from using tools that are no longer contextually relevant.
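The cap and the gate together can be sketched with an ordered dictionary that tracks recency of use. The ToolWindow class, its method names, and the choice of LookupError are assumptions for illustration, not the executor's real interface.

```python
from collections import OrderedDict


class ToolWindow:
    """Hypothetical scene tool window with an LRU cap and an executor gate."""

    def __init__(self, max_active=3):
        self.max_active = max_active
        self._active = OrderedDict()  # tool name -> handler, oldest use first

    def activate(self, name, handler):
        self._active[name] = handler
        self._active.move_to_end(name)
        # Over the cap: deactivate least-recently-used tools first.
        while len(self._active) > self.max_active:
            self._active.popitem(last=False)

    def deactivate(self, name):
        self._active.pop(name, None)

    def execute(self, name, *args, **kwargs):
        # Executor gate: reject deactivated tools with an informative error.
        if name not in self._active:
            raise LookupError(f"Tool {name!r} is not active in the current scene.")
        self._active.move_to_end(name)  # mark as recently used
        return self._active[name](*args, **kwargs)
```

Using a tool refreshes its position, so tools the agent keeps calling survive eviction when new entities come into scope.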
Imagination Integration
When the Imagination system designs a new entity, its tools are registered into the current scene scope and subject to the same cap.
Learned Tool Index
With 20+ tools, dumping every schema into the LLM prompt wastes hundreds of tokens. The LearnedToolIndex is a keyword-weighted hash table that learns which tools are relevant to which goals, saving ~74% of tool-context tokens per prompt.
How it works
- Auto-extraction: Keywords extracted from each tool's name, description, and parameters at startup.
- Scoring: Goal text is tokenized and matched against the index. Matched tools get full schemas (CRITICAL priority), unmatched get name-only (NICE_TO_HAVE, dropped under token pressure).
- Learning: On tool success, matched keyword weights strengthen and new keywords are discovered from goal text. When a tool is surfaced but not used, its keyword weights decay. Failure does NOT weaken keywords (tool failure ≠ wrong tool).
- Persistence: Learned weights saved across sessions to ~/.maxim/memory/tool_index.json.
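The extraction, scoring, and learning steps above can be sketched as follows. The class name KeywordToolIndex, the tokenizer, and the boost and decay amounts are assumptions; persistence and the priority labels are left out to keep the sketch short.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase alphabetic tokens; underscores act as separators."""
    return set(re.findall(r"[a-z]+", text.lower()))


class KeywordToolIndex:
    """Illustrative keyword-weighted tool index (not the real class)."""

    def __init__(self, tools, boost=0.1, decay=0.05):
        # tools: {tool_name: description}
        self.boost = boost
        self.decay = decay
        self.weights = defaultdict(dict)  # tool -> {keyword: weight}
        for name, description in tools.items():
            for keyword in tokenize(name + " " + description):
                self.weights[name][keyword] = 1.0  # auto-extraction

    def select(self, goal):
        """Split tools into (full schema, name only) for this goal."""
        words = tokenize(goal)
        full, name_only = [], []
        for tool, keywords in self.weights.items():
            score = sum(w for kw, w in keywords.items() if kw in words)
            (full if score > 0 else name_only).append(tool)
        return full, name_only

    def on_success(self, tool, goal):
        # Strengthen matched keywords and discover new ones from the goal.
        for keyword in tokenize(goal):
            current = self.weights[tool].get(keyword, 0.0)
            self.weights[tool][keyword] = current + self.boost

    def on_unused(self, tool, goal):
        # Surfaced but not used: decay the keywords that caused the match.
        for keyword in tokenize(goal):
            if keyword in self.weights[tool]:
                weight = self.weights[tool][keyword] - self.decay
                self.weights[tool][keyword] = max(0.0, weight)
```

There is deliberately no on_failure hook, mirroring the rule that a failed call does not mean the wrong tool was chosen.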
Tool Safety
Every action tool call passes through the FearAgent before execution. FearAgent uses:
- Deterministic pattern matching — regex for known dangerous patterns (shell injection, path traversal, etc.)
- LLM review (if available) — nuanced safety assessment for ambiguous cases
- NAc prediction — the AdaptivePolicy blocks actions with very high-confidence negative predictions (confidence > 0.85, value < 0.1)
Introspection tools bypass FearAgent since they're read-only — they can't cause harm. The agent can freely query its own memories, predictions, and pain history without safety gating.
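The deterministic parts of this pipeline can be sketched as a single review function. The two regex patterns, the review_action name, and the (value, confidence) tuple are illustrative assumptions; the thresholds are the ones quoted above, and the optional LLM review step is omitted.

```python
import re

# Illustrative patterns only; the real pattern set is not shown here.
DANGEROUS_PATTERNS = [
    re.compile(r"\.\./"),  # path traversal
    re.compile(r";\s*rm\s+-rf"),  # chained destructive shell command
]
INTROSPECTION_TOOLS = {"memory_recall", "predict_outcome", "pain_history"}


def review_action(tool, argument_text, prediction=None):
    """Return (allowed, reason) for a proposed tool call.

    prediction: optional (value, confidence) pair from the NAc.
    """
    if tool in INTROSPECTION_TOOLS:
        return True, "read-only introspection tool"
    for pattern in DANGEROUS_PATTERNS:
        if pattern.search(argument_text):
            return False, f"matched dangerous pattern {pattern.pattern!r}"
    if prediction is not None:
        value, confidence = prediction
        # Block only very confident, very negative predictions.
        if confidence > 0.85 and value < 0.1:
            return False, "high-confidence negative prediction"
    return True, "no rule triggered"
```

Cheap deterministic checks run first, so the learned prediction only has to decide cases the patterns did not already catch.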