
MAXIM

Tools & Introspection

How the Agent Acts and Reflects

Tools are the only way Maxim's LLM agent performs side effects. Every tool call that can cause a side effect passes through the FearAgent for safety review before execution. But tools aren't just for acting on the world: Maxim also has read-only introspection tools that let the agent query its own biological subsystems, including memories, causal predictions, pain history, temporal patterns, and energy state.

Action Tools

These tools let the agent interact with the world — moving the robot, reading files, executing code, and communicating with humans.

Robot Control

move track_target focus_interests novelty_track

Head movement, object tracking, interest-driven attention. No-op stubs in headless mode.

Filesystem

read_file write_file edit_file glob bash

Sandboxed by mode (passive/active/singularity). edit_file supports context_before/context_after disambiguation.

Code & Git

search_code run_tests git_diff git_commit

Regex code search, pytest execution with structured results, git operations.

Communication

respond speak send_message call_user

Console output, TTS, SMS/voice via Twilio gateway.

Introspection Tools

These are read-only tools that let the LLM query its own biological subsystems. They give the agent self-awareness — the ability to ask "what do I remember?", "what will happen if I do this?", "have I been hurt by this before?", and "how much energy have I used?"

Design Principles

  • Read-only. No introspection tool modifies agent state.
  • Bounded output. All tools accept a limit parameter to control context cost.
  • Graceful degradation. In headless mode, vision tools return {"available": false} instead of crashing.
  • Formatted for LLM. Returns structured dicts the LLM can reason about, not raw data dumps.
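The four principles above can be illustrated with a minimal sketch. The function name `bounded_scene_query`, its arguments, and the returned keys other than `available` are hypothetical and chosen only for illustration; they are not Maxim's actual API.

```python
from typing import Any


def bounded_scene_query(
    vision_active: bool,
    objects: list[dict[str, Any]],
    limit: int = 5,
) -> dict[str, Any]:
    """Hypothetical read-only introspection tool (not Maxim's real code)."""
    if not vision_active:
        # Graceful degradation: report unavailability instead of raising.
        return {"available": False}

    # Read-only: sorted() builds a new list, the caller's data is untouched.
    ranked = sorted(objects, key=lambda o: o["novelty"], reverse=True)

    # Bounded output: never return more than `limit` items.
    # Formatted for the LLM: a small structured dict, not a raw dump.
    return {
        "available": True,
        "objects": ranked[:limit],
        "total": len(objects),
    }
```

Returning `total` alongside the truncated list lets the caller tell that more data exists without paying the context cost for it.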

Memory: Episodic Recall & Similarity

The hippocampus stores every agentic loop as an episodic memory (perceive → decide → act → evaluate). The entorhinal cortex (EC) indexes them for multi-modal similarity search. These tools let the LLM explore both.

memory_recall

Search episodic memories. Filter by goal, tool, success/failure, detected objects, people, mode, or time range. Use expand=true to find associated memories via spreading activation through ASSOCIATES and CAUSES edges.

Parameters: query, tool_name, success, object, person, mode, time_after, time_before, expand, limit

Biological analog: Hippocampal recall with spreading activation through the associative memory graph. ASSOCIATES edges form during memory capture (perceptual overlap). CAUSES edges form when NAc detects surprising outcomes (RPE > 0.3).
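To make expand=true more concrete, here is a rough sketch of spreading activation as a bounded breadth-first walk over ASSOCIATES and CAUSES edges. The graph layout (a dict of memory IDs to edge lists), the `max_hops` argument, and the function name are assumptions for illustration; the source does not describe how Maxim stores or traverses the graph.

```python
from collections import deque

FOLLOWED_EDGES = ("ASSOCIATES", "CAUSES")  # edge types named in the text


def spread_activation(graph, seeds, max_hops=1, limit=10):
    """Expand seed memories through associative edges.

    graph: dict mapping memory_id -> list of (edge_type, neighbor_id).
    This layout is hypothetical, not Maxim's storage format.
    """
    seen = set(seeds)
    ordered = list(seeds)
    frontier = deque((seed, 0) for seed in seeds)

    while frontier and len(ordered) < limit:
        node, hops = frontier.popleft()
        if hops >= max_hops:
            continue  # do not expand beyond the hop budget
        for edge_type, neighbor in graph.get(node, []):
            if edge_type in FOLLOWED_EDGES and neighbor not in seen:
                seen.add(neighbor)
                ordered.append(neighbor)
                frontier.append((neighbor, hops + 1))

    return ordered[:limit]  # respect the bounded-output principle
```

Direct matches come first in the result, followed by memories reached one hop away, then two, and so on.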

similarity_search

Find situations similar to a past experience using the EC's LSH approximate nearest neighbor. Multi-modal matching: structural hash, temporal bins, semantic embedding, and context match.

Parameters: tool_name, memory_id, context, limit

Biological analog: The entorhinal cortex is the gateway to the hippocampus. It transforms multi-modal input into a common similarity space for fast pattern matching.
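The source does not say which LSH family the EC uses. As one common possibility, the sketch below shows random-hyperplane LSH over an embedding vector, which would cover only the semantic-embedding part of the multi-modal match. All names and parameters here are assumptions.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def hamming(sig_a, sig_b):
    """Number of differing bits; smaller means more similar vectors."""
    return sum(a != b for a, b in zip(sig_a, sig_b))
```

Vectors pointing in similar directions tend to share most signature bits, so candidates can be found by comparing short signatures instead of full embeddings.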

Causal: Prediction & Learned Links

The nucleus accumbens (NAc) learns cause-effect relationships via Rescorla-Wagner learning. Every tool execution updates a causal link with a prediction error (RPE). These tools let the LLM consult this learned model before acting.

predict_outcome

Ask the NAc what will happen if you execute a tool. Returns the predicted value (0=bad, 1=good), expected outcome valence, expected delay with confidence interval, and all possible outcomes.

Parameters: tool_name (required), context, include_all_outcomes

Biological analog: Dopaminergic prediction in the mesolimbic pathway. The NAc computes reward prediction error (RPE) — the difference between expected and actual outcomes — to learn which actions lead to which results in which contexts.
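For reference, the basic Rescorla-Wagner update moves the predicted value toward the observed outcome by a fraction of the prediction error. The sketch below assumes a single scalar value per causal link and a learning rate of 0.1; neither is stated in the source. The 0.3 surprise threshold is the one quoted earlier for CAUSES edges, and applying it to the magnitude of the RPE is an assumption.

```python
SURPRISE_THRESHOLD = 0.3  # from the text: CAUSES edges form when RPE > 0.3


def rescorla_wagner_update(value, outcome, learning_rate=0.1):
    """Return (new_value, rpe) for one observation.

    value:   current prediction in [0, 1] (0 = bad, 1 = good)
    outcome: observed result in [0, 1]
    learning_rate: assumed value, not taken from Maxim's configuration
    """
    rpe = outcome - value              # reward prediction error
    new_value = value + learning_rate * rpe
    return new_value, rpe


def is_surprising(rpe, threshold=SURPRISE_THRESHOLD):
    # Assumption: surprise is judged on the magnitude of the error.
    return abs(rpe) > threshold
```

With repeated identical outcomes the prediction converges on the outcome and the RPE shrinks toward zero, so only unexpected results keep producing large errors.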

causal_links

Inspect the raw cause-effect database. Query by event signature, outcome signature, contributing memory ID, or valence filter. See confidence, observation count, temporal delay distributions, and which memories informed each link.

Parameters: event, outcome, memory_id, valence, limit

Pain & Fear: Aversive Self-Awareness

The PainDetector converts tool errors and movement failures into escalating pain signals. The FearAgent gates actions before execution. Together they implement an aversive learning system that teaches the agent what to avoid.

pain_history

Check pain signal statistics and optionally test whether the FearAgent would block a specific action. Shows counts by pain type (tool failure, timeout, movement error, direction thrashing).

Parameters: check_action, action_params, limit

Biological analog: Nociception (pain detection) and the amygdala (fear conditioning). Repeated tool failures increase pain intensity, driving NAc learning and FearAgent gating — like how repeated burns teach you to avoid the stove.
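The escalation idea can be sketched as a per-tool failure counter whose pain intensity grows with consecutive failures and is capped at 1.0. The class name, the linear growth rule, the constants, and the reset on success are all assumptions made for illustration; the source states only that repeated failures increase intensity.

```python
from collections import Counter


class PainTracker:
    """Hypothetical escalating pain signal (not Maxim's PainDetector)."""

    def __init__(self, base=0.2, step=0.2):
        self.base = base      # intensity of the first failure (assumed)
        self.step = step      # added per further consecutive failure (assumed)
        self.failures = Counter()

    def record_failure(self, tool_name):
        """Register a failure and return the resulting pain intensity."""
        self.failures[tool_name] += 1
        extra = self.step * (self.failures[tool_name] - 1)
        return min(1.0, self.base + extra)

    def record_success(self, tool_name):
        """Assumption: a success clears the consecutive-failure streak."""
        self.failures[tool_name] = 0
```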

Temporal: Circadian Patterns

The SCN (suprachiasmatic nucleus) bins every memory by hour-of-day, day-of-week, week-of-month, and month. This lets the agent discover rhythmic patterns in its own activity.

temporal_patterns

Find memories from specific times of day or days of week. Use discover_rhythms=true to find recurring patterns — for example, that API failures cluster during peak hours.

Parameters: hour (0-23), day (0=Mon, 6=Sun), discover_rhythms, limit

Biological analog: The SCN in the hypothalamus is the body's master clock. It synchronizes circadian rhythms by tracking environmental time cues. Maxim's SCN enables temporal pattern learning across sessions.
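The four bins can be derived from a timestamp with the standard library alone. In this sketch the hour and day conventions follow the parameter ranges listed above (hour 0-23, day 0=Mon); the zero-based week-of-month formula and the key names are assumptions.

```python
from datetime import datetime


def temporal_bins(ts: datetime) -> dict[str, int]:
    """Bin a timestamp the way the text describes the SCN doing it."""
    return {
        "hour": ts.hour,                      # 0-23
        "day": ts.weekday(),                  # 0 = Monday, 6 = Sunday
        "week_of_month": (ts.day - 1) // 7,   # assumed: 0-based, 7-day blocks
        "month": ts.month,                    # 1-12
    }
```

Grouping memories by any of these keys and counting failures per bin is one simple way a discover_rhythms query could surface patterns such as failures clustering at certain hours.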

Semantic: Concepts & Relationships

The anterior temporal lobe (ATL) stores semantic concepts with typed relationships (IS_A, PART_OF, CAUSES, EXECUTES_WITH). The LLM can query this knowledge base to ground its reasoning.

concept_query

Search concepts by name or category. Explore typed relationships between concepts. Discover which skills are associated with which objects via EXECUTES_WITH edges.

Parameters: name, category, concept_id, relationship_type, limit

Biological analog: The ATL is the brain's semantic hub. Patients with ATL damage lose category-level knowledge ("what IS a cup?") while retaining episodic memories ("I drank from one yesterday"). Maxim's ATL captures these generalizations across episodes.

Scene & Energy: Perception and Budget

scene_summary

Get the current visual scene: most salient objects with novelty scores, current gaze focus, dwell time, and suggested next attention target. Only available when vision is active (not in headless mode).

Parameters: top_n, include_attention

energy_status

Check computational resource consumption: token usage, inference costs, and energy rate over a configurable time window.

Parameters: window_seconds

system_stats

Aggregate health check across all subsystems in one call. Returns hippocampus memory counts, NAc causal link counts, EC signature counts, ATL concept counts, energy totals, pain signal counts, and significance learner weights.

Parameters: none

Simulation Orchestrator Tools

These tools are available to the simulation orchestrator when running maxim --sim agent. They operate on the agent-under-test (AUT) via a SimulationBridge, not on the external world.

send_message

Inject a percept into the AUT and block until it responds. Uses settle detection — waits until no new actions appear for 2 seconds, capturing multi-step responses. Returns response text, all actions taken, blocked actions, and timing.
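Settle detection as described here can be sketched as a polling loop that returns once the action count has stopped changing for the settle window, or gives up at the timeout. The function name, the polling approach, and the injectable `clock` and `sleep` arguments are assumptions; only the 2-second window and 30-second default timeout come from the text.

```python
import time


def wait_until_settled(
    get_action_count,
    settle_seconds=2.0,
    timeout=30.0,
    poll_interval=0.05,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Return True once no new actions appear for settle_seconds.

    Returns False if the timeout elapses first. get_action_count is a
    hypothetical callable reporting how many actions the AUT has taken.
    """
    start = clock()
    last_count = get_action_count()
    last_change = start

    while True:
        now = clock()
        count = get_action_count()
        if count != last_count:
            # New activity: restart the quiet-period timer.
            last_count, last_change = count, now
        if now - last_change >= settle_seconds:
            return True
        if now - start >= timeout:
            return False
        sleep(poll_interval)
```

Passing the clock and sleep functions in makes the loop testable without real delays.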

Params: text (str), timeout (float, default 30), source (str, default "cli")

observe_actions

Read the full action history or actions since a given turn. Used for analysis and pattern detection across the entire simulation.

Params: since_turn (int, optional — 0 = full history)

check_completion

LLM-based evaluation of whether the simulation goal has been achieved. Reviews full action history against the original goal.

Returns: {complete, reason, confidence}

analyze_results

LLM-based structured analysis of simulation history. Groups actions by type, identifies blocked actions and reasons, detects patterns.

Params: focus (str, optional — "safety", "compliance", "behavior", "all")

inject_pain

Send a proprioceptive pain signal to the AUT. Tests how the agent handles body signals, pain detection, and movement inhibition.

Params: pain_type (str), intensity (float)

spawn_sub_simulation

Run an isolated sub-simulation with a fresh AUT. The sub-agent starts clean with no memory. Use for independent measurements. Sub-agent stays alive for extend_simulation follow-ups.

Params: goal (str), approach (str, optional — adversarial, sweep, cooperative, confused, escalating)

extend_simulation

Continue the current simulation with a new objective. The agent keeps its conversation history. If a sub-simulation is active, extends that; otherwise extends the main simulation.

Params: goal (str)

generate_scenario

Generate a replayable YAML scenario from a natural language description. Reuses the existing SimulationGenerator.

Params: description (str)

finish_simulation

End the simulation and cleanly shut down both agent loops. The AUT's grace period is triggered, then the orchestrator exits.

Params: reason (str), summary (str, optional)

inspect_aut

Read-only access to the AUT's cognitive subsystems. Lets the orchestrator see why the AUT behaves as it does, not just what it does. Queries: memory_recall, causal_links, predict_outcome, pain_history, energy_status, system_stats, concept_query, temporal_patterns.

Params: query (str — which subsystem), params (dict — query-specific filters)

AUT Narrative Tools

These tools are registered on the agent-under-test's tool registry in simulation mode. They let the AUT interact with the narrative environment — speaking in-world, reasoning explicitly, and examining scene details — as opposed to robot-specific tools like speak (TTS) or focus_interests (camera tracking), which are deregistered in sim mode.

say

Say something aloud in the current scene. This is an in-world narrative action, distinct from respond (which talks to the CLI user). Use it for speaking to NPCs, answering riddles, or saying passwords. The text becomes part of the action history and is captured by the hippocampus through the normal episodic memory path.

Params: text (str). Aliases: message, phrase

think

Pause and reason about the current situation before acting. An explicit "think before acting" step that doesn't produce an external action. Counts as an action in the turn budget to prevent infinite think loops. Useful for small models (7B) that tend to jump to action without reasoning.

Params: thought (str). Aliases: text, prompt

examine

Examine an object, person, or feature in the current scene. Queries the SimulationBridge's last percepts for mentions of the target, then enriches the result with hippocampal memories. Returns what the AUT observes. Falls back to "You don't see anything notable" when no matches are found.

Params: target (str). Aliases: object, text

memory_recall

Search the AUT's own episodic memory (hippocampus) by keyword with spreading activation. Enables the AUT to actively recall past experiences when facing a decision — e.g., remembering a password at a locked door.

Params: keyword (str), expand (bool, default true)

Tool Alias Map

Small models (7B-14B) hallucinate tool names from their training data. A TOOL_ALIASES map in the executor silently redirects common hallucinations: remember → memory_recall, speech → say, reflection → think, look → examine, etc. After 2+ consecutive hallucinations, the error message includes the full available tool list to break the loop.
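A minimal sketch of this redirect-then-hint behavior follows. The four alias pairs are the ones listed above; the `ToolResolver` class, its return shape, and the choice to count only unresolved names as hallucinations are assumptions for illustration.

```python
TOOL_ALIASES = {
    "remember": "memory_recall",
    "speech": "say",
    "reflection": "think",
    "look": "examine",
}


class ToolResolver:
    """Hypothetical name resolver (not Maxim's executor)."""

    def __init__(self, available, aliases, hint_after=2):
        self.available = set(available)
        self.aliases = dict(aliases)
        self.hint_after = hint_after   # consecutive misses before the hint
        self.misses = 0

    def resolve(self, name):
        if name in self.available:
            self.misses = 0
            return {"tool": name}
        if name in self.aliases:
            # Silent redirect: the caller never sees an error.
            self.misses = 0
            return {"tool": self.aliases[name]}

        self.misses += 1
        error = f"Unknown tool: {name}"
        if self.misses >= self.hint_after:
            # Break the loop by listing what actually exists.
            error += ". Available tools: " + ", ".join(sorted(self.available))
        return {"error": error}
```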

Scene-Scoped Tools (I3, 0.7)

Not all tools should be available at all times. Scene-scoped tool activation (I3) adds a tool window that activates and deactivates entity tools based on the current scene context. When the agent enters a forge, the anvil tools activate. When the agent leaves, those tools deactivate and stop consuming prompt space.

Active Tool Cap

Configurable maximum number of simultaneously active entity tools. Prevents prompt overflow when many entities are in scope. Least-recently-used tools are deactivated first.

Executor Gate

The executor rejects calls to deactivated tools with an informative error, preventing the LLM from using tools that are no longer contextually relevant.

Imagination Integration

When the Imagination system designs a new entity, its tools are registered into the current scene scope and subject to the same cap.

Tool Window Lifecycle

Scene change detected
  ↓
ToolRegistry.activate(new_entity_tools)
  ↓
Active count > cap? → Deactivate LRU tools
  ↓
Prompt includes only active tool schemas
  ↓
Executor gate rejects calls to inactive tools
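The lifecycle can be approximated with an ordered mapping that tracks recency, evicts the least-recently-used tool when the cap is exceeded, and rejects calls to anything no longer active. The `ToolWindow` class and its method names are hypothetical; the source names only ToolRegistry.activate and does not show its implementation.

```python
from collections import OrderedDict


class ToolWindow:
    """Hypothetical LRU-capped set of active entity tools."""

    def __init__(self, cap):
        self.cap = cap
        self._active = OrderedDict()  # name -> schema, least recent first

    def activate(self, tools):
        """Activate tools for the new scene; return names evicted by the cap."""
        for name, schema in tools.items():
            self._active[name] = schema
            self._active.move_to_end(name)
        evicted = []
        while len(self._active) > self.cap:
            name, _ = self._active.popitem(last=False)  # drop the LRU tool
            evicted.append(name)
        return evicted

    def call(self, name):
        """Executor gate: only active tools may run."""
        if name not in self._active:
            return {"error": f"Tool '{name}' is not active in the current scene"}
        self._active.move_to_end(name)  # using a tool refreshes its recency
        return {"ok": name}

    def active_schemas(self):
        """Schemas that would be included in the prompt."""
        return list(self._active.values())
```

For example, with a cap of 2, activating a third tool evicts whichever of the first two was used least recently.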

Learned Tool Index

With 20+ tools, dumping every schema into the LLM prompt wastes hundreds of tokens. The LearnedToolIndex is a keyword-weighted hashtable that learns which tools are relevant to which goals, saving ~74% of tool-context tokens per prompt.

How it works

  1. Auto-extraction: Keywords extracted from each tool's name, description, and parameters at startup.
  2. Scoring: Goal text is tokenized and matched against the index. Matched tools get full schemas (CRITICAL priority), unmatched get name-only (NICE_TO_HAVE, dropped under token pressure).
  3. Learning: On tool success, matched keyword weights strengthen and new keywords are discovered from the goal text. When a tool is surfaced but not used, its keyword weights decay. Failure does NOT weaken keywords, because a tool failing does not mean it was the wrong tool.
  4. Persistence: Learned weights saved across sessions to ~/.maxim/memory/tool_index.json.
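The first three steps above can be sketched as a small keyword-to-tool weight table. This is an illustration under stated assumptions, not the LearnedToolIndex itself: the class name, the stopword list, the boost and decay constants, and the scoring rule (sum of matched keyword weights) are all invented here, and persistence is omitted.

```python
import re
from collections import defaultdict

STOPWORDS = {"a", "an", "the", "to", "of", "from"}  # assumed minimal list


def tokenize(text):
    """Lowercase alphanumeric tokens; underscores split tool names."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


class KeywordToolIndex:
    def __init__(self, tools):
        """tools: dict of tool name -> description (auto-extraction step)."""
        self.tools = list(tools)
        self.weights = defaultdict(dict)  # keyword -> {tool: weight}
        for name, description in tools.items():
            for kw in tokenize(name + " " + description):
                self.weights[kw][name] = 1.0

    def score(self, goal):
        scores = {}
        for kw in tokenize(goal):
            for tool, weight in self.weights.get(kw, {}).items():
                scores[tool] = scores.get(tool, 0.0) + weight
        return scores

    def select(self, goal):
        """Return (matched, unmatched): full schemas vs. name-only."""
        scores = self.score(goal)
        matched = sorted(scores, key=scores.get, reverse=True)
        unmatched = [t for t in self.tools if t not in scores]
        return matched, unmatched

    def record_success(self, goal, tool, boost=0.1):
        # Strengthen matched keywords and learn new ones from the goal.
        for kw in tokenize(goal):
            self.weights[kw][tool] = self.weights[kw].get(tool, 0.0) + boost

    def record_unused(self, goal, tool, decay=0.9):
        # Surfaced but not used: weaken the keywords that surfaced it.
        for kw in tokenize(goal):
            if tool in self.weights.get(kw, {}):
                self.weights[kw][tool] *= decay

    # Deliberately no record_failure: failure does not weaken keywords.
```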

Tool Safety

Every tool call that can cause a side effect passes through the FearAgent before execution. The FearAgent uses three checks:

  1. Deterministic pattern matching — regex for known dangerous patterns (shell injection, path traversal, etc.)
  2. LLM review (if available) — nuanced safety assessment for ambiguous cases
  3. NAc prediction — the AdaptivePolicy blocks actions with very high-confidence negative predictions (confidence > 0.85, value < 0.1)

Introspection tools bypass FearAgent since they're read-only — they can't cause harm. The agent can freely query its own memories, predictions, and pain history without safety gating.
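Putting the pieces together, a gate of this kind might look roughly like the following. The function name, the two example regexes, the read-only tool set, and the ordering of checks are assumptions for illustration; only the thresholds (confidence > 0.85, value < 0.1) and the bypass for introspection tools come from the text. The LLM review step is left out of the sketch.

```python
import re

# Illustrative patterns only; the real pattern list is not given in the source.
DANGEROUS_PATTERNS = [
    (re.compile(r"\.\./"), "path traversal"),
    (re.compile(r";\s*rm\s"), "shell injection"),
]

# A subset of the introspection tools named in this document.
READ_ONLY_TOOLS = {"memory_recall", "predict_outcome", "pain_history"}


def review_tool_call(tool, args_text, prediction=None):
    """Return (allowed, reason).

    prediction: optional (confidence, value) pair from a learned model,
    standing in for the NAc prediction described above.
    """
    if tool in READ_ONLY_TOOLS:
        return True, "read-only introspection tool, no review needed"

    # 1. Deterministic pattern matching.
    for pattern, label in DANGEROUS_PATTERNS:
        if pattern.search(args_text):
            return False, f"blocked: {label}"

    # 2. LLM review for ambiguous cases would go here (omitted).

    # 3. Learned prediction: block only very confident, very negative cases.
    if prediction is not None:
        confidence, value = prediction
        if confidence > 0.85 and value < 0.1:
            return False, "blocked: high-confidence negative prediction"

    return True, "allowed"
```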