MAXIM
Body Awareness
Proprioception, Pain, and Motor Learning
Close your eyes and touch your nose. You can do it because proprioception tells you where your arm is without looking. You don't smash your arm into the table because pain taught you to be careful. You've become smoother at the movement through practice. Maxim implements all three.
Proprioception: Knowing Where You Are
🧬 Biological Inspiration
Muscle spindles detect stretch. Golgi tendon organs sense tension. Joint receptors track angles. Together, they create a real-time map of body position that doesn't require vision.
Maxim's MovementTracker continuously monitors the robot's kinematic state:
- Angular velocity - Combined yaw + pitch rate (deg/sec)
- Translation velocity - Combined x + y + z movement (mm/sec)
- Angular acceleration - Rate of velocity change (deg/sec²)
- Direction reversals - Detecting thrashing patterns
- Position history - Sliding window for trend analysis
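The monitored quantities above can be sketched as a small sliding-window tracker. This is a minimal illustration, not Maxim's actual `MovementTracker` API — the class shape, window size, and sample format are assumptions:

```python
from collections import deque
import math

class MovementTracker:
    """Minimal sketch of sliding-window kinematic tracking (illustrative API)."""

    def __init__(self, window: int = 10):
        self.history = deque(maxlen=window)  # (t, yaw_deg, pitch_deg) samples

    def update(self, t: float, yaw: float, pitch: float) -> None:
        self.history.append((t, yaw, pitch))

    def angular_velocity(self) -> float:
        """Combined yaw + pitch rate in deg/sec from the last two samples."""
        if len(self.history) < 2:
            return 0.0
        (t0, y0, p0), (t1, y1, p1) = self.history[-2], self.history[-1]
        dt = t1 - t0
        return math.hypot(y1 - y0, p1 - p0) / dt if dt > 0 else 0.0

    def direction_reversals(self) -> int:
        """Count sign flips in yaw motion -- a crude thrashing indicator."""
        samples = list(self.history)
        deltas = [b[1] - a[1] for a, b in zip(samples, samples[1:])]
        return sum(1 for d0, d1 in zip(deltas, deltas[1:]) if d0 * d1 < 0)
```

The same window feeds all downstream consumers: pain detection reads velocity and reversals, motor learning reads position error.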
This isn't just logging. The data flows into pain detection, motor learning, and decision-making systems. The robot feels how it's moving.
Pain: Learning from Discomfort
Why Robots Need Pain
Pain isn't cruelty; it's information. Biological pain systems evolved because organisms that didn't feel damage didn't survive long. Robots without pain detection can destroy themselves, their environment, or hurt people. Maxim's pain system is a safety feature, not a bug.
The PainDetector identifies five types of aversive experiences:
⚡ EXCESSIVE_VELOCITY
Moving too fast. Threshold: 100 deg/sec. High speeds risk overshooting targets and mechanical stress.
↔️ DIRECTION_THRASHING
Rapid back-and-forth reversals. Usually indicates confusion, oscillation, or control instability.
📈 EXCESSIVE_ACCELERATION
Sudden speed changes. Jerky motion indicates poor control and stresses actuators.
😫 SUSTAINED_STRAIN
Holding positions near mechanical limits for too long. Like holding a heavy weight at arm's length.
❌ MOVEMENT_FAILURE
Commanded movement that didn't happen. Indicates obstruction, motor stall, or calibration error.
Cognitive Pain: Tool Errors
Beyond physical discomfort, Maxim experiences cognitive pain when tools fail. Tool errors are routed through the same PainDetector → NAc → FearAgent pipeline as movement pain, enabling learned aversion to unreliable tools.
TOOL_FAILURE
Tool returned an error. Intensity escalates logarithmically with repeated failures of the same tool.
TOOL_TIMEOUT
Tool exceeded its expected duration. Maps to ToolErrorKind.TIMEOUT.
TOOL_INVALID_INPUT
Tool rejected its input parameters. Maps to ToolErrorKind.INVALID_INPUT.
TOOL_SUSTAINED
Tool running longer than expected — cognitive equivalent of SUSTAINED_STRAIN. Intensity escalates: 1× expected = 0.3, 2× = 0.6, 3× = 0.9.
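The escalation rules above can be sketched as two small functions. The TOOL_SUSTAINED curve follows the stated values (1× expected = 0.3, 2× = 0.6, 3× = 0.9); the logarithmic base constant for TOOL_FAILURE is an assumption:

```python
import math

def sustained_tool_intensity(elapsed: float, expected: float) -> float:
    """TOOL_SUSTAINED escalation from the table: 0.3 per multiple of the
    expected duration, capped at 1.0."""
    if expected <= 0 or elapsed < expected:
        return 0.0
    return min(1.0, 0.3 * (elapsed / expected))

def failure_intensity(consecutive_failures: int, base: float = 0.3) -> float:
    """TOOL_FAILURE: logarithmic escalation with repeated failures of the
    same tool (base value is an assumed constant)."""
    return min(1.0, base * (1.0 + math.log(consecutive_failures)))
```

Logarithmic escalation means the second failure hurts noticeably more than the first, but the tenth barely more than the ninth — the signal saturates rather than dominating everything else.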
🧬 Biological Parallel
Just as the amygdala learns to associate specific movements with pain, the ToolPainBridge teaches NAc to associate specific tools-in-context with failure. A tool that consistently times out on a restricted network develops a low predicted_value — FearAgent will warn before retrying.
Anticipated Pain: Predicting Before Acting
Beyond physical and cognitive consequence-pain, Maxim assesses anticipated pain before an action runs.
The PerceivedPainAssessor asks: "if I did this right now, how likely is pain?" and fires a PainType.ANTICIPATED signal whose intensity is max(learned_from_NAc, innate_prior). That felt signal flows through PainBus → hippocampus → the AUT's next LLM context, so the agent can reason about a gut feeling instead of relying purely on prompt-level logic.
Innate Prior
Hard-coded per-path intensities like /etc/shadow → 0.95 or /home/user/.ssh/ → 0.9. The AUT is "born with" this aversion — instincts, not experience.
NAc Learned Prediction
nac.predict(tool:X) returns confidence from causal links built by Layer 2. Grows with experience: the more times X hurt before, the stronger the anticipation next time.
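Combining the two sources is a single max(), as stated above. A minimal sketch, using the example paths from the text — the prefix-matching logic and dictionary shape are assumptions:

```python
INNATE_PRIORS = {
    # hard-coded aversions the AUT is "born with" (values from the text)
    "/etc/shadow": 0.95,
    "/home/user/.ssh/": 0.90,
}

def anticipated_pain(path: str, nac_prediction: float) -> float:
    """ANTICIPATED intensity = max(learned_from_NAc, innate_prior)."""
    innate = max(
        (v for prefix, v in INNATE_PRIORS.items() if path.startswith(prefix)),
        default=0.0,
    )
    return max(nac_prediction, innate)
```

The max() means experience can only raise anticipation above instinct, never lower it — a learned "this is fine" cannot silence a hard-coded aversion.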
🧬 Biological Parallel: ACC / vmPFC Anticipatory Aversion
In biology, anterior cingulate and ventromedial prefrontal cortex activate to threats before actual harm occurs. You don't think "P(harm)=0.87" — you just feel a bad gut feeling that shapes the decision. PerceivedPainAssessor is Maxim's version: it renders NAc's probabilistic prediction as a felt signal the agent can reason about.
┌──▶ Layer 2 trains NAc with ground-truth pain ──┐
└──── Layer 1 predicts pain from NAc ◀───────────┘
The loop: Layer 2 (PainInterceptorExecutor) fires after a tool touches a sensitive path — the ground-truth signal that trains NAc. Layer 1 (PerceivedPainAssessor) fires before execute, combining that learned knowledge with innate priors. The more sensitive actions the AUT has taken, the more accurate its anticipation becomes — conditioned aversion through experience.
Pain → Learning
Pain signals don't just trigger immediate responses. They feed into the Nucleus Accumbens through the PainCircuitBridge, creating lasting aversive associations:
Movement Command
│
▼
┌──────────────────┐
│ MovementTracker │
│ (position data) │
└────────┬─────────┘
│
▼
┌──────────────────┐
│ PainDetector │
│ (pattern match) │
└────────┬─────────┘
│ PainSignal
▼
┌──────────────────┐
│ PainCircuitBridge│
│ (format for NAc) │
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Nucleus Accumbens│
│ (causal learning)│
└──────────────────┘
│
▼
Future: Predict pain BEFORE action
After experiencing pain from a specific action pattern, the NAc learns to predict it. Next time a similar action is proposed, the robot can refuse or modify the plan before experiencing pain again.
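That refuse-before-pain behavior can be sketched as a tiny aversion store with a gating check. The class name matches the diagram, but the running-average learning rule, the 0.3 rate, and the 0.5 gate threshold are all illustrative assumptions:

```python
class NucleusAccumbens:
    """Sketch: accumulate pain per action pattern, predict it before acting."""

    def __init__(self):
        self.aversion = {}  # action pattern -> running pain estimate

    def learn(self, pattern: str, pain: float, rate: float = 0.3) -> None:
        old = self.aversion.get(pattern, 0.0)
        self.aversion[pattern] = old + rate * (pain - old)

    def predict(self, pattern: str) -> float:
        return self.aversion.get(pattern, 0.0)

def gate_action(nac: NucleusAccumbens, pattern: str,
                threshold: float = 0.5) -> bool:
    """Allow the action only while predicted pain stays below threshold."""
    return nac.predict(pattern) < threshold
```

One painful trial raises the prediction but does not yet block the action; a second confirmation crosses the gate — aversion strengthens with repetition, as the text describes.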
Pain Reactions & The SEM Learning Loop
Pain signals don't stay on the PainBus alone. Every PainSignal is also converted into a typed Reaction and dispatched through the ReactionBus. This dual-bus architecture means pain reaches two audiences through two different contracts:
PainBus (Rich Context)
Carries the full PainSignal.context dict — source, entity, entity_type, failure_mode, sensor_readings. Used by bio-pipeline-internal subscribers that need cause-description metadata for causal learning.
ReactionBus (Typed Isolation)
Carries a converted Reaction with typed ReactionContext — enforcing isolation rules (no cross-agent intent, no private state). Subscribers include hippocampus (episode valence annotation) and NAc (distribute_reward).
distribute_reward: Pain to NAc Reward Bias
When a pain reaction reaches the NAc via distribute_reward, it adjusts per-node reward_bias values in the Hebbian graph. Negative valence from pain loosens EC similarity thresholds — the system casts a wider net to detect potentially dangerous situations, making it more sensitive to anything that resembles a past painful context.
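The two effects described — shifting per-node reward_bias and loosening the EC similarity threshold under negative valence — might look like this. The node representation, the 0.3 loosening factor, and the 0.5 floor are assumptions for illustration:

```python
def distribute_reward(nodes: list, valence: float,
                      base_threshold: float = 0.8) -> float:
    """Sketch of the pain -> NAc path: valence shifts each node's
    reward_bias; negative valence loosens the EC similarity threshold
    so matching casts a wider net. Constants are illustrative."""
    for node in nodes:
        node["reward_bias"] = node.get("reward_bias", 0.0) + valence
    # only negative valence relaxes matching, down to a 0.5 floor
    loosened = base_threshold + 0.3 * min(0.0, valence)
    return max(0.5, loosened)
```

A lower similarity threshold means more contexts count as "resembling" a past painful one — the system trades precision for vigilance after pain.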
Pain Spike Episode Boundaries
High-intensity pain (≥ 0.7) triggers the salience_spike_rule in the episode capture pipeline. This forces the current episode to close immediately, capturing the accumulated negative valence, and starts a fresh episode. The effect mirrors how biological trauma creates sharp memory boundaries — the moment of pain becomes the dividing line between "before" and "after" in memory.
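The boundary rule itself is simple: a pain spike closes the running episode and starts an empty one. The 0.7 threshold comes from the text; the episode representation is an assumption:

```python
SPIKE_THRESHOLD = 0.7  # from the text: pain >= 0.7 forces an episode boundary

def salience_spike_rule(episode: list, pain_intensity: float):
    """Append the pain event; on a spike, return the closed episode and
    reset the running one in place (illustrative event format)."""
    episode.append(("pain", pain_intensity))
    if pain_intensity >= SPIKE_THRESHOLD:
        closed = list(episode)
        episode.clear()  # a fresh episode begins after the spike
        return closed
    return None
```

The spike event itself lands at the end of the closed episode, so the stored memory carries its full accumulated negative valence.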
Motor Learning: The FocusLearner
🧬 Biological Inspiration
The cerebellum adapts motor commands through error-driven learning. Reach for a cup, miss by 2cm, and your next reach is slightly adjusted. Over trials, movements become smooth and accurate without conscious effort.
Maxim's FocusLearner implements this for gaze control. The problem: camera latency, mechanical dynamics, and tracking delays mean the robot often overshoots or undershoots when following a target.
The solution: Rescorla-Wagner learning to adapt movement gain, updating V ← V + α(λ − V) each trial, where:
- α = learning rate (0.2 default)
- λ = optimal gain inferred from this trial
- V = current gain estimate
The process:
- Command movement to target with current gain
- Observe actual vs expected position (tracking error)
- Compute optimal gain from overshoot ratio
- Update gain incrementally toward optimal
Key properties:
- Bounded: Gains constrained to [0.4, 1.0] preventing instability
- Asymptotic: Smooth convergence, no oscillation
- Persistent: Learned gains saved across sessions
- Adaptive: Naturally handles changing conditions
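The trial loop and properties above can be condensed into one update function. The α = 0.2 rate and [0.4, 1.0] bounds come from the text; inferring λ from the commanded/actual ratio is an assumed formulation:

```python
ALPHA = 0.2                    # learning rate from the text
GAIN_MIN, GAIN_MAX = 0.4, 1.0  # bounded gains from the text

def update_gain(gain: float, commanded: float, actual: float) -> float:
    """One FocusLearner-style trial: infer the gain that would have landed
    on target (lambda), then move the estimate toward it by alpha."""
    if actual == 0:
        return gain
    optimal = gain * (commanded / actual)        # lambda from overshoot ratio
    new_gain = gain + ALPHA * (optimal - gain)   # V <- V + alpha * (lambda - V)
    return min(GAIN_MAX, max(GAIN_MIN, new_gain))
```

An overshoot (actual > commanded) pulls the gain down by a fraction of the error; the clamp guarantees a wildly wrong single observation can never push the gain into an unstable regime.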
Workspace Bounds Learning
Similarly, the WorkspaceBoundsLearner discovers the robot's reachable space through exploration. Rather than hard-coding limits, the system learns where it can and cannot move, adapting to mounting position, obstructions, and mechanical wear.
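One plausible shape for such a learner: expand per-axis [min, max] bounds only on moves that actually succeeded. The class name echoes the text, but this interface is a sketch, not the real API:

```python
class WorkspaceBoundsLearner:
    """Sketch: learn reachable [min, max] per axis from successful moves."""

    def __init__(self):
        self.bounds = {}  # axis -> (min_seen, max_seen)

    def observe(self, axis: str, position: float, reached: bool) -> None:
        if not reached:
            return  # failed moves never extend the known workspace
        lo, hi = self.bounds.get(axis, (position, position))
        self.bounds[axis] = (min(lo, position), max(hi, position))

    def in_bounds(self, axis: str, position: float) -> bool:
        if axis not in self.bounds:
            return False  # unexplored axes are conservatively off-limits
        lo, hi = self.bounds[axis]
        return lo <= position <= hi
```

Because bounds only grow from evidence, the learned workspace automatically reflects mounting position and obstructions without any hard-coded limits.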
Integration: The Embodied Loop
All proprioceptive systems connect to the decision-making loop:
- Movement planning uses learned gains and bounds
- Action proposals are checked against pain predictions
- Execution monitoring detects pain in real-time
- Memory formation records painful episodes for future avoidance
- Energy tracking considers motor effort as a cost
The result: a robot that moves smoothly, avoids harmful patterns, and improves with experience. Not because we programmed every case, but because we gave it the machinery to learn.
Embodiment & SEM Protocol
Building on these proprioceptive foundations, the Sensor-Entity-Modulator (SEM) protocol provides a composable abstraction for any interactive entity—robot joints, cameras, and even virtual objects like swords or NPCs. Each entity has Sensors (readable state), Modulators (executable actions), and Failure Modes (pain triggers).
The Cerebellum stores learned forward models: after observing that rotating a shoulder by 45 degrees consistently produces a specific angle reading, it caches this prediction and skips the LLM entirely for future calls. Rescorla-Wagner updates tune predictions from error, exactly like the FocusLearner above.
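The cache-plus-update pattern described here could be sketched as follows; the keying scheme and the reuse of the 0.2 learning rate are assumptions:

```python
class Cerebellum:
    """Sketch: cache forward-model predictions keyed by (actuator, command)
    and tune them with Rescorla-Wagner updates from observed error."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha
        self.models = {}  # (actuator, command) -> predicted sensor reading

    def predict(self, key):
        """Return the cached prediction, or None -> no model yet, ask the LLM."""
        return self.models.get(key)

    def observe(self, key, actual: float) -> None:
        """Move the prediction toward the observed reading by alpha."""
        pred = self.models.get(key, actual)  # first observation seeds the model
        self.models[key] = pred + self.alpha * (actual - pred)
```

Once `predict` returns a value, the expensive reasoning step can be skipped entirely — the cache is the "crystallized" motor knowledge, and `observe` keeps it honest against drift.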
Motor programs crystallize when the agent repeats the same action sequence 3+ times for the same goal. These are reusable, pain-gated, and linked to hippocampal engrams that provide context-dependent execution. A reaching sequence that hurt near a wall is remembered differently from one that succeeded in open space.