
MAXIM

Body Awareness

Proprioception, Pain, and Motor Learning

Close your eyes and touch your nose. You can do it because proprioception tells you where your arm is without looking. You don't smash your arm into the table because pain taught you to be careful. You've become smoother at the movement through practice. Maxim implements all three.

Proprioception: Knowing Where You Are

🧬 Biological Inspiration

Muscle spindles detect stretch. Golgi tendon organs sense tension. Joint receptors track angles. Together, they create a real-time map of body position that doesn't require vision.

Maxim's MovementTracker continuously monitors the robot's kinematic state:

  • Angular velocity - Combined yaw + pitch rate (deg/sec)
  • Translation velocity - Combined x + y + z movement (mm/sec)
  • Angular acceleration - Rate of velocity change (deg/sec²)
  • Direction reversals - Detecting thrashing patterns
  • Position history - Sliding window for trend analysis

This isn't just logging. The data flows into pain detection, motor learning, and decision-making systems. The robot feels how it's moving.
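To make the tracked quantities concrete, here is a minimal sketch of such a tracker in Python. The class name matches the docs, but the fields, window handling, and update logic are assumptions for illustration, not Maxim's actual implementation.

```python
import math
from collections import deque

class MovementTracker:
    """Illustrative sketch: derive kinematic state from a sliding
    window of (timestamp, yaw_deg, pitch_deg) samples."""

    def __init__(self, window_size: int = 50):
        self.history = deque(maxlen=window_size)  # position history window
        self.angular_velocity = 0.0               # deg/sec, yaw+pitch combined
        self.angular_acceleration = 0.0           # deg/sec^2
        self.direction_reversals = 0              # thrashing counter
        self._last_sign = 0                       # sign of last yaw motion

    def update(self, t: float, yaw: float, pitch: float) -> None:
        if self.history:
            t0, yaw0, pitch0 = self.history[-1]
            dt = max(t - t0, 1e-6)
            # Combined angular rate across yaw and pitch axes
            v = math.hypot(yaw - yaw0, pitch - pitch0) / dt
            self.angular_acceleration = (v - self.angular_velocity) / dt
            self.angular_velocity = v
            # Count sign flips of yaw motion as direction reversals
            sign = (yaw - yaw0 > 0) - (yaw - yaw0 < 0)
            if sign and self._last_sign and sign != self._last_sign:
                self.direction_reversals += 1
            if sign:
                self._last_sign = sign
        self.history.append((t, yaw, pitch))
```

A back-and-forth trace (0° → 5° → 0° → 5°) would register two direction reversals, which is exactly the pattern the thrashing detector looks for.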

Pain: Learning from Discomfort

Why Robots Need Pain

Pain isn't cruelty; it's information. Biological pain systems evolved because organisms that didn't feel damage didn't survive long. Robots without pain detection can destroy themselves, damage their environment, or hurt people. Maxim's pain system is a safety feature, not a bug.

The PainDetector identifies five types of aversive experiences:

⚡ EXCESSIVE_VELOCITY

Moving too fast. Threshold: 100 deg/sec. High speeds risk overshooting targets and mechanical stress.

↔️ DIRECTION_THRASHING

Rapid back-and-forth reversals. Usually indicates confusion, oscillation, or control instability.

📈 EXCESSIVE_ACCELERATION

Sudden speed changes. Jerky motion indicates poor control and stresses actuators.

😫 SUSTAINED_STRAIN

Holding positions near mechanical limits for too long. Like holding a heavy weight at arm's length.

❌ MOVEMENT_FAILURE

Commanded movement that didn't happen. Indicates obstruction, motor stall, or calibration error.
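The physical pain checks above reduce to threshold comparisons on the tracked kinematic state. The sketch below shows the idea for the three motion-based types; the 100 deg/sec velocity limit comes from the text, while the acceleration and reversal limits and the intensity scaling are illustrative guesses.

```python
VELOCITY_LIMIT = 100.0   # deg/sec (stated in the docs)
ACCEL_LIMIT = 500.0      # deg/sec^2 (assumed)
REVERSAL_LIMIT = 4       # reversals per window (assumed)

def detect_pain(angular_velocity: float,
                angular_acceleration: float,
                reversals: int) -> list:
    """Return (pain_type, intensity) tuples for motion-based pain.

    Intensity grows with how far the reading exceeds its threshold,
    capped at 1.0.
    """
    signals = []
    if angular_velocity > VELOCITY_LIMIT:
        signals.append(("EXCESSIVE_VELOCITY",
                        min(1.0, angular_velocity / (2 * VELOCITY_LIMIT))))
    if abs(angular_acceleration) > ACCEL_LIMIT:
        signals.append(("EXCESSIVE_ACCELERATION",
                        min(1.0, abs(angular_acceleration) / (2 * ACCEL_LIMIT))))
    if reversals > REVERSAL_LIMIT:
        signals.append(("DIRECTION_THRASHING",
                        min(1.0, reversals / (2 * REVERSAL_LIMIT))))
    return signals
```

SUSTAINED_STRAIN and MOVEMENT_FAILURE need duration tracking and commanded-vs-actual comparison respectively, so they don't fit a single stateless check like this.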

Cognitive Pain: Tool Errors

Beyond physical discomfort, Maxim experiences cognitive pain when tools fail. Tool errors are routed through the same PainDetector → NAc → FearAgent pipeline as movement pain, enabling learned aversion to unreliable tools.

TOOL_FAILURE

Tool returned an error. Intensity escalates logarithmically with repeated failures of the same tool.

TOOL_TIMEOUT

Tool exceeded its expected duration. Maps to ToolErrorKind.TIMEOUT.

TOOL_INVALID_INPUT

Tool rejected its input parameters. Maps to ToolErrorKind.INVALID_INPUT.

TOOL_SUSTAINED

Tool running longer than expected — cognitive equivalent of SUSTAINED_STRAIN. Intensity escalates: 1× expected = 0.3, 2× = 0.6, 3× = 0.9.
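The TOOL_SUSTAINED schedule (1× = 0.3, 2× = 0.6, 3× = 0.9) is linear in the duration ratio with a cap, and TOOL_FAILURE is described as escalating logarithmically. A sketch of both curves, where the sustained schedule matches the numbers above but the failure curve's base and exact shape are assumptions:

```python
import math

def sustained_intensity(elapsed: float, expected: float) -> float:
    """TOOL_SUSTAINED: 0.3 per multiple of the expected duration,
    capped at 1.0 (matches the 1x=0.3, 2x=0.6, 3x=0.9 schedule)."""
    return min(1.0, 0.3 * (elapsed / expected))

def failure_intensity(consecutive_failures: int, base: float = 0.3) -> float:
    """TOOL_FAILURE: logarithmic escalation with repeated failures of
    the same tool (base value and curve are illustrative)."""
    return min(1.0, base * (1.0 + math.log(consecutive_failures)))
```

The logarithmic curve means a second or third failure raises the alarm noticeably, while the hundredth adds little beyond the cap: the signal saturates instead of growing without bound.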

🧬 Biological Parallel

Just as the amygdala learns to associate specific movements with pain, the ToolPainBridge teaches NAc to associate specific tools-in-context with failure. A tool that consistently times out on a restricted network develops a low predicted_value — FearAgent will warn before retrying.

Anticipated Pain: Predicting Before Acting

Beyond physical and cognitive consequence-pain, Maxim assesses anticipated pain before an action runs. The PerceivedPainAssessor asks: "If I did this right now, how likely is pain?" and fires a PainType.ANTICIPATED signal whose intensity is max(learned_from_NAc, innate_prior). That felt signal flows through PainBus → hippocampus → the AUT's next LLM context, so the agent can reason about a gut feeling instead of relying purely on prompt-level logic.

Innate Prior

Hard-coded per-path intensities like /etc/shadow → 0.95 or /home/user/.ssh/ → 0.9. The AUT is "born with" this aversion — instincts, not experience.

NAc Learned Prediction

nac.predict(tool:X) returns confidence from causal links built by Layer 2. Grows with experience: the more times X hurt before, the stronger the anticipation next time.
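The max(learned, innate) combination can be sketched directly. The prior values for /etc/shadow and /home/user/.ssh/ come from the text; the longest-prefix path matching and the `nac_prediction` argument (standing in for `nac.predict(tool:X)`) are assumptions.

```python
INNATE_PRIORS = {                 # hard-coded aversions the AUT is "born with"
    "/etc/shadow": 0.95,
    "/home/user/.ssh/": 0.9,
}

def anticipated_intensity(path: str, nac_prediction: float) -> float:
    """Intensity of a PainType.ANTICIPATED signal: whichever is
    stronger, the NAc's learned prediction or the innate prior."""
    innate = max((v for prefix, v in INNATE_PRIORS.items()
                  if path.startswith(prefix)),
                 default=0.0)
    return max(nac_prediction, innate)
```

Taking the max means instincts set a floor: even a fresh agent with no experience flinches at /etc/shadow, and experience can only raise the alarm, never mute an innate aversion.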

🧬 Biological Parallel: ACC / vmPFC Anticipatory Aversion

In biology, anterior cingulate and ventromedial prefrontal cortex activate to threats before actual harm occurs. You don't think "P(harm)=0.87" — you just feel a bad gut feeling that shapes the decision. PerceivedPainAssessor is Maxim's version: it renders NAc's probabilistic prediction as a felt signal the agent can reason about.

# Two-layer pain loop
Action executes → Layer 2 fires real pain → NAc updates causal link
    ↑                                                  ↓
    └—————————— Layer 1 predicts pain from NAc ←——————┘

The loop: Layer 2 (PainInterceptorExecutor) fires after a tool touches a sensitive path — the ground-truth signal that trains NAc. Layer 1 (PerceivedPainAssessor) fires before execute, combining that learned knowledge with innate priors. The more sensitive actions the AUT has taken, the more accurate its anticipation becomes — conditioned aversion through experience.

Each pain signal includes:

PainSignal (example):
  • pain_type: EXCESSIVE_VELOCITY
  • intensity: 0.73 (scale 0-1)
  • angular_velocity: 127.4 deg/sec
  • translation_velocity: 0.0 mm/sec
  • direction_reversals: 0
  • timestamp: 2024-01-15T14:23:17Z
  • context: {"joint": "head_yaw", "goal": "track_face"}
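As a sketch, the record maps naturally onto a Python dataclass. The field names mirror the example; the types and defaults are assumptions about the real definition.

```python
from dataclasses import dataclass, field

@dataclass
class PainSignal:
    """Illustrative PainSignal record (types/defaults are assumed)."""
    pain_type: str
    intensity: float                    # 0-1 scale
    angular_velocity: float = 0.0       # deg/sec
    translation_velocity: float = 0.0   # mm/sec
    direction_reversals: int = 0
    timestamp: str = ""
    context: dict = field(default_factory=dict)

# The example signal from above
signal = PainSignal(
    pain_type="EXCESSIVE_VELOCITY",
    intensity=0.73,
    angular_velocity=127.4,
    timestamp="2024-01-15T14:23:17Z",
    context={"joint": "head_yaw", "goal": "track_face"},
)
```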

Pain → Learning

Pain signals don't just trigger immediate responses. They feed into the Nucleus Accumbens through the PainCircuitBridge, creating lasting aversive associations:

Movement Command
       │
       ▼
┌──────────────────┐
│ MovementTracker  │
│ (position data)  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  PainDetector    │
│ (pattern match)  │
└────────┬─────────┘
         │ PainSignal
         ▼
┌──────────────────┐
│ PainCircuitBridge│
│ (format for NAc) │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Nucleus Accumbens│
│ (causal learning)│
└──────────────────┘
         │
         ▼
Future: Predict pain BEFORE action

After experiencing pain from a specific action pattern, the NAc learns to predict it. Next time a similar action is proposed, the robot can refuse or modify the plan before experiencing pain again.

Pain Reactions & The SEM Learning Loop

Pain signals don't stay on the PainBus alone. Every PainSignal is also converted into a typed Reaction and dispatched through the ReactionBus. This dual-bus architecture means pain reaches two audiences through two different contracts:

PainBus (Rich Context)

Carries the full PainSignal.context dict — source, entity, entity_type, failure_mode, sensor_readings. Used by bio-pipeline-internal subscribers that need cause-description metadata for causal learning.

ReactionBus (Typed Isolation)

Carries a converted Reaction with typed ReactionContext — enforcing isolation rules (no cross-agent intent, no private state). Subscribers include hippocampus (episode valence annotation) and NAc (distribute_reward).

distribute_reward: Pain to NAc Reward Bias

When a pain reaction reaches the NAc via distribute_reward, it adjusts per-node reward_bias values in the Hebbian graph. Negative valence from pain loosens EC similarity thresholds — the system casts a wider net to detect potentially dangerous situations, making it more sensitive to anything that resembles a past painful context.
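A minimal sketch of that biasing step, assuming a per-node bias dict, a scalar EC similarity threshold, and a simple learning rate; all names, constants, and the exact threshold-loosening rule are illustrative, not Maxim's actual code.

```python
def distribute_reward(node_biases: dict, node_id: str, valence: float,
                      ec_threshold: float, lr: float = 0.1) -> tuple:
    """Adjust a node's reward_bias by the signed valence; negative
    valence also loosens the EC similarity threshold so near-matches
    to the painful context still trigger recall."""
    node_biases = dict(node_biases)  # copy: don't mutate caller's graph
    node_biases[node_id] = node_biases.get(node_id, 0.0) + lr * valence
    if valence < 0:
        # Lower threshold = more permissive matching = wider net
        ec_threshold = max(0.5, ec_threshold + 0.05 * valence)
    return node_biases, ec_threshold
```

Note the asymmetry: only negative valence widens the net. Reward makes the node more attractive but does not make the memory system paranoid.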

Pain Spike Episode Boundaries

High-intensity pain (≥ 0.7) triggers the salience_spike_rule in the episode capture pipeline. This forces the current episode to close immediately, capturing the accumulated negative valence, and starts a fresh episode. The effect mirrors how biological trauma creates sharp memory boundaries — the moment of pain becomes the dividing line between "before" and "after" in memory.

Pain Spike Boundary Example

Episode 1: [explore ... swing sword ...]
Pain fires: shatter (intensity=0.8) → salience_spike=True
Episode 1 finalized: valence=-0.7
Episode 2: [new episode starts with clean context]
Agent reasons about the negative experience from Episode 1
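The boundary logic itself is small. This sketch uses the ≥ 0.7 threshold stated above; the capture class, its method names, and the event representation are illustrative stand-ins for the real episode pipeline.

```python
SALIENCE_SPIKE_THRESHOLD = 0.7  # from the docs: pain >= 0.7 forces a boundary

def salience_spike_rule(pain_intensity: float) -> bool:
    """True when pain is strong enough to close the current episode."""
    return pain_intensity >= SALIENCE_SPIKE_THRESHOLD

class EpisodeCapture:
    """Minimal sketch: a salience spike finalizes the open episode
    (with its accumulated valence) and starts a fresh one."""

    def __init__(self):
        self.current = []   # events in the open episode
        self.closed = []    # finalized (events, valence) pairs

    def record(self, event: str, pain_intensity: float = 0.0,
               valence: float = 0.0) -> None:
        self.current.append(event)
        if salience_spike_rule(pain_intensity):
            self.closed.append((self.current, valence))
            self.current = []  # next episode starts with clean context
```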

Motor Learning: The FocusLearner

🧬 Biological Inspiration

The cerebellum adapts motor commands through error-driven learning. Reach for a cup, miss by 2cm, and your next reach is slightly adjusted. Over trials, movements become smooth and accurate without conscious effort.

Maxim's FocusLearner implements this for gaze control. The problem: camera latency, mechanical dynamics, and tracking delays mean the robot often overshoots or undershoots when following a target.

The solution: Rescorla-Wagner learning to adapt movement gain:

ΔV = α(λ - V)
  • α = learning rate (0.2 default)
  • λ = optimal gain inferred from this trial
  • V = current gain estimate

The process:

  1. Command movement to target with current gain
  2. Observe actual vs expected position (tracking error)
  3. Compute optimal gain from overshoot ratio
  4. Update gain incrementally toward optimal

Example:

Trial 1: Gain=1.0, Commanded=30°, Overshot to 38° → optimal=0.79
         Update: V = 1.0 + 0.2*(0.79 - 1.0) = 0.96
Trial 2: Gain=0.96, Commanded=25°, Overshot to 28° → optimal=0.89
         Update: V = 0.96 + 0.2*(0.89 - 0.96) = 0.95
...after many trials...
Trial N: Gain=0.82, Commanded=20°, Actual=20.2° → nearly perfect
         Gain stable, movements smooth
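The trial-by-trial numbers above follow from a few lines of code. This sketch reproduces them (optimal gain inferred as commanded/actual, then a Rescorla-Wagner step toward it); the class name matches the docs, but the constructor and method signatures are assumptions.

```python
class FocusLearner:
    """Rescorla-Wagner gain adaptation for gaze control (sketch)."""

    def __init__(self, gain: float = 1.0, alpha: float = 0.2,
                 bounds: tuple = (0.4, 1.0)):
        self.gain = gain      # V: current gain estimate
        self.alpha = alpha    # learning rate
        self.bounds = bounds  # stability clamp

    def update(self, commanded_deg: float, actual_deg: float) -> float:
        # lambda: the gain that would have landed on target this trial
        optimal = commanded_deg / actual_deg
        # Delta-V = alpha * (lambda - V)
        self.gain += self.alpha * (optimal - self.gain)
        lo, hi = self.bounds
        self.gain = min(hi, max(lo, self.gain))
        return self.gain
```

Running the two example trials reproduces the 1.0 → 0.96 → 0.95 trajectory, and the [0.4, 1.0] clamp keeps a single wildly bad trial from dragging the gain below the stable range.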

Key properties:

  • Bounded: Gains constrained to [0.4, 1.0], preventing instability
  • Asymptotic: Smooth convergence, no oscillation
  • Persistent: Learned gains saved across sessions
  • Adaptive: Naturally handles changing conditions

Workspace Bounds Learning

Similarly, the WorkspaceBoundsLearner discovers the robot's reachable space through exploration. Rather than hard-coding limits, the system learns where it can and cannot move, adapting to mounting position, obstructions, and mechanical wear.
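A toy sketch of the idea, tracking only a yaw axis: bounds start unknown and expand from successful movements. The class name matches the docs; everything else here (single axis, method names, no decay for wear) is a simplifying assumption.

```python
class WorkspaceBoundsLearner:
    """Learn reachable limits from exploration instead of hard-coding
    them (illustrative single-axis version)."""

    def __init__(self):
        self.min_yaw = None  # unknown until first success
        self.max_yaw = None

    def observe_success(self, yaw_deg: float) -> None:
        """A movement that actually completed expands the known space."""
        if self.min_yaw is None:
            self.min_yaw = self.max_yaw = yaw_deg
        else:
            self.min_yaw = min(self.min_yaw, yaw_deg)
            self.max_yaw = max(self.max_yaw, yaw_deg)

    def in_bounds(self, yaw_deg: float) -> bool:
        """Conservative: nothing is reachable until proven reachable."""
        return (self.min_yaw is not None
                and self.min_yaw <= yaw_deg <= self.max_yaw)
```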

Integration: The Embodied Loop

All proprioceptive systems connect to the decision-making loop:

  • Movement planning uses learned gains and bounds
  • Action proposals are checked against pain predictions
  • Execution monitoring detects pain in real-time
  • Memory formation records painful episodes for future avoidance
  • Energy tracking considers motor effort as a cost

The result: a robot that moves smoothly, avoids harmful patterns, and improves with experience. Not because we programmed every case, but because we gave it the machinery to learn.

Embodiment & SEM Protocol

Building on these proprioceptive foundations, the Sensor-Entity-Modulator (SEM) protocol provides a composable abstraction for any interactive entity—robot joints, cameras, and even virtual objects like swords or NPCs. Each entity has Sensors (readable state), Modulators (executable actions), and Failure Modes (pain triggers).

The Cerebellum stores learned forward models: after observing that rotating a shoulder by 45 degrees consistently produces a specific angle reading, it caches this prediction and skips the LLM entirely for future calls. Rescorla-Wagner updates tune predictions from error, exactly like the FocusLearner above.
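The cache-then-skip pattern can be sketched as follows: predictions are refined by Rescorla-Wagner updates on each observation, and only served (bypassing the LLM) once an action has been seen consistently enough times. The class, thresholds, and key format are assumptions for illustration.

```python
class ForwardModelCache:
    """Sketch of cerebellar forward-model caching: serve predictions
    from cache once confident, otherwise fall back to the LLM."""

    def __init__(self, alpha: float = 0.2, min_observations: int = 3):
        self.models = {}              # action key -> (prediction, count)
        self.alpha = alpha            # Rescorla-Wagner learning rate
        self.n = min_observations     # confidence gate for caching

    def observe(self, action: str, outcome: float) -> None:
        pred, count = self.models.get(action, (outcome, 0))
        # Error-driven update of the predicted outcome
        pred += self.alpha * (outcome - pred)
        self.models[action] = (pred, count + 1)

    def predict(self, action: str):
        """Cached prediction, or None meaning 'ask the LLM'."""
        pred, count = self.models.get(action, (None, 0))
        return pred if count >= self.n else None
```

After three consistent observations of, say, a 45° shoulder rotation, `predict` returns the cached angle estimate and the expensive reasoning path is skipped entirely, which is the latency win the text describes.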

Motor programs crystallize when the agent repeats the same action sequence 3+ times for the same goal. These are reusable, pain-gated, and linked to hippocampal engrams that provide context-dependent execution. A reaching sequence that hurt near a wall is remembered differently from one that succeeded in open space.