
MAXIM

Usage Guide

Installing, Configuring, and Running Maxim

Requirements

  • Python 3.12+
  • Hardware: Reachy Mini robot (or simulation mode for development)
  • RAM: 4GB minimum (8GB+ for larger LLMs)
  • GPU: Optional. Metal (macOS) or CUDA (Linux) supported
  • Network: Same LAN as Reachy Mini for Zenoh discovery

Blackwell GPU Note

RTX 50-series (Blackwell) GPUs have a known GStreamer/CUDA incompatibility. Maxim auto-detects this and falls back to CPU mode. You can force CPU mode manually with CUDA_VISIBLE_DEVICES="".

Installation

  1. Clone the repository
    git clone https://github.com/dennys246/Maxim.git
    cd Maxim
  2. Create a virtual environment
    python -m venv maxim-env
    source maxim-env/bin/activate
  3. Install core package
    pip install -e .
  4. Install LLM support (enables the agentic runtime)
    pip install -e '.[llm]'
  5. Download a language model
    ./scripts/download_models.sh --llm --enable

    This downloads SmolLM 1.7B (~1.1GB), the default CPU-friendly model.

Optional Extras

| Extra | Install Command | What It Enables |
| --- | --- | --- |
| TTS | pip install -e '.[tts]' | Text-to-speech (Piper TTS) |
| Semantic | pip install -e '.[semantic]' | Neural similarity (SentenceTransformer) |
| Torch LLM | pip install -e '.[llm-torch]' | PyTorch transformers backend |

Configuration

Configuration Files

All configuration lives in data/util/:

| File | Purpose |
| --- | --- |
| llm.json | LLM model selection, quantization, per-mode response sizing |
| robots.yaml | Robot connection settings (type, name, timeout) |
| phrase_responses.json | Voice command mappings ("Maxim sleep" → sleep mode) |
| key_responses.json | Keyboard shortcut bindings |
| whisper.json | Audio transcription settings (model size, compute type) |
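Configuration files are plain JSON, so they are easy to read programmatically. The sketch below shows one way to load llm.json; the key names queried here are illustrative, not a documented schema, so check your own file before relying on them.

```python
import json
from pathlib import Path

def load_llm_config(path="data/util/llm.json"):
    """Return the parsed LLM config, or {} if the file is missing."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}

config = load_llm_config()
# "profile" is an assumed key for illustration; inspect your llm.json for the real schema.
print("profile:", config.get("profile", "smollm-1.7b-instruct"))
```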

Environment Variables

    # Core settings
    MAXIM_LLM_ENABLED=1                      # Enable LLM inference
    MAXIM_LLM_PROFILE=smollm-1.7b-instruct   # Model profile
    MAXIM_LLM_QUANTIZATION=Q4_K_M            # Quantization level
    MAXIM_PROMPT_PROFILE=standard            # Prompt optimization tier
    MAXIM_ROBOT_NAME=reachy_mini             # Robot identifier

    # GPU control
    CUDA_VISIBLE_DEVICES=""                  # Force CPU-only
    MAXIM_LLM_N_GPU_LAYERS=0                 # CPU inference
    MAXIM_LLM_N_GPU_LAYERS=-1                # All layers on GPU

    # Display and audio
    MAXIM_DISABLE_IMSHOW=1                   # Disable OpenCV windows
    MAXIM_WHISPER_COMPUTE_TYPE=float32       # Whisper precision fallback
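A script that wraps Maxim can read these variables the same way the runtime presumably does. This sketch uses the documented variable names; the default values are illustrative, not necessarily Maxim's actual fallbacks.

```python
import os

def llm_settings():
    """Read Maxim's documented LLM environment variables.
    Defaults here are illustrative, not Maxim's actual fallbacks."""
    return {
        "enabled": os.environ.get("MAXIM_LLM_ENABLED", "0") == "1",
        "profile": os.environ.get("MAXIM_LLM_PROFILE", "smollm-1.7b-instruct"),
        "quantization": os.environ.get("MAXIM_LLM_QUANTIZATION", "Q4_K_M"),
        "n_gpu_layers": int(os.environ.get("MAXIM_LLM_N_GPU_LAYERS", "0")),
    }

os.environ["MAXIM_LLM_ENABLED"] = "1"
print(llm_settings()["enabled"])  # → True
```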

Prompt Profiles

Three tiers of cognitive effort, matched to your hardware:

| Profile | Max Depth | LLM Calls | Parallelism | Best For |
| --- | --- | --- | --- | --- |
| minimal | 2 levels | 8 max | None | CPU-only, low RAM, Raspberry Pi |
| standard | 5 levels | 20 max | 4 workers | Laptop with GPU or fast CPU |
| rich | 7 levels | 50 max | 8 workers | Desktop with dedicated GPU |
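The tiers above can be encoded as plain data, which is handy when scripting around Maxim. The limits come from the table; the selection heuristic is hypothetical, not Maxim's own logic, and `workers: 1` stands in for the "None" parallelism of minimal.

```python
# Illustrative encoding of the profile table above.
PROFILES = {
    "minimal":  {"max_depth": 2, "max_llm_calls": 8,  "workers": 1},
    "standard": {"max_depth": 5, "max_llm_calls": 20, "workers": 4},
    "rich":     {"max_depth": 7, "max_llm_calls": 50, "workers": 8},
}

def pick_profile(has_gpu: bool, ram_gb: int) -> str:
    """Rough hardware-to-profile heuristic (hypothetical, not Maxim's logic)."""
    if has_gpu and ram_gb >= 16:
        return "rich"
    if has_gpu or ram_gb >= 8:
        return "standard"
    return "minimal"

print(pick_profile(has_gpu=False, ram_gb=4))  # → minimal
```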

Operating Modes

Maxim uses a decomposed mode architecture combining processing states (awake/sleep), operational modes (passive/active/singularity), and strategies (observe/explore/research/assist/reflect/learn). For backward compatibility, the CLI accepts legacy mode names that map to specific combinations. See the Operating Modes deep dive for full details.

Operational Modes & Autonomy

Operational modes map directly to autonomy levels:

| Mode | Autonomy | Behavior | Max Initiative |
| --- | --- | --- | --- |
| passive | planning | Proposes actions, waits for human approval | 0.3 |
| active | supervised | Acts within defined boundaries, escalates on uncertainty | 0.7 |
| singularity | autonomous | Full agency with self-correction (safety constraints still apply) | 1.0 |
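One way to think about the "max initiative" column: an action whose initiative score exceeds the mode's cap needs human approval. The helper below is a hypothetical model of that gating, not Maxim's internal API.

```python
# Initiative caps from the table above, encoded for illustration.
AUTONOMY_CAPS = {"planning": 0.3, "supervised": 0.7, "autonomous": 1.0}

def requires_approval(action_initiative: float, autonomy: str) -> bool:
    """True if an action exceeds the mode's cap and must wait for a human.
    Hypothetical helper; Maxim's internal gating may differ."""
    return action_initiative > AUTONOMY_CAPS[autonomy]

print(requires_approval(0.5, "planning"))    # → True
print(requires_approval(0.5, "supervised"))  # → False
```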

CLI Reference

Basic usage

    maxim [OPTIONS]

Connection

    --robot-name TEXT             Robot identifier (default: reachy_mini)
    --home-dir PATH               Data directory (default: data)
    --timeout FLOAT               Connection timeout in seconds (default: 30.0)

Execution

    --mode MODE                   exploration|live|observe|sleep|reflection|train|research
    --epochs INT                  Stop after N cycles (0 = unlimited)
    --audio true|false            Enable audio recording (default: true)
    --audio_len FLOAT             Transcription chunk duration (default: 5.0s)
    --interactive true|false      Enable keyboard input (default: true)

Agentic mode

    --language-model TEXT         LLM profile (e.g., mistral-7b)
    --prompt-profile TIER         minimal|standard|rich
    --autonomy LEVEL              planning|supervised|autonomous
    --autonomy-duration FLOAT     Timed autonomy in seconds
    --memory-path PATH            Memory persistence file
    --reset                       Clear memory on startup
    --enable-embeddings           Enable semantic similarity (Phase 4)

Network

    --internet-access             Enable internet (default)
    --no-internet                 Disable internet access

Audio / TTS

    --tts                         Enable text-to-speech
    --tts-model TEXT              Voice model (default: en_US-lessac-medium)

Exploration

    --explore [FOCUS]             Start exploration with optional focus topic
    --exploration-duration FLOAT  Session duration in seconds
    --exploration-autonomy LEVEL  supervised|autonomous
    --exploration-allow-scripts   Allow Python script generation
    --exploration-allow-training  Allow model training during exploration
    --resume-session ID           Resume a previous session
    --list-sessions               List available sessions

Maintenance

    --clear-cache                 Remove __pycache__ directories
    --clear-memory [TYPE]         Clear persistent memory and exit
    --verbosity 0|1|2             Logging level
    --agentic-verbosity 0|1|2|3   Agentic system logging detail
    --no-agentic-console          Suppress agentic event output

Voice Commands

All voice commands begin with the wake word "Maxim":

| Command | Effect |
| --- | --- |
| "Maxim sleep" | Enter sleep mode (audio monitoring only) |
| "Maxim wake up" | Return to previous active mode |
| "Maxim observe" | Switch to observe strategy |
| "Maxim explore" | Switch to explore strategy |
| "Maxim assist" | Switch to assist strategy |
| "Maxim reflect" | Switch to reflect strategy |
| "Maxim passive" / "active" / "singularity" | Switch operational mode |
| "Maxim shutdown" | Clean shutdown |

See the Operating Modes page for the full list. Custom voice commands can be added in data/util/phrase_responses.json.
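Adding a custom command means adding an entry to that JSON file. The helper below sketches this under the assumption that phrase_responses.json is a flat phrase→response object; your Maxim version's schema may differ, so inspect the file first.

```python
import json
from pathlib import Path

def add_phrase(path: str, phrase: str, response: str) -> None:
    """Add one phrase→response mapping. Assumes a flat JSON object,
    which may not match the exact schema of phrase_responses.json."""
    p = Path(path)
    data = json.loads(p.read_text()) if p.exists() else {}
    data[phrase] = response
    p.write_text(json.dumps(data, indent=2, ensure_ascii=False))
```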

Keyboard Controls

Available in interactive mode (default):

| Key | Action |
| --- | --- |
| c | Center vision |
| u | Mark trainable |
| 0 | Label: no errors |
| 1-9 | Label: error code |
| q | Quit |

Common Recipes

First Run (No Robot)

Test the system without hardware using simulation:

maxim --mode exploration

Full Agentic Mode with Mistral

    maxim --mode live \
      --language-model mistral-7b \
      --prompt-profile standard \
      --autonomy supervised

CPU-Only with Small Model

    CUDA_VISIBLE_DEVICES="" maxim \
      --mode live \
      --language-model smollm-1.7b \
      --prompt-profile minimal

Timed Autonomous Exploration

    maxim --explore "kitchen objects" \
      --exploration-duration 300 \
      --exploration-autonomy autonomous

Verbose Debugging Session

    maxim --mode live \
      --verbosity 2 \
      --agentic-verbosity 3 \
      --language-model phi-3-mini

Resume a Previous Session

    # List available sessions
    maxim --list-sessions

    # Resume by ID
    maxim --resume-session abc123

Memory Management

Clearing Memory

    # Clear everything
    maxim --clear-memory all

    # Clear specific types
    maxim --clear-memory focus,bounds   # Movement learning only
    maxim --clear-memory nac,hippo      # Decision learning + episodes
    maxim --clear-memory pain,fear      # Safety learning only

Available Memory Types

| Type | What It Clears | Effect |
| --- | --- | --- |
| focus | FocusLearner gains | Resets movement calibration |
| bounds | Workspace limits | Relearns reachable space |
| nac | Causal links | Forgets action-outcome predictions |
| scn | Temporal patterns | Forgets time-of-day associations |
| hippo | Episodic memories | Complete amnesia |
| pain | Pain thresholds | Resets pain sensitivity |
| fear | Fear associations | Forgets learned dangers |
| escalation | Escalation thresholds | Resets when to ask for help |
| threshold | Adaptive thresholds | Resets all learned limits |
| semantic | Neural embeddings | Resets similarity cache |
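Since --clear-memory takes a comma-separated list, a wrapper script may want to validate the types before invoking Maxim. This is a hypothetical helper built from the table above, not Maxim's actual CLI parser.

```python
# Known --clear-memory types from the table above.
MEMORY_TYPES = {"focus", "bounds", "nac", "scn", "hippo",
                "pain", "fear", "escalation", "threshold", "semantic"}

def parse_clear_arg(arg: str) -> set:
    """Validate a comma-separated --clear-memory value ('all' = everything).
    Hypothetical helper, not Maxim's actual CLI parser."""
    if arg == "all":
        return set(MEMORY_TYPES)
    requested = {t.strip() for t in arg.split(",")}
    unknown = requested - MEMORY_TYPES
    if unknown:
        raise ValueError(f"Unknown memory types: {sorted(unknown)}")
    return requested
```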

Where Data Lives

Directory structure

    data/
    ├── util/                 # Learned state (JSON)
    │   ├── hippocampus.json
    │   ├── nac_state.json
    │   ├── scn_state.json
    │   ├── focus_learner.json
    │   ├── learned_bounds.json
    │   ├── fear_learning.json
    │   └── ...
    ├── videos/               # Recorded video (MP4)
    ├── audio/                # Recorded audio (WAV)
    ├── transcript/           # Transcriptions (JSONL)
    ├── training/             # Motor training samples (JSONL)
    ├── plans/checkpoints/    # Goal tree snapshots
    ├── models/               # ML models
    │   ├── LLM/              # GGUF language models
    │   ├── MotorCortex/      # Vision-to-motor model
    │   └── YOLO/             # Object detection
    ├── sandbox/              # Safe space for generated files
    └── logs/                 # Session logs

Debugging

Verbosity Levels

| Level | What You See |
| --- | --- |
| 0 | Errors only |
| 1 | Key events (mode changes, goals, tool calls) |
| 2 | Detailed processing (every detection, memory query, decision) |
| 3 (agentic only) | Full trace (LLM prompts, raw responses, bridge activity) |

Inspecting Learned State

    # Python: inspect memory
    from maxim.memory.hippocampus import Hippocampus

    hippo = Hippocampus.load_from_file("data/util/hippocampus.json")
    print(f"Total memories: {len(hippo.memories)}")
    for mem_id, mem in list(hippo.memories.items())[:5]:
        print(f"  {mem.action.tool_name} → {mem.outcome.valence}")

    # Python: inspect NAc learning
    from maxim.decisions.nac import NAc

    nac = NAc.load_from_file("data/util/nac_state.json")
    for sig, pred in nac._learned_associations.items():
        print(f"  {sig}: conf={pred.confidence:.2f}, val={pred.valence}")

Hardware Diagnostics

maxim-diagnostics --host 192.168.1.100

Checks Zenoh connection, motor controller, video stream, and audio stream availability.

Log Files

Session logs are written to data/logs/reachy_log_YYYY-MM-DD_HHMMSS.log with timestamps, thread IDs, and structured event data.
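When collecting or pruning logs programmatically, the documented naming scheme is easy to match with a regular expression. Both `log_filename` and `LOG_PATTERN` below are hypothetical helpers mirroring that scheme, not part of Maxim.

```python
import re
from datetime import datetime

# Pattern matching the documented name: reachy_log_YYYY-MM-DD_HHMMSS.log
LOG_PATTERN = re.compile(r"reachy_log_(\d{4}-\d{2}-\d{2})_(\d{6})\.log")

def log_filename(now=None):
    """Build a log filename for a given moment (defaults to now).
    Hypothetical helper mirroring the documented naming scheme."""
    now = now or datetime.now()
    return now.strftime("reachy_log_%Y-%m-%d_%H%M%S.log")

print(log_filename(datetime(2025, 1, 2, 3, 4, 5)))  # → reachy_log_2025-01-02_030405.log
```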

Getting Help

Maxim is open source. If you run into issues, check the GitHub repository for the latest documentation and issue tracker.