MAXIM
DM Campaigns
D&D-style branching narratives as structured bio-system stress tests
Overview
Why D&D?
Tabletop RPG encounters are ideal stress tests for cognitive architectures. They require episodic memory (remembering NPCs, clues, combinations), causal reasoning (bribing a guard has consequences), temporal awareness (events happen in sequence), and pain response (combat hurts). A single campaign can exercise Hippocampus, NAc, SCN, PainBus, Cerebellum, and ATL in a controlled, reproducible way.
DM campaigns are hand-authored YAML scenarios with explicit structure: acts, encounters, NPC definitions, branching choices, dice checks, and bio-system expectations. The DM runtime drives the campaign as a state machine, delivering scenes through the simulation bridge and classifying the AUT's responses to determine which branch to follow.
DM Campaigns vs Generative Campaigns
DM Campaigns
- Hand-authored YAML with explicit branching
- Deterministic structure (seeded dice)
- Built-in bio-system expectations
- SEM entities with cascade resolution
- Best for: targeted subsystem testing
Generative Campaigns
- LLM narrator generates scenes dynamically
- Non-deterministic, arc-guided progression
- Story compression for long sessions
- Goal string drives narrative direction
- Best for: open-ended exploration
Quick Start
When you pass a campaign YAML path to --sim, Maxim detects the campaign: block and launches the DM runtime instead of the generative narrator. Here is what happens:
- Campaign YAML is parsed and validated (reachability, termination, dangling refs)
- SEM entities are created from player_character:, npcs:, and world_objects: specs
- Entity tools are auto-generated and registered (speak_to_marta, sense_guard_captain, etc.)
- The DM delivers the first encounter scene through the simulation bridge
- The AUT responds with tool calls and/or text; the DM classifies the choice
- Effects are applied, branches are followed, dice are rolled as needed
- After __END__, bio-system expectations are checked and a report is saved
Reports go to data/sim_reports/{session_id}/ with the standard report.json, actions.jsonl, and AUT memory snapshots, plus a campaign section with choices made, dice rolls, flags, and entity snapshots.
Campaign YAML Format
A campaign YAML has six top-level sections:
| Section | Purpose |
|---|---|
| campaign: | Name, goal string, seed for dice RNG |
| player_character: | SEM entity spec for the AUT's avatar |
| npcs: | Named NPC entity specs (sensors, modulators, persona) |
| world_objects: | Interactable objects (swords, doors, potions) |
| acts: / encounters: | Narrative structure with scenes, choices, branches, dice |
| expectations: | Bio-system assertions checked after campaign ends |
Available Campaigns
The Heist
scenarios/campaigns/heist_v1.yaml
3 encounters, 2 NPCs, 1 dice check. A paladin is recruited for a vault robbery. Tests Hippocampus (remembering the combination, NPC names), NAc (causal links from choices), and PainBus (combat damage).
The Poisoned Crown
scenarios/campaigns/poisoned_crown_v1.yaml
5 encounters, 3 NPCs, multiple branch points. A royal investigator solves the king's illness. Tests SCN (temporal bins), ATL (concept formation), relationships (trust), visibility (contextual reveal), and cascades.
The Arena
scenarios/campaigns/arena_v1.yaml
5 encounters, linear gauntlet. A gladiator fights for freedom through escalating opponents. Tests Cerebellum (rapid prediction learning), PainBus (sustained pain), NAc (fast causal learning, RPE spikes), and cascade (weapon degradation).
The Darkened Cavern
scenarios/campaigns/darkened_cavern_v1.yaml
6 encounters, 3 acts. A ranger progressively loses senses in a cave. Tests sensory gating (entity-modulated perception), Cerebellum (prediction under sensory change), PainBus (acuity threshold failures), and novelty decay.
How It Works
The DM runtime (simulation/dm_runtime.py) is a state machine that loops through encounters until it reaches __END__.
Each encounter can reference NPCs and objects by name. When an encounter starts, SceneState registers SEM tools for entities entering the scene and deregisters them when they leave. The AUT only sees tools for entities currently present.
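The loop described above can be sketched in a few lines. This is a stand-in, not the real DMRuntime API from simulation/dm_runtime.py: the encounter dict shape, the respond callable (playing the AUT's role), and the auto-advance key are all assumptions for illustration.

```python
END = "__END__"

def run_campaign(encounters, start, respond):
    """Drive encounters until __END__ and return the path taken.

    encounters: {name: {"scene": str, "choices": [...], "branches": {...}}}
    respond: callable(scene, choices) -> chosen choice name (AUT stand-in)
    Names and shapes are illustrative, not the real dm_runtime API.
    """
    path = []
    current = start
    while current != END:
        enc = encounters[current]
        path.append(current)
        if enc.get("choices"):
            # Decision point: classify the AUT's pick, follow its branch.
            choice = respond(enc["scene"], enc["choices"])
            current = enc["branches"][choice]
        else:
            # Choiceless encounters auto-advance in act order.
            current = enc["next"]
    return path
```

In the real runtime, SceneState tool registration and effect application happen inside this loop; they are omitted here to show only the control flow.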
Choice Classification
When an encounter offers choices (e.g., accept_job, decline, negotiate_pay), the DM needs to figure out which one the AUT picked. There are three classification layers, tried in order:
ChooseTool (Preferred)
A dynamic tool (tools_dm.py) that updates its valid options per encounter. When the AUT calls choose(option="accept_job"), the choice is unambiguous. Supports exact match, underscore/space normalization, and partial keyword matching.
Alias System
Before each encounter, the DM registers choice names as tool aliases in the executor. If the AUT calls a tool named accept_job, acceptjob, or accept job, the executor redirects it to choose. This catches cases where the LLM invents tool names matching the choice text.
Text / LLM Fallback
If the AUT does not use choose or a matching alias, the DM falls back to keyword matching on the response text and tool names. If that fails, a one-shot LLM classification prompt asks which choice the response most closely matches. As a last resort, the first choice is used as the default.
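The text-matching layers can be approximated as below. This sketch covers only normalization and partial keyword matching; the real classifier also inspects tool calls and has the LLM fallback, and its internals will differ.

```python
def classify_choice(response_text, choices):
    """Classify an AUT response against choice names.

    Sketch of the text-fallback layers only (normalization + partial
    keyword match + first-choice default); names are illustrative.
    """
    def norm(s):
        # Underscore/space normalization: "accept_job" -> "accept job"
        return s.lower().replace("_", " ").strip()

    text = norm(response_text)
    # Layer 1: exact normalized match.
    for c in choices:
        if norm(c) == text:
            return c
    # Layer 2: partial keyword match against the response text.
    for c in choices:
        if any(word in text for word in norm(c).split()):
            return c
    # Last resort, as in the runtime: default to the first choice.
    return choices[0]
```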
Bio-System Expectations
The expectations: block in a campaign YAML defines assertions that are checked after the campaign completes. This is the structured testing layer — each campaign targets specific subsystems.
| System | Check | What It Validates |
|---|---|---|
| hippocampus | min_episodic_captures | Memory formation is working under narrative load |
| hippocampus | recall_hit_on | Specific terms are retrievable from memory |
| nac | min_observations | Causal learning is triggering on actions |
| nac | prediction_confidence_above | At least one causal link has meaningful confidence |
| scn | temporal_bins_used | Temporal indexing is recording encounter timestamps |
| pain | min_signals | PainBus is publishing signals from combat/failures |
Results appear in the campaign report as pass/fail per check, with expected vs actual values. This makes campaigns function as regression tests for bio-system integration.
Writing Your Own Campaigns
Encounters
Each encounter needs a scene: (narrative text delivered to the AUT), optional active_npcs: and world_objects: (which entities are present), and choices: + branches: for decision points. An encounter without choices auto-advances to the next one in act order.
Branches
Map each choice to a target encounter name or __END__. The validator checks that all branch targets exist and that every encounter can reach __END__ through some path. Cycles are allowed (e.g., returning to a hub encounter).
NPCs and Entities
NPCs and objects use the standard SEM spec format. Add sensors (trust, health, durability), modulators (speak, slash, offer_payment), and metadata (persona_prompt, role). Entities are created once and persist across encounters — sensor values change as the campaign progresses.
Dice Checks
Attach a dice: block to any choice. Standard notation: 1d20, 2d6+3. The result is compared against a DC (difficulty class). On success, a flag is set. Dice rolls use the campaign's seeded RNG for reproducibility.
Dialogue Hints
Per-encounter dialogue_hints: map flags to NPC lines. A default: hint is used when no flags match. This lets NPC dialogue react to the player's earlier choices without LLM improvisation.
Flags and Effects
The on_choice: block lets you set flags when a choice is made. Flags persist across encounters and can influence dialogue hints, branch conditions, and reveal conditions. Flags are case-insensitive.
Validation
Before running, the campaign is validated for: reachability (all encounters reachable from start), termination (all paths can reach __END__), dangling branches, undefined NPC/object references, and unknown choice keys in on_choice.
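The graph checks can be sketched as below. This covers only reachability, termination, and dangling targets over explicit branches (auto-advance edges, entity references, and on_choice keys are omitted), and the function names are illustrative:

```python
from collections import deque

END = "__END__"

def validate_branches(encounters, start):
    """Reachability + termination checks over the explicit branch graph.

    Sketch only; auto-advance edges and entity-ref checks are omitted.
    """
    errors = []
    graph = {name: list(enc.get("branches", {}).values())
             for name, enc in encounters.items()}
    # Dangling branch targets.
    for name, targets in graph.items():
        for t in targets:
            if t != END and t not in encounters:
                errors.append(f"{name}: dangling branch target {t}")
    # Reachability from the start encounter (BFS).
    seen, queue = {start}, deque([start])
    while queue:
        for t in graph.get(queue.popleft(), []):
            if t != END and t in encounters and t not in seen:
                seen.add(t)
                queue.append(t)
    errors += [f"unreachable encounter: {n}" for n in encounters if n not in seen]
    # Termination: propagate "can reach __END__" backwards to a fixpoint.
    can_end = {n for n in encounters if END in graph[n]}
    changed = True
    while changed:
        changed = False
        for n, targets in graph.items():
            if n not in can_end and any(t in can_end for t in targets):
                can_end.add(n)
                changed = True
    errors += [f"cannot reach __END__: {n}" for n in encounters if n not in can_end]
    return errors
```

Note that cycles pass both checks as long as some path out of the cycle reaches __END__, matching the hub-encounter behavior described above.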
Cascade System
When an affordance fires (e.g., a sword slash), it may need to read from one entity and write to another. CascadeSpec defines these cross-entity effects in three phases:
1. Reads
Gather values from entity sensors. Each read has a ref path (e.g., wielder.strength.modifier) and an optional role name for use in expressions.
2. Writes
Apply changes to entity sensors. Supports an absolute value:, an additive delta:, or a computed expr: (referencing read values).
3. Side Effects
Same mechanics as writes but semantically separate. Used for secondary consequences (e.g., alerting nearby NPCs, triggering environmental changes).
Roles in ref paths (self, wielder, target) are resolved at execution time by the CascadeResolver, which maps role names to actual Entity objects based on context.
Visibility System
Entity sensors and details have three visibility levels:
- visible — Always shown in scene prompts and tool output
- hidden — Never shown to the AUT (internal state only)
- contextual — Hidden until a reveal_when condition is met
After each choice, the DM evaluates all contextual reveal conditions across all entities. When a condition passes, the item becomes permanently visible. This lets campaigns model information the AUT must earn through social interaction or exploration — testing whether the AUT uses newly revealed information is a strong signal for memory and reasoning quality.
Future: Generative DM
Not Yet Implemented
The --dm flag with a goal string (e.g., maxim --dm "run a heist scenario") is planned but not yet built. It would use an architect persona to generate campaign YAML on the fly from a goal string, then hand off to the existing DM runtime for execution.
The generative DM would combine the structured testing benefits of hand-authored campaigns (expectations, dice, branches) with the flexibility of goal-driven generation. The architect would produce valid campaign YAML — validated by the same reachability/termination checks — and the DM runtime would execute it unchanged. This is blocked on Agent Mesh Phase 2 (the architect needs to be a mesh agent) and a DM Spike to validate the approach.
Architecture
| Module | Purpose |
|---|---|
| simulation/dm_schema.py | Dataclasses, YAML loader, validator, dice roller, CascadeSpec, RevealCondition |
| simulation/dm_runtime.py | DMRuntime state machine, SceneState, CascadeResolver, choice classification |
| simulation/tools_dm.py | ChooseTool (dynamic per-encounter tool with fuzzy matching) |
| scenarios/campaigns/*.yaml | Campaign definitions (4 shipped) |