Sub-Agents

Multi-pane orchestration using Zellij for full observability. Sub-agents are spawned as separate WASM plugin instances with file-based communication.

Design

Each sub-agent runs in its own Zellij pane as a separate cowboy plugin instance. The parent spawns via zellij action launch-plugin, passing configuration through a serialized config string. Communication is entirely file-based.

Principles

  1. Full observability -- each sub-agent is a visible Zellij pane
  2. File-based communication -- all state lives on disk (prompt.md, response.md, lock)
  3. Recovery -- scan_existing() re-discovers sub-agent state from disk after crash
  4. Optional context inheritance -- via context parameter: "none" (fresh context, default), "summary" (summarizer-produced session summary), or "last_n" (system prompt + last 10 turns) -- configured with cheap models by default

Directory Structure

/home/agent/subagents/<type>-<uuid6>/
  prompt.md       # Task from parent (with YAML frontmatter: type, id, allowed_tools)
  conscious.md    # Working context (YAML frontmatter + markdown body)
  response.md     # Final output (written by sub-agent, triggers completion)
  lock            # Timestamp lock (exists while running)
  tools.json      # Filtered tool manifest for this sub-agent type

File Protocol

FileWritten ByRead BySemantics
prompt.mdParentSub-agentInitial task; loaded on first timer tick
tools.jsonParentSub-agentSubset of parent's tool manifest
conscious.mdSub-agentParent/observersLive frontmatter: status, tokens_used, current_task, progress
response.mdSub-agentParentFinal result; presence + lock removal = completion
lockSub-agentParentLiveness signal; removal without response.md = failure

Sub-Agent Types

Defined in SubAgentType enum. Each type restricts tool access:

TypeToolsDefault Model
Researchread, search, find, web-search, lsgpt-4o-mini / claude-3-5-haiku
Coderead, write, search, find, bash, ls, hashline-read, hashline-editgpt-4o-mini / claude-3-5-haiku
Reviewread, search, find, bash, lsgpt-4o-mini / claude-3-5-haiku

Cost optimization: SubagentConfig::cheap_defaults() prefers OpenAI gpt-4o-mini, falls back to Anthropic haiku.

Lifecycle

  1. Parent calls spawn_subagent(task, type) -- creates SpawnCommand
  2. Parent writes prompt.md and tools.json via run_sh (WASM can't write directly)
  3. Parent launches plugin pane via zellij action launch-plugin with serialized config
  4. Sub-agent loads -- on first timer tick, reads prompt.md + tools.json via run_command, writes lock
  5. Sub-agent works -- processes prompt as user message, updates conscious.md
  6. Sub-agent completes -- calls submit_response tool, which writes response.md, removes lock, closes pane
  7. Parent detects completion via heartbeat polling (check_subagent_status) and reads response.md
  8. Parent injects result as [Subagent Result: <id>] user message, continues conversation

Status Detection

Parent polls via shell command that checks all subdirectories:

  • lock exists -> Running
  • lock gone + response.md exists -> Completed
  • lock gone + no response.md -> Failed

Polling happens on heartbeat timer (every ~30s idle, ~0.3s active).

Recovery

SubAgentManager::scan_existing() walks the base directory, discovers sub-agents by presence of prompt.md, infers type from ID prefix, and determines status from lock/response files. This enables recovery after harness crash.

Cleanup

cleanup_completed() removes sub-agent directories older than 1 hour (CLEANUP_AGE_SECS = 3600).

Not Yet Implemented

  • history/ directory for compaction snapshots (design doc mentioned, not in code)
  • Sibling communication via inbox directories
  • Tee-style buffering (ring buffer + async writer) -- infeasible in WASM/WASI environment
  • PID-based liveness -- lock file uses timestamp instead (std::process::id unsupported in WASI)