Concurrency Locking

Session-level locking in the cowboy WASM plugin prevents race conditions during LLM requests, tool execution, and summarization.

Status

Fully implemented in src/lock.rs and integrated into the harness main loop, input handling, LLM dispatch, and tool result processing.

Design

Steering does NOT interrupt tool calls: user input is queued and processed after the current tool completes. Summarization gates user input the same way the Thinking state does. Tool results are matched by unique call_id, not by tool_name.
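
Matching by call_id matters when the same tool is dispatched twice at once. The sketch below is illustrative only: register_tool and complete_tool are named in this document, but their signatures, the ToolTracker wrapper, and the trimmed-down ToolEntry record are assumptions, not the code in src/lock.rs.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down record of one in-flight tool call.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolEntry {
    pub call_id: String,
    pub tool_name: String,
}

/// Hypothetical wrapper around the active-tool map.
#[derive(Debug, Default)]
pub struct ToolTracker {
    active: HashMap<String, ToolEntry>,
}

impl ToolTracker {
    /// Record a dispatched tool call under its unique call_id.
    pub fn register_tool(&mut self, call_id: &str, tool_name: &str) {
        self.active.insert(
            call_id.to_string(),
            ToolEntry {
                call_id: call_id.to_string(),
                tool_name: tool_name.to_string(),
            },
        );
    }

    /// Remove and return the entry for this call_id. None means the result
    /// matches no in-flight call (unknown ID or already completed).
    pub fn complete_tool(&mut self, call_id: &str) -> Option<ToolEntry> {
        self.active.remove(call_id)
    }

    /// Number of tool calls still in flight.
    pub fn active_count(&self) -> usize {
        self.active.len()
    }
}
```

With two concurrent calls to the same tool registered as "call-1" and "call-2", completing "call-2" removes only that entry. A map keyed by tool_name could not tell the two apart.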

Core Types

SessionLock

Tracks all concurrency state for a session:

  • llm_active / llm_request_id -- whether an LLM request is in flight, with generation-counter IDs to detect stale responses
  • summarizing -- whether a summarization call is active (gates user input)
  • active_tools: HashMap<String, ActiveToolState> -- tool executions tracked by unique call_id
  • input_queue: VecDeque<QueuedInput> -- queued user inputs waiting to be processed
  • queue_mode: QueueMode -- current queuing strategy
  • generation: u64 -- monotonic counter incremented on each LLM acquire
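
To make the generation counter concrete, here is a minimal sketch of how the LLM fields might interact. Only the field names and try_acquire_llm come from this document; the LlmLock type, the field types, and the accept_response and cancel methods are hypothetical and exist only for illustration.

```rust
/// Assumed subset of SessionLock covering only the LLM-related fields.
#[derive(Debug, Default)]
pub struct LlmLock {
    llm_active: bool,
    llm_request_id: Option<u64>,
    generation: u64,
}

impl LlmLock {
    /// Returns a fresh request ID, or None if a request is already in flight.
    pub fn try_acquire_llm(&mut self) -> Option<u64> {
        if self.llm_active {
            return None;
        }
        // Bump the monotonic counter so every acquire gets a distinct ID.
        self.generation += 1;
        self.llm_active = true;
        self.llm_request_id = Some(self.generation);
        Some(self.generation)
    }

    /// Hypothetical: accept a response only if its ID matches the in-flight
    /// request. A match releases the lock; anything else is treated as stale.
    pub fn accept_response(&mut self, request_id: u64) -> bool {
        if self.llm_active && self.llm_request_id == Some(request_id) {
            self.llm_active = false;
            self.llm_request_id = None;
            true
        } else {
            false
        }
    }

    /// Hypothetical: abandon the in-flight request so that a late response
    /// carrying the old ID no longer matches.
    pub fn cancel(&mut self) {
        self.llm_active = false;
        self.llm_request_id = None;
    }
}
```

Because the counter never decreases, a response from an abandoned request can never collide with the ID of a later one.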

QueueMode

Three modes for handling user input that arrives while the session is locked:

  • Collect (default): Coalesce multiple inputs into a single message
  • Followup: Process one input at a time after unlock
  • Steer: Inject input at the next tool boundary (between tool calls)
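
The difference between the modes can be sketched as a drain function. This is a guess at the shape, not the actual implementation: the queue holds plain String values instead of QueuedInput, drain_ready_inputs is written as a free function, Collect is assumed to join inputs with newlines, and Steer is assumed to release one input per drain like Followup, differing only in when the drain is triggered.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum QueueMode {
    #[default]
    Collect,
    Followup,
    Steer,
}

/// Returns the messages that should be processed now, removing them
/// from the queue. An empty queue always yields an empty Vec.
pub fn drain_ready_inputs(queue: &mut VecDeque<String>, mode: QueueMode) -> Vec<String> {
    match mode {
        // Coalesce everything queued so far into one message
        // (newline separator is an assumption).
        QueueMode::Collect => {
            if queue.is_empty() {
                return Vec::new();
            }
            let parts: Vec<String> = queue.drain(..).collect();
            vec![parts.join("\n")]
        }
        // Release a single input; the rest stay queued for later drains.
        QueueMode::Followup | QueueMode::Steer => queue.pop_front().into_iter().collect(),
    }
}
```

With "a" and "b" queued, Collect yields one message containing both and empties the queue, while Followup yields only "a" and leaves "b" for the next unlock.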

ActiveToolState

Per-tool tracking with call_id, tool_name, started_at, and timeout. Supports timeout detection via check_timeouts().
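
A rough sketch of timeout detection follows. The field names mirror the ones listed above, but the types are assumptions: timestamps are plain millisecond counters passed in by the caller (a convenient choice in WASM, where a system clock may not be available), and check_timeouts is shown as a free function over the active-tool map.

```rust
use std::collections::HashMap;

/// Assumed layout; started_at and timeout are modeled as milliseconds.
#[derive(Debug, Clone)]
pub struct ActiveToolState {
    pub call_id: String,
    pub tool_name: String,
    pub started_at_ms: u64,
    pub timeout_ms: u64,
}

impl ActiveToolState {
    /// True once at least timeout_ms have elapsed since the tool started.
    pub fn is_timed_out(&self, now_ms: u64) -> bool {
        now_ms.saturating_sub(self.started_at_ms) >= self.timeout_ms
    }
}

/// Removes every timed-out tool from the map and returns the removed
/// call_ids in sorted order, so callers can report or retry them.
pub fn check_timeouts(
    active: &mut HashMap<String, ActiveToolState>,
    now_ms: u64,
) -> Vec<String> {
    let mut expired: Vec<String> = active
        .values()
        .filter(|tool| tool.is_timed_out(now_ms))
        .map(|tool| tool.call_id.clone())
        .collect();
    expired.sort();
    for call_id in &expired {
        active.remove(call_id);
    }
    expired
}
```

Passing the current time in as an argument also keeps the check deterministic, which makes it straightforward to unit-test.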

Integration Points

  1. Input handling (input.rs): When session_lock.is_locked() returns true, user input is queued instead of being processed immediately
  2. LLM dispatch (llm.rs): try_acquire_llm() prevents concurrent LLM requests
  3. Tool execution (harness.rs): register_tool() on dispatch, complete_tool() on result
  4. Boundary detection (harness.rs): After tool completion, boundary_reached() checks if queued input should be injected
  5. Summarization gating (handlers.rs): set_summarizing(true/false) gates input during summarization API calls
  6. Status display (harness.rs): Lock state shown in the status bar with queued input count

Key Behaviors

  • is_locked() returns true if any of: LLM active, tools active, summarization active
  • at_tool_boundary() returns true only when all three are inactive: no LLM request in flight, no active tools, no summarization
  • drain_ready_inputs() respects the current QueueMode
  • Stale LLM responses are rejected via request ID matching
  • Timed-out tools are automatically cleaned up on heartbeat ticks
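
The two predicates and the input gate they drive could look roughly like this. It is a simplified model under stated assumptions: LockFlags, InputOutcome, and handle_input are hypothetical names, active tools are reduced to a count, and at_tool_boundary() is taken to be exactly the negation of is_locked(), which the real code may refine.

```rust
use std::collections::VecDeque;

/// Hypothetical flattened view of the lock state.
#[derive(Debug, Default)]
pub struct LockFlags {
    pub llm_active: bool,
    pub summarizing: bool,
    pub active_tool_count: usize,
    pub input_queue: VecDeque<String>,
}

/// Hypothetical result of routing one user input.
#[derive(Debug, PartialEq)]
pub enum InputOutcome {
    /// Session was idle; the caller should process the text now.
    ProcessNow(String),
    /// Session was locked; the value is the queue length after enqueueing.
    Queued(usize),
}

impl LockFlags {
    /// Locked if any of: LLM active, tools active, summarization active.
    pub fn is_locked(&self) -> bool {
        self.llm_active || self.summarizing || self.active_tool_count > 0
    }

    /// Assumed here to be the exact inverse of is_locked().
    pub fn at_tool_boundary(&self) -> bool {
        !self.is_locked()
    }

    /// Queue the input while locked, otherwise hand it back for processing.
    pub fn handle_input(&mut self, text: &str) -> InputOutcome {
        if self.is_locked() {
            self.input_queue.push_back(text.to_string());
            InputOutcome::Queued(self.input_queue.len())
        } else {
            InputOutcome::ProcessNow(text.to_string())
        }
    }
}
```

The queue length returned on enqueue is the kind of value a status bar could display as the queued input count.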