Result Summarization

Summarizes large tool outputs in the cowboy WASM plugin. Rather than truncating bluntly, it sends oversized output to a cheap model (Claude Haiku or GPT-4.1-mini) with a type-aware prompt, and falls back to type-aware truncation only when summarization fails.

Status

Fully implemented in src/summarize.rs. Integrated into the harness tool result pipeline and connected to the session lock for input gating during summarization.

Thresholds

  • DEFAULT_SUMMARIZE_THRESHOLD: 1500 characters (triggers summarization)
  • MAX_SUMMARY_LENGTH: 800 characters
  • MAX_SUMMARIZABLE_LENGTH: 50000 characters (caps input to summarization model)
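
As a rough illustration of how these limits might be applied, here is a minimal sketch. The constant names and values come from the list above. The function signatures, the strict "greater than" comparison, and counting Unicode scalar values rather than bytes are assumptions, not the actual contents of src/summarize.rs. cap_input is a hypothetical helper name.

```rust
// Values as documented above.
pub const DEFAULT_SUMMARIZE_THRESHOLD: usize = 1500;
pub const MAX_SUMMARY_LENGTH: usize = 800;
pub const MAX_SUMMARIZABLE_LENGTH: usize = 50_000;

/// Assumed check: an output needs summarization when it is longer than
/// the threshold, measured in characters (not bytes).
pub fn needs_summarization(output: &str, threshold: usize) -> bool {
    output.chars().count() > threshold
}

/// Hypothetical helper: cap the text sent to the summarization model at
/// MAX_SUMMARIZABLE_LENGTH characters without splitting a UTF-8 sequence.
pub fn cap_input(output: &str) -> &str {
    match output.char_indices().nth(MAX_SUMMARIZABLE_LENGTH) {
        // byte_idx is the start of the first character past the cap.
        Some((byte_idx, _)) => &output[..byte_idx],
        None => output,
    }
}
```

Slicing at an index returned by char_indices() keeps the cut on a character boundary, so multi-byte text never causes a panic.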

Providers

Supports both Anthropic and OpenAI for summarization:

  • Anthropic: claude-haiku-4-5-20251001 via Messages API
  • OpenAI: gpt-4.1-mini via Chat Completions API

The summarization provider is configurable independently of the main LLM provider. If no explicit summary provider is set, the main provider is used.
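
The fallback rule can be sketched in a few lines. The model identifiers are the ones listed above. The Provider enum, its method, and resolve_summary_provider are hypothetical names chosen for illustration and may not match the real configuration types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Anthropic,
    OpenAi,
}

impl Provider {
    /// Summarization model per provider, as documented above.
    pub fn summary_model(self) -> &'static str {
        match self {
            Provider::Anthropic => "claude-haiku-4-5-20251001",
            Provider::OpenAi => "gpt-4.1-mini",
        }
    }
}

/// Hypothetical resolution step: an explicit summary provider wins,
/// otherwise the main LLM provider is reused.
pub fn resolve_summary_provider(explicit: Option<Provider>, main: Provider) -> Provider {
    explicit.unwrap_or(main)
}
```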

Tool Output Types

Six categories, each with its own summarization prompt and head ratio (the share of retained output taken from the start of the text during fallback truncation):

Type              Tool Names             Head Ratio  Prompt Focus
----------------  ---------------------  ----------  -----------------------------------------
FileContent       read, cat              0.7         Structure, hashlines for relevant section
SearchResults     grep, rg, search       0.5         File paths with line numbers, grouped
DirectoryListing  ls, find, fd           0.3         Grouped by type, key files
CommandOutput     bash, sh               0.6         Preserve errors verbatim
StructuredData    nix-search, gh         0.5         Names, versions, key fields
WebContent        web-search, web-fetch  0.5         Main facts, relevant quotes
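
One way to encode this table is an enum with a lookup by tool name. The variant names, tool names, and ratios are taken from the table. The method names and the choice to return None for unlisted tools are assumptions, since the source does not say how unknown tools are classified.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolOutputType {
    FileContent,
    SearchResults,
    DirectoryListing,
    CommandOutput,
    StructuredData,
    WebContent,
}

impl ToolOutputType {
    /// Map a tool name to its category. Returning None for unlisted
    /// tools is an assumption made for this sketch.
    pub fn from_tool_name(name: &str) -> Option<Self> {
        match name {
            "read" | "cat" => Some(Self::FileContent),
            "grep" | "rg" | "search" => Some(Self::SearchResults),
            "ls" | "find" | "fd" => Some(Self::DirectoryListing),
            "bash" | "sh" => Some(Self::CommandOutput),
            "nix-search" | "gh" => Some(Self::StructuredData),
            "web-search" | "web-fetch" => Some(Self::WebContent),
            _ => None,
        }
    }

    /// Head ratio used by fallback truncation, per the table.
    pub fn head_ratio(self) -> f64 {
        match self {
            Self::FileContent => 0.7,
            Self::CommandOutput => 0.6,
            Self::SearchResults | Self::StructuredData | Self::WebContent => 0.5,
            Self::DirectoryListing => 0.3,
        }
    }
}
```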

Code File Handling

For FileContent, the summarization request splits the file into three sections:

  • Before section: lines above the relevant range (summarized)
  • Relevant section: formatted with hashlines (preserved exactly)
  • After section: lines below the relevant range (summarized)

Default relevant range is the middle third of the file.
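
A minimal sketch of the default split follows. The function names and the rounding behavior when the line count is not divisible by three are assumptions. Hashline formatting of the relevant section is not shown because its format is not described here.

```rust
use std::ops::Range;

/// Assumed definition of "middle third": skip the first third of the
/// lines and the last third, with integer division for both cuts.
pub fn default_relevant_range(total_lines: usize) -> Range<usize> {
    let third = total_lines / 3;
    third..(total_lines - third)
}

/// Split file content into (before, relevant, after) line groups.
pub fn split_sections(content: &str) -> (Vec<&str>, Vec<&str>, Vec<&str>) {
    let lines: Vec<&str> = content.lines().collect();
    let range = default_relevant_range(lines.len());
    let before = lines[..range.start].to_vec();
    let relevant = lines[range.clone()].to_vec();
    let after = lines[range.end..].to_vec();
    (before, relevant, after)
}
```

With this rounding, very short files (one or two lines) fall entirely into the relevant section, so nothing is summarized away.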

Pipeline

  1. Tool result arrives in handle_command_result()
  2. Full output is always logged to session JSONL (no truncation)
  3. If summarizer.needs_summarization() returns true:
    • Create PendingSummary with full output and type
    • Format API request via summarizer.format_request()
    • Call session_lock.set_summarizing(true) (gates user input)
    • Fire web request to summarization API
  4. On response in handle_summarization_response():
    • Release summarization lock
    • Parse provider-specific response
    • On failure: fallback_truncate() with type-aware head/tail ratio
    • Add summarized result to conversation context
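
The control flow above can be sketched as a small state machine. This is an illustration only: the Harness struct, its fields, and input_allowed are hypothetical, the web request is omitted, the assumption that short outputs go straight into the context is not stated in the source, and the failure branch uses a crude prefix cut as a stand-in for fallback_truncate().

```rust
const DEFAULT_SUMMARIZE_THRESHOLD: usize = 1500;
const MAX_SUMMARY_LENGTH: usize = 800;

#[derive(Default)]
pub struct SessionLock {
    summarizing: bool,
}

impl SessionLock {
    pub fn set_summarizing(&mut self, on: bool) {
        self.summarizing = on;
    }
    /// Hypothetical accessor: user input is gated while summarizing.
    pub fn input_allowed(&self) -> bool {
        !self.summarizing
    }
}

pub struct PendingSummary {
    pub full_output: String,
}

#[derive(Default)]
pub struct Harness {
    pub session_log: Vec<String>, // stands in for the session JSONL
    pub context: Vec<String>,     // conversation context
    pub session_lock: SessionLock,
    pub pending: Option<PendingSummary>,
}

impl Harness {
    pub fn handle_command_result(&mut self, output: String) {
        // Step 2: the full output is always logged, never truncated.
        self.session_log.push(output.clone());
        if output.chars().count() > DEFAULT_SUMMARIZE_THRESHOLD {
            // Step 3: remember the output, gate input, then (in the real
            // plugin) fire the web request to the summarization API.
            self.pending = Some(PendingSummary { full_output: output });
            self.session_lock.set_summarizing(true);
        } else {
            // Assumption: short outputs enter the context unchanged.
            self.context.push(output);
        }
    }

    pub fn handle_summarization_response(&mut self, response: Result<String, String>) {
        // Step 4: release the lock first so input is never stuck gated.
        self.session_lock.set_summarizing(false);
        let pending = match self.pending.take() {
            Some(p) => p,
            None => return,
        };
        let result = match response {
            Ok(summary) => summary,
            // Stand-in for fallback_truncate(): keep a bounded prefix.
            Err(_) => pending
                .full_output
                .chars()
                .take(MAX_SUMMARY_LENGTH)
                .collect::<String>(),
        };
        self.context.push(result);
    }
}
```

Releasing the lock before parsing means a malformed response cannot leave the session permanently blocked.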

Fallback Truncation

When the summarization request fails (for example, because the API is unavailable), fallback_truncate() splits the output into a head and a tail:

  • Respects line boundaries
  • Uses type-specific head ratios (e.g., 70% head for code, 30% for directory listings)
  • Inserts [...N chars omitted...] marker
  • Safe for UTF-8 (truncates at char boundaries)
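
A head/tail split with these properties might look like the sketch below. The signature (a character budget plus a head ratio) and the exact snapping rules are assumptions. Only the marker text and the four listed properties come from the description above.

```rust
/// Keep roughly `max_len` characters: `head_ratio` of them from the start
/// and the rest from the end, with an omission marker in between.
pub fn fallback_truncate(text: &str, max_len: usize, head_ratio: f64) -> String {
    let total = text.chars().count();
    if total <= max_len {
        return text.to_string();
    }
    let head_budget = (max_len as f64 * head_ratio.clamp(0.0, 1.0)).round() as usize;
    let tail_budget = max_len - head_budget;

    // Byte offset of the n-th character (or the end of the string).
    // Offsets from char_indices() are always valid char boundaries.
    let byte_at = |n: usize| text.char_indices().nth(n).map_or(text.len(), |(i, _)| i);

    // Head: snap back to the end of the last complete line, if any.
    let mut head_end = byte_at(head_budget);
    if let Some(nl) = text[..head_end].rfind('\n') {
        head_end = nl + 1;
    }

    // Tail: if the cut lands mid-line, snap forward to the next line start.
    let mut tail_start = byte_at(total - tail_budget);
    if !text[..tail_start].ends_with('\n') {
        if let Some(nl) = text[tail_start..].find('\n') {
            tail_start += nl + 1;
        }
    }

    let omitted = text[head_end..tail_start].chars().count();
    format!(
        "{}[...{} chars omitted...]\n{}",
        &text[..head_end],
        omitted,
        &text[tail_start..]
    )
}
```

Because snapping only ever shrinks the head and the tail, the result stays within the budget apart from the marker itself. Input with no newlines is still cut safely, just not on a line boundary.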