Result Summarization
Intelligent summarization of large tool outputs in the cowboy WASM plugin. Large outputs are condensed by a cheap model (Haiku/GPT-4.1-mini) using type-aware prompts rather than being bluntly truncated; truncation is used only as a fallback.
Status
Fully implemented in src/summarize.rs. Integrated into the harness tool result pipeline and connected to the session lock for input gating during summarization.
Thresholds
- DEFAULT_SUMMARIZE_THRESHOLD: 1500 characters (triggers summarization)
- MAX_SUMMARY_LENGTH: 800 characters
- MAX_SUMMARIZABLE_LENGTH: 50000 characters (caps input to summarization model)
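As a rough illustration of how these thresholds might be applied, the sketch below uses the constant names and values listed above. The two helper functions, their signatures, and the strict "greater than" comparison are assumptions made for the example, not the code in src/summarize.rs.

```rust
// Names and values come from the Thresholds list; everything else is assumed.
pub const DEFAULT_SUMMARIZE_THRESHOLD: usize = 1500;
pub const MAX_SUMMARY_LENGTH: usize = 800;
pub const MAX_SUMMARIZABLE_LENGTH: usize = 50_000;

/// Returns true when `output` has more than `threshold` characters.
/// Lengths are counted in characters, not bytes (assumed comparison: strict).
pub fn needs_summarization(output: &str, threshold: usize) -> bool {
    output.chars().count() > threshold
}

/// Caps the text sent to the summarization model at `max_chars` characters.
/// The cut is made on a char boundary, so the slice is always valid UTF-8.
pub fn cap_input(output: &str, max_chars: usize) -> &str {
    match output.char_indices().nth(max_chars) {
        // `byte_idx` is the byte offset of the first character to drop.
        Some((byte_idx, _)) => &output[..byte_idx],
        // Fewer than `max_chars` characters (or exactly that many): keep all.
        None => output,
    }
}
```

With these definitions, a 1500-character output is passed through unchanged, while a 1501-character output is sent for summarization after being capped at MAX_SUMMARIZABLE_LENGTH characters.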
Providers
Supports both Anthropic and OpenAI for summarization:
- Anthropic: claude-haiku-4-5-20251001 via Messages API
- OpenAI: gpt-4.1-mini via Chat Completions API
Provider is configurable independently of the main LLM provider. Falls back to main provider if no explicit summary provider is set.
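The fallback rule can be sketched as a small enum plus a resolver. The model identifiers are the ones listed above; the `Provider` type, its method names, and `resolve_summary_provider` are hypothetical names chosen for this example.

```rust
/// Hypothetical provider enum; the real type in the plugin may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Anthropic,
    OpenAi,
}

impl Provider {
    /// Summarization model per provider, as listed in this section.
    pub fn summary_model(self) -> &'static str {
        match self {
            Provider::Anthropic => "claude-haiku-4-5-20251001",
            Provider::OpenAi => "gpt-4.1-mini",
        }
    }

    /// API family used for the summarization request.
    pub fn api_name(self) -> &'static str {
        match self {
            Provider::Anthropic => "Messages API",
            Provider::OpenAi => "Chat Completions API",
        }
    }
}

/// Picks the explicit summary provider when one is configured,
/// otherwise falls back to the main LLM provider.
pub fn resolve_summary_provider(main: Provider, summary: Option<Provider>) -> Provider {
    summary.unwrap_or(main)
}
```

Modelling the summary provider as an `Option` keeps the "not configured" case explicit, so the fallback is a single `unwrap_or`.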
Tool Output Types
Six categories with type-specific summarization prompts and truncation ratios:
| Type | Tool Names | Head Ratio | Prompt Focus |
|---|---|---|---|
| FileContent | read, cat | 0.7 | Structure, hashlines for relevant section |
| SearchResults | grep, rg, search | 0.5 | File paths with line numbers, grouped |
| DirectoryListing | ls, find, fd | 0.3 | Grouped by type, key files |
| CommandOutput | bash, sh | 0.6 | Preserve errors verbatim |
| StructuredData | nix-search, gh | 0.5 | Names, versions, key fields |
| WebContent | web-search, web-fetch | 0.5 | Main facts, relevant quotes |
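The table maps naturally onto an enum. The following is a minimal sketch: the variant names, tool names, and head ratios are copied from the table, while `from_tool_name`, `head_ratio`, and the choice to return `None` for unlisted tools are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolOutputType {
    FileContent,
    SearchResults,
    DirectoryListing,
    CommandOutput,
    StructuredData,
    WebContent,
}

impl ToolOutputType {
    /// Classifies a tool by name. Unlisted tools yield `None` here;
    /// how the plugin treats them is not stated in this document.
    pub fn from_tool_name(name: &str) -> Option<Self> {
        use ToolOutputType::*;
        Some(match name {
            "read" | "cat" => FileContent,
            "grep" | "rg" | "search" => SearchResults,
            "ls" | "find" | "fd" => DirectoryListing,
            "bash" | "sh" => CommandOutput,
            "nix-search" | "gh" => StructuredData,
            "web-search" | "web-fetch" => WebContent,
            _ => return None,
        })
    }

    /// Fraction of the truncation budget given to the head of the output.
    pub fn head_ratio(self) -> f64 {
        match self {
            ToolOutputType::FileContent => 0.7,
            ToolOutputType::SearchResults => 0.5,
            ToolOutputType::DirectoryListing => 0.3,
            ToolOutputType::CommandOutput => 0.6,
            ToolOutputType::StructuredData => 0.5,
            ToolOutputType::WebContent => 0.5,
        }
    }
}
```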
Code File Handling
For FileContent, the summarization request is sectioned:
- Before section: lines above the relevant range (summarized)
- Relevant section: formatted with hashlines (preserved exactly)
- After section: lines below the relevant range (summarized)
Default relevant range is the middle third of the file.
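To make the default split concrete, here is a sketch that divides a file into the three sections by line count. `FileSections` and `split_middle_third` are hypothetical names, the integer-division rounding is an assumption, and hashline formatting of the relevant section is left out because its format is not described here.

```rust
/// The three parts of a file handed to the summarizer (hypothetical type).
pub struct FileSections<'a> {
    pub before: Vec<&'a str>,
    pub relevant: Vec<&'a str>,
    pub after: Vec<&'a str>,
}

/// Splits `content` by lines, treating the middle third as the relevant range.
/// Boundaries use integer division, so the thirds can differ by a line or two.
pub fn split_middle_third(content: &str) -> FileSections<'_> {
    let lines: Vec<&str> = content.lines().collect();
    let start = lines.len() / 3;
    let end = (lines.len() * 2) / 3;
    FileSections {
        before: lines[..start].to_vec(),    // summarized
        relevant: lines[start..end].to_vec(), // preserved exactly
        after: lines[end..].to_vec(),       // summarized
    }
}
```

For a nine-line file this yields lines 1 to 3 as the before section, lines 4 to 6 as the relevant section, and lines 7 to 9 as the after section.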
Pipeline
- Tool result arrives in handle_command_result()
- Full output is always logged to session JSONL (no truncation)
- If summarizer.needs_summarization() returns true:
  - Create PendingSummary with full output and type
  - Format API request via summarizer.format_request()
  - Set session_lock.set_summarizing(true) (gates user input)
  - Fire web request to summarization API
- On response in handle_summarization_response():
  - Release summarization lock
  - Parse provider-specific response
  - On failure: fallback_truncate() with type-aware head/tail ratio
  - Add summarized result to conversation context
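The control flow of the pipeline can be modelled in a few lines. This is a simplified sketch under several assumptions: `Harness`, `SessionLock`, and the fields of `PendingSummary` are stand-ins invented for the example, the web request is replaced by a boolean return value, and the failure path uses a plain 800-character cut as a placeholder for fallback_truncate().

```rust
const DEFAULT_SUMMARIZE_THRESHOLD: usize = 1500;
const MAX_SUMMARY_LENGTH: usize = 800;

/// Stand-in for the session lock; only the summarizing flag is modelled.
#[derive(Default)]
pub struct SessionLock {
    summarizing: bool,
}

impl SessionLock {
    pub fn set_summarizing(&mut self, on: bool) {
        self.summarizing = on;
    }
    /// User input is gated while a summarization is in flight.
    pub fn input_allowed(&self) -> bool {
        !self.summarizing
    }
}

/// Full output and tool name kept until the summary response arrives.
pub struct PendingSummary {
    pub tool_name: String,
    pub full_output: String,
}

#[derive(Default)]
pub struct Harness {
    pub lock: SessionLock,
    pub session_log: Vec<String>, // stands in for the session JSONL
    pub context: Vec<String>,     // stands in for the conversation context
    pub pending: Option<PendingSummary>,
}

impl Harness {
    /// Returns true when a summarization request would be fired.
    pub fn handle_command_result(&mut self, tool_name: &str, output: String) -> bool {
        // The full output is always logged, whatever happens next.
        self.session_log.push(output.clone());
        if output.chars().count() <= DEFAULT_SUMMARIZE_THRESHOLD {
            self.context.push(output);
            return false;
        }
        self.pending = Some(PendingSummary {
            tool_name: tool_name.to_string(),
            full_output: output,
        });
        self.lock.set_summarizing(true);
        true // the real plugin would fire the web request here
    }

    /// `response` is Ok(summary) on success, Err(reason) on any failure.
    pub fn handle_summarization_response(&mut self, response: Result<String, String>) {
        self.lock.set_summarizing(false);
        if let Some(pending) = self.pending.take() {
            let result = match response {
                Ok(summary) => summary,
                // Placeholder for fallback_truncate(): keep the first 800 chars.
                Err(_) => pending.full_output.chars().take(MAX_SUMMARY_LENGTH).collect(),
            };
            self.context.push(result);
        }
    }
}
```

Releasing the lock before inspecting the response means user input is never left gated, even when the request fails.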
Fallback Truncation
When the summarization API is unavailable or the request fails, fallback_truncate() keeps the head and tail of the output and drops the middle:
- Respects line boundaries
- Uses type-specific head ratios (e.g., 70% head for code, 30% for directory listings)
- Inserts [...N chars omitted...] marker
- Safe for UTF-8 (truncates at char boundaries)
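The properties above can be sketched as follows. This is not the implementation in src/summarize.rs: the signature is assumed, the budget is measured in bytes for simplicity (the omitted count in the marker is in characters), and the exact marker placement is a guess.

```rust
/// Largest index <= `idx` that lies on a char boundary of `s`.
fn floor_boundary(s: &str, idx: usize) -> usize {
    let mut i = idx.min(s.len());
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Smallest index >= `idx` that lies on a char boundary of `s`.
fn ceil_boundary(s: &str, idx: usize) -> usize {
    let mut i = idx.min(s.len());
    while !s.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Keeps roughly `max_len` bytes of `text`: a head sized by `head_ratio`
/// and a tail with the remainder, joined by an omission marker.
pub fn fallback_truncate(text: &str, max_len: usize, head_ratio: f64) -> String {
    if text.len() <= max_len {
        return text.to_string();
    }
    let head_budget = (max_len as f64 * head_ratio.clamp(0.0, 1.0)) as usize;
    let tail_budget = max_len - head_budget;

    // Head: cut on a char boundary, then back up to the last complete line.
    let mut head_end = floor_boundary(text, head_budget);
    if let Some(nl) = text[..head_end].rfind('\n') {
        head_end = nl + 1;
    }

    // Tail: start on a char boundary, then skip forward to the next line start.
    let mut tail_start = ceil_boundary(text, text.len() - tail_budget);
    if let Some(nl) = text[tail_start..].find('\n') {
        tail_start += nl + 1;
    }

    let head = &text[..head_end];
    let tail = &text[tail_start..];
    let omitted = text[head_end..tail_start].chars().count();
    // Put the marker on its own line unless the head already ends with one.
    let sep = if head.is_empty() || head.ends_with('\n') { "" } else { "\n" };
    format!("{}{}[...{} chars omitted...]\n{}", head, sep, omitted, tail)
}
```

A call such as `fallback_truncate(&output, 800, 0.7)` would keep about 560 bytes from the start and 240 from the end, which suits code files where the opening lines (imports, declarations) tend to carry the most context.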