Result Summarization

Intelligent summarization of large tool outputs in the cowboy WASM plugin. Uses a cheap model (Haiku/GPT-4.1-mini) instead of blunt truncation, with type-aware prompts and fallback truncation.

Status

Fully implemented in src/summarize.rs. Integrated into the harness tool result pipeline and connected to the session lock for input gating during summarization.

Thresholds

DEFAULT_SUMMARIZE_THRESHOLD: 1500 characters (triggers summarization)
MAX_SUMMARY_LENGTH: 800 characters
MAX_SUMMARIZABLE_LENGTH: 50000 characters (caps input to summarization model)
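
As a rough illustration of how these thresholds might be applied, here is a minimal sketch. The constant names and values come from the list above. The function bodies, the cap_input helper, and the choice to count Unicode characters (not bytes) with a strict "greater than" comparison are assumptions, not the actual code in src/summarize.rs.

```rust
// Constant names and values are taken from this document.
pub const DEFAULT_SUMMARIZE_THRESHOLD: usize = 1500;
pub const MAX_SUMMARY_LENGTH: usize = 800;
pub const MAX_SUMMARIZABLE_LENGTH: usize = 50_000;

/// Returns true when `output` is long enough to be summarized.
/// Assumption: length is counted in characters and the check is strict.
pub fn needs_summarization(output: &str, threshold: usize) -> bool {
    output.chars().count() > threshold
}

/// Hypothetical helper: caps the text sent to the summarization model
/// at `max_chars` characters without splitting a UTF-8 sequence.
pub fn cap_input(output: &str, max_chars: usize) -> &str {
    match output.char_indices().nth(max_chars) {
        // Byte offset of the first character past the cap.
        Some((byte_idx, _)) => &output[..byte_idx],
        // Fewer than `max_chars` characters: nothing to cut.
        None => output,
    }
}
```

MAX_SUMMARY_LENGTH would bound the model's reply in the same way, for example by passing the summary through cap_input before it is added to the context.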

Providers

Supports both Anthropic and OpenAI for summarization:

Anthropic: claude-haiku-4-5-20251001 via Messages API
OpenAI: gpt-4.1-mini via Chat Completions API

The summarization provider is configurable independently of the main LLM provider. If no explicit summary provider is set, the main provider is used.
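
The fallback rule can be sketched as follows. The model identifiers are the ones listed above. The Provider enum, summary_model, and resolve_summary_provider are hypothetical names chosen for illustration and may not match the plugin's real types.

```rust
/// Hypothetical enum for the two supported summarization providers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Anthropic,
    OpenAi,
}

impl Provider {
    /// Summarization model for each provider, as listed in this document.
    pub fn summary_model(self) -> &'static str {
        match self {
            Provider::Anthropic => "claude-haiku-4-5-20251001",
            Provider::OpenAi => "gpt-4.1-mini",
        }
    }
}

/// Uses the explicit summary provider when one is configured,
/// otherwise falls back to the main LLM provider.
pub fn resolve_summary_provider(main: Provider, summary: Option<Provider>) -> Provider {
    summary.unwrap_or(main)
}
```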

Tool Output Types

Six categories with type-specific summarization prompts and truncation ratios:

Type	Tool Names	Head Ratio	Prompt Focus
FileContent	read, cat	0.7	Structure, hashlines for relevant section
SearchResults	grep, rg, search	0.5	File paths with line numbers, grouped
DirectoryListing	ls, find, fd	0.3	Grouped by type, key files
CommandOutput	bash, sh	0.6	Preserve errors verbatim
StructuredData	nix-search, gh	0.5	Names, versions, key fields
WebContent	web-search, web-fetch	0.5	Main facts, relevant quotes
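
The table above maps naturally onto an enum with a lookup by tool name. The sketch below assumes an enum called ToolOutputType with from_tool_name and head_ratio methods. These names, and returning None for an unrecognized tool, are assumptions. Only the category names, tool names, and ratios come from the table.

```rust
/// The six output categories from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolOutputType {
    FileContent,
    SearchResults,
    DirectoryListing,
    CommandOutput,
    StructuredData,
    WebContent,
}

impl ToolOutputType {
    /// Classifies a tool by name. How the real plugin treats unknown
    /// tools is not stated in this document; None is an assumption.
    pub fn from_tool_name(name: &str) -> Option<Self> {
        match name {
            "read" | "cat" => Some(Self::FileContent),
            "grep" | "rg" | "search" => Some(Self::SearchResults),
            "ls" | "find" | "fd" => Some(Self::DirectoryListing),
            "bash" | "sh" => Some(Self::CommandOutput),
            "nix-search" | "gh" => Some(Self::StructuredData),
            "web-search" | "web-fetch" => Some(Self::WebContent),
            _ => None,
        }
    }

    /// Fraction of the fallback truncation budget given to the head.
    pub fn head_ratio(self) -> f64 {
        match self {
            Self::FileContent => 0.7,
            Self::SearchResults => 0.5,
            Self::DirectoryListing => 0.3,
            Self::CommandOutput => 0.6,
            Self::StructuredData => 0.5,
            Self::WebContent => 0.5,
        }
    }
}
```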

Code File Handling

For FileContent, the summarization request is split into three sections:

Before section: lines above the relevant range (summarized)
Relevant section: formatted with hashlines (preserved exactly)
After section: lines below the relevant range (summarized)

Default relevant range is the middle third of the file.
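
A minimal sketch of the default split, assuming the "middle third" is computed with integer division on the line count. The function names default_relevant_range and split_sections are hypothetical, and the hashline formatting of the relevant section is not shown here.

```rust
use std::ops::Range;

/// Middle third of a file, as a half-open range of line indices.
/// Assumption: boundaries are rounded down by integer division.
pub fn default_relevant_range(total_lines: usize) -> Range<usize> {
    (total_lines / 3)..(2 * total_lines / 3)
}

/// Splits lines into (before, relevant, after) around `relevant`.
/// The range is clamped so out-of-bounds input cannot panic.
pub fn split_sections<'a>(
    lines: &'a [&'a str],
    relevant: Range<usize>,
) -> (&'a [&'a str], &'a [&'a str], &'a [&'a str]) {
    let start = relevant.start.min(lines.len());
    let end = relevant.end.clamp(start, lines.len());
    (&lines[..start], &lines[start..end], &lines[end..])
}
```

In this scheme the before and after slices would be summarized, while the relevant slice is passed through unchanged.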

Pipeline

1. Tool result arrives in handle_command_result()
2. Full output is always logged to session JSONL (no truncation)
3. If summarizer.needs_summarization() returns true:
- Create PendingSummary with full output and type
- Format API request via summarizer.format_request()
- Call session_lock.set_summarizing(true) (gates user input)
- Fire web request to summarization API
4. On response in handle_summarization_response():
- Release summarization lock
- Parse provider-specific response
- On failure: fallback_truncate() with type-aware head/tail ratio
- Add summarized result to conversation context
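
The control flow above can be sketched as a small state machine. This is a simplified model, not the harness code. The Harness struct, its fields, and the is_summarizing getter are hypothetical. The web request is omitted. The 800-character cut stands in for fallback_truncate(). Adding short outputs to the context unchanged is an assumption the text does not state.

```rust
/// Simplified stand-in for the session lock that gates user input.
#[derive(Default)]
pub struct SessionLock {
    summarizing: bool,
}

impl SessionLock {
    pub fn set_summarizing(&mut self, on: bool) {
        self.summarizing = on;
    }
    pub fn is_summarizing(&self) -> bool {
        self.summarizing
    }
}

/// Output waiting for a summarization response (type field omitted).
pub struct PendingSummary {
    pub full_output: String,
}

/// Hypothetical harness state; `session_log` stands in for the JSONL log.
#[derive(Default)]
pub struct Harness {
    pub session_lock: SessionLock,
    pub session_log: Vec<String>,
    pub context: Vec<String>,
    pub pending: Option<PendingSummary>,
}

impl Harness {
    pub fn handle_command_result(&mut self, output: String) {
        // The full output is always logged, untruncated.
        self.session_log.push(output.clone());
        if output.chars().count() > 1500 {
            // Gate user input, then (in the real plugin) fire the request.
            self.session_lock.set_summarizing(true);
            self.pending = Some(PendingSummary { full_output: output });
        } else {
            // Assumption: short outputs go into the context as-is.
            self.context.push(output);
        }
    }

    pub fn handle_summarization_response(&mut self, response: Result<String, String>) {
        // The lock is released whether or not the request succeeded.
        self.session_lock.set_summarizing(false);
        if let Some(pending) = self.pending.take() {
            let result = match response {
                Ok(summary) => summary,
                // Placeholder for fallback_truncate().
                Err(_) => pending.full_output.chars().take(800).collect(),
            };
            self.context.push(result);
        }
    }
}
```

Releasing the lock before inspecting the response means a failed request cannot leave user input gated.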

Fallback Truncation

When the summarization API is unavailable or the request fails, fallback_truncate() splits the output into a head and a tail:

Respects line boundaries
Uses type-specific head ratios (e.g., 70% head for code, 30% for directory listings)
Inserts [...N chars omitted...] marker
Safe for UTF-8 (truncates at char boundaries)
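
One way the properties above could fit together is sketched below. The signature, the byte-based max_len budget, and the exact placement of the marker are assumptions. The real fallback_truncate() in src/summarize.rs may differ.

```rust
/// Keeps a head and a tail of `text` and replaces the middle with a marker.
/// Assumptions: `max_len` is a byte budget for head plus tail (the marker
/// is extra), and `head_ratio` is the head's share of that budget.
pub fn fallback_truncate(text: &str, max_len: usize, head_ratio: f64) -> String {
    if text.len() <= max_len {
        return text.to_string();
    }
    let head_budget = (max_len as f64 * head_ratio.clamp(0.0, 1.0)) as usize;
    let tail_budget = max_len - head_budget;

    // Head: back off to a char boundary, then to the last complete line.
    let mut head_end = head_budget;
    while !text.is_char_boundary(head_end) {
        head_end -= 1;
    }
    if let Some(nl) = text[..head_end].rfind('\n') {
        head_end = nl + 1;
    }

    // Tail: advance to a char boundary, then to the next line start
    // unless the cut already sits on one.
    let mut tail_start = text.len() - tail_budget;
    while !text.is_char_boundary(tail_start) {
        tail_start += 1;
    }
    if text.as_bytes()[tail_start - 1] != b'\n' {
        if let Some(nl) = text[tail_start..].find('\n') {
            tail_start += nl + 1;
        }
    }

    let omitted = text[head_end..tail_start].chars().count();
    format!(
        "{}[...{} chars omitted...]\n{}",
        &text[..head_end],
        omitted,
        &text[tail_start..]
    )
}
```

When the head or tail contains no newline, this sketch falls back to a plain char-boundary cut, so a single very long line is still truncated safely.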
