Feature/distill token optimization #3181

Tomo1912 · 2026-01-05T14:27:25Z

Description

Add an optional distill parameter to the fetch server that aggressively cleans HTML content to minimize token usage. When enabled, it removes scripts, styles, navigation, headers, footers, ads, and other non-essential elements before conversion to markdown. Across the three real-world pages tested, total token count dropped by 72.8%, with per-page reductions ranging from 5.0% to 99.4%.

Server Details

Server: fetch
Changes to: tools (added distill parameter to fetch tool)

Motivation and Context

LLM token costs are a significant operational expense. The fetch tool currently returns full HTML, including navigation menus, ads, scripts, and UI clutter, which wastes tokens on non-content elements.

Test Results:

Website	Standard Tokens	Distilled Tokens	Reduction
MCP Docs	2,154	13	99.4%
TechCrunch	506	263	48.0%
Python Docs	662	629	5.0%
Average	1,107	302	72.8%
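For reference, the figures in the table can be reproduced with a few lines of arithmetic. The sketch below uses only the token counts reported above; the variable and function names are illustrative. It suggests that the 72.8% in the Average row corresponds to the reduction in total (equivalently, averaged) token counts, not to the mean of the three per-site percentages.

```python
# Token counts copied from the test results table: (standard, distilled).
results = {
    "MCP Docs": (2154, 13),
    "TechCrunch": (506, 263),
    "Python Docs": (662, 629),
}


def reduction(standard: int, distilled: int) -> float:
    """Percentage of tokens saved when going from standard to distilled."""
    return 100.0 * (1 - distilled / standard)


# Reduction for each site on its own.
per_site = {name: reduction(s, d) for name, (s, d) in results.items()}

# Reduction computed from the summed token counts across all sites.
total_standard = sum(s for s, _ in results.values())
total_distilled = sum(d for _, d in results.values())
aggregate = reduction(total_standard, total_distilled)

# Unweighted mean of the per-site percentages, for comparison.
mean_of_sites = sum(per_site.values()) / len(per_site)

for name, pct in per_site.items():
    print(f"{name}: {pct:.1f}%")
print(f"Aggregate reduction: {aggregate:.1f}%")
print(f"Mean of per-site reductions: {mean_of_sites:.1f}%")
```

The aggregate figure is dominated by the largest page (MCP Docs), so the per-site numbers are the better guide to what a given page is likely to save.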

How Has This Been Tested?

Tested with Claude Desktop as MCP client
Tested against multiple real-world websites (documentation sites, news sites, technical docs)
Verified backward compatibility - existing calls without distill parameter work unchanged

Breaking Changes

None. The distill parameter defaults to False, maintaining full backward compatibility.
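A minimal sketch of why a default of False keeps existing calls working. `FetchArgs` and `select_content` are hypothetical names used for illustration only; the real fetch server's argument model and fields may differ.

```python
from dataclasses import dataclass


@dataclass
class FetchArgs:
    """Hypothetical argument model; only `distill` is taken from this PR."""

    url: str
    distill: bool = False  # opt-in: omitted by existing callers


def select_content(args: FetchArgs, standard: str, distilled: str) -> str:
    """Return the distilled text only when the caller asked for it."""
    return distilled if args.distill else standard


# An existing call that never mentions distill still constructs cleanly
# and receives the standard output.
old_call = FetchArgs(**{"url": "https://example.com"})
new_call = FetchArgs(**{"url": "https://example.com", "distill": True})

print(select_content(old_call, "standard output", "distilled output"))
print(select_content(new_call, "standard output", "distilled output"))
```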

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update

Checklist

I have read the MCP Protocol Documentation
My changes follow MCP security best practices
I have updated the server's README accordingly
I have tested this with an LLM client
My code follows the repository's style guidelines
New and existing tests pass locally
I have added appropriate error handling
I have documented all environment variables and configuration options

Additional context

No new dependencies required - uses existing readabilipy for content extraction.

Add distill parameter to aggressively clean HTML before processing:

- Remove scripts, styles, navigation, headers, footers
- Remove ads, sidebars, popups, cookie banners
- Remove social widgets and non-content elements
- Normalize whitespace

Typical token reduction: 60-85%

This is an opt-in feature (distill=false by default) to maintain backward compatibility.
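The cleaning steps listed above can be sketched roughly as follows. This is not the PR's implementation: the PR states it relies on readabilipy, whereas this sketch swaps in Python's standard-library `html.parser` so that it runs without dependencies. The `SKIP_TAGS` set, `DistillParser`, and `distill_html` are assumptions made for illustration, and ads, popups, and cookie banners (which usually need class or id matching) are not handled here.

```python
import re
from html.parser import HTMLParser

# Hypothetical tag list for illustration; the PR's actual rules may differ.
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside", "iframe"}


class DistillParser(HTMLParser):
    """Collects text that is not nested inside any tag listed in SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # how many skipped elements we are currently inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        # Limitation: an unclosed skipped tag would swallow the rest of the page.
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def distill_html(html: str) -> str:
    """Drop non-content elements, then collapse runs of whitespace."""
    parser = DistillParser()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps adjacent block elements from running together.
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


sample = (
    "<html><head><style>p{color:red}</style><script>var x=1;</script></head>"
    "<body><nav><a href='/'>Home</a></nav><header>Site</header>"
    "<p>Hello   <b>world</b></p><footer>Contact</footer></body></html>"
)
print(distill_html(sample))
```

A tag-based filter like this is deliberately blunt: it can discard a `header` element that holds an article's title, which is one reason to keep the behaviour opt-in.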

Tomo1912 force-pushed the feature/distill-token-optimization branch 5 times, most recently from 4322b3e to a3cff79 on January 8, 2026 21:40

Tomo1912 force-pushed the feature/distill-token-optimization branch from a3cff79 to 98c0dd3 on January 8, 2026 22:04
