My Claude Code sessions were eating 150k tokens per task, so I built something about it

By bdadmin

17 April 2026 21:07

Introducing Token Reducer: A Solution to Optimize Code Context for Claude Code Users

Over the past few weeks I have been working on a problem that affects many developers who use AI coding tools on large repositories. When I used Claude Code on medium to large projects, the context window filled up quickly with material that had nothing to do with the task: file contents that were never needed, unused imports, and boilerplate code. That raised a question: how can context be compressed intelligently before Claude processes it?

The Solution: Token Reducer

To tackle this challenge, I developed Token Reducer, a plugin for Claude Code that optimizes repository context locally. It processes your project's context on your own machine before any data is sent to the cloud, so the raw repository stays under your control and only the reduced context goes to Claude. Here is a breakdown of how Token Reducer works:

AST-Based Chunking: Rather than using simple text splitting, Token Reducer parses your code into meaningful units, such as functions, classes, and code blocks.
Hybrid Retrieval: The plugin utilizes a combination of BM25 keyword matching and vector similarity techniques to identify the most relevant chunks related to your current task.
TextRank Compression: An extractive summarization method is applied to retain significant parts of the code while discarding irrelevant noise, streamlining what is sent to Claude.
Import Graph Mapping: Token Reducer traces code dependencies, ensuring that related pieces of code remain grouped together, which is crucial for maintaining context.
2-Hop Symbol Expansion: If the function you are working on calls another function, Token Reducer automatically includes the called function's context as well, following such references up to two hops.
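
To make the first two steps concrete, here is a minimal sketch of AST-based chunking followed by BM25 keyword ranking, using only the Python standard library. It is not Token Reducer's actual code: the names `SOURCE`, `chunk_source`, `tokenize`, and `bm25_rank`, the sample input, and the `k1` and `b` defaults are all assumptions made for illustration. It handles only top-level Python definitions and leaves out the vector similarity, TextRank, and graph steps.

```python
import ast
import math
import re
from collections import Counter

# Hypothetical sample input; any Python source string would do.
SOURCE = '''
import json

def load_config(path):
    """Read a JSON config file from disk."""
    with open(path) as handle:
        return json.load(handle)

def render_banner(title):
    return "=" * 10 + title + "=" * 10

class RetryPolicy:
    def __init__(self, attempts):
        self.attempts = attempts
'''


def chunk_source(source):
    """Return (name, text) pairs, one per top-level function or class."""
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, wanted):
            chunks.append((node.name, ast.get_source_segment(source, node)))
    return chunks


def tokenize(text):
    """Lowercase and split on non-alphanumerics (load_config -> load, config)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, chunks, k1=1.5, b=0.75):
    """Score every chunk against the query with BM25, best match first."""
    docs = [tokenize(text) for _, text in chunks]
    avg_len = sum(len(doc) for doc in docs) / len(docs)
    # Number of chunks that contain each term at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    query_terms = set(tokenize(query))
    scored = []
    for (name, _), doc in zip(chunks, docs):
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


chunks = chunk_source(SOURCE)
ranked = bm25_rank("load json config file", chunks)
print(ranked[0][0])  # load_config is the only chunk sharing terms with the query
```

Chunking on syntax boundaries means a retrieved chunk is always a complete function or class, never a fragment cut mid-body. A hybrid retriever would combine the BM25 score with an embedding similarity score before choosing what to keep.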

In my tests across several programming languages, including Python, TypeScript, and JavaScript, context size dropped by between 90% and 98% without losing the code needed for the task at hand.
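
To put that range in perspective, here is the arithmetic applied to the 150k-token figure from the title. This is a back-of-the-envelope sketch that assumes the reduction applies uniformly to a 150,000-token task, which is my assumption rather than a measured result; `tokens_after_reduction` is a hypothetical helper, not part of the plugin.

```python
def tokens_after_reduction(original_tokens: int, reduction_percent: int) -> int:
    """Tokens left after removing reduction_percent of the original context."""
    return original_tokens * (100 - reduction_percent) // 100


for percent in (90, 98):
    # Prints "90 15000" and then "98 3000".
    print(percent, tokens_after_reduction(150_000, percent))
```

Under that assumption, a task that used to consume 150,000 tokens would need roughly 3,000 to 15,000.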

Development Process

The development of Token Reducer was an iterative process that relied heavily on feedback from Claude itself. I started with a basic chunking mechanism and refined it through extensive testing against real coding scenarios. The end result is a reliable tool that enhances productivity and preserves context.

Get Started with Token Reducer

Token Reducer is available for free under an MIT license. You can add it to your toolkit through the plugin marketplace, or access the source code on GitHub:

Plugin Marketplace: /plugin marketplace add Madhan230205/token-reducer
GitHub Repository: Madhan230205/token-reducer

Feedback and Contributions

As Token Reducer is still in its early stages, I welcome any feedback from users. Please share your experience regarding:

How the compression affected your workflow.
Instances where important context might have been lost.
Recommendations for better handling of specific languages or repository structures.

If you’re interested in contributing to the project, the repository is open for improvements. There is significant potential to optimize for various languages, enhance caching strategies, and refine retrieval parameters.

Thank you for your interest, and I look forward to your thoughts and questions in the comments!

One Comment

bdadmin
23 June 2026 at 22:28

This development is a substantial step forward in managing the inherent challenges of working with large codebases and AI-assisted development tools. By intelligently pruning irrelevant or redundant information—especially through AST-based chunking and dependency-aware grouping—Token Reducer effectively preserves essential context while dramatically reducing token consumption.

This approach aligns well with broader trends in prompt engineering and context management, particularly while models with limited token windows remain prevalent. The combination of hybrid retrieval and import graph mapping also reflects a nuanced understanding of code semantics, which more naive summarization methods often overlook.

One area ripe for further exploration could involve integrating adaptive heuristics that dynamically adjust the granularity of context reduction based on the current task complexity or user feedback. Additionally, incorporating language-specific optimizations might enhance accuracy, given syntactic and structural differences across languages like Python versus JavaScript.

Overall, tools like Token Reducer exemplify how intelligent preprocessing can extend the utility of AI coding assistants, not by replacing their capabilities but by enabling more focused, efficient interactions within their operational constraints. This is an exciting development for the future of AI-driven software development workflows.