The open-source tool pxpipe converts long text inputs for Claude Code into compact PNGs to cut token costs.
The trick works because of how Anthropic prices images. Text costs roughly one token per character, but images cost a fixed number of tokens based on their pixel dimensions, no matter how much text they contain. Render dense content like code or JSON as an image, and you can pack about 3.1 characters into every image token.
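A rough Python sketch makes the pricing argument concrete. It assumes the rule of thumb from Anthropic's vision documentation, tokens ≈ width × height / 750, which may not hold for every model or after server-side resizing. The function names and page dimensions are illustrative and are not taken from pxpipe.

```python
import math

# Assumption: Anthropic's published approximation for image cost.
# Actual billing may differ by model and after server-side resizing.
PIXELS_PER_TOKEN = 750


def estimate_image_tokens(width_px: int, height_px: int) -> int:
    """Approximate token cost of an image from its pixel dimensions only."""
    return math.ceil(width_px * height_px / PIXELS_PER_TOKEN)


def chars_per_image_token(num_chars: int, width_px: int, height_px: int) -> float:
    """How many characters each image token carries for one rendered page."""
    return num_chars / estimate_image_tokens(width_px, height_px)


# The cost depends on the canvas, not on how much text is drawn on it:
# a hypothetical 1500 x 1000 page costs the same with 2,000 or 6,000 characters.
page_tokens = estimate_image_tokens(1500, 1000)
sparse = chars_per_image_token(2_000, 1500, 1000)
dense = chars_per_image_token(6_000, 1500, 1000)
print(page_tokens, round(sparse, 2), round(dense, 2))
```

Under this assumption, the only way to raise the characters carried per image token is to render more text onto the same canvas, which is why dense content such as code or JSON benefits most.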
pxpipe puts this into practice as a local proxy. It intercepts requests from Claude Code and renders the bulky, static parts as images, including system prompts, tool documentation, and older chat history. Recent messages and model outputs pass through as normal text. The image below shows what the model actually sees: around 48,000 characters of system prompt and tool documentation squeezed onto a single densely packed PNG page. As text, that would cost about 25,000 tokens. As an image, it costs roughly 2,700.
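The split between rendered and pass-through content can be sketched as follows. This is a minimal illustration of the idea, not pxpipe's actual code: the `Message` type, the `split_for_rendering` function, and the `keep_recent` parameter are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str


def split_for_rendering(system_prompt, tool_docs, messages, keep_recent=4):
    """Return (text_to_render_as_image, messages_kept_as_text).

    keep_recent is a hypothetical knob: how many trailing messages stay text.
    """
    if keep_recent > 0:
        older, recent = messages[:-keep_recent], messages[-keep_recent:]
    else:
        older, recent = messages, []
    # Static, bulky parts go onto the page: system prompt, tool docs, old history.
    static_parts = [system_prompt, tool_docs]
    static_parts += [f"{m.role}: {m.content}" for m in older]
    return "\n\n".join(static_parts), recent


history = [Message("user", f"message {i}") for i in range(6)]
page_text, live = split_for_rendering("SYSTEM", "TOOLS", history, keep_recent=2)
print(len(page_text), [m.content for m in live])
```

A real proxy would then rasterize `page_text` into a PNG and forward it alongside the untouched recent messages; that rendering step is omitted here.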
According to developer Steven Chong, total savings average 59 to 70 percent. In one Fable 5 demo, session costs dropped from $42.21 to $6.06. If this somewhat exotic trick catches on, AI companies could respond by raising image processing prices.
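As a quick check on the figures quoted so far, the percentage reductions can be worked out directly. The helper name `percent_reduction` is illustrative, and the inputs are the numbers reported above.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage saved when a cost drops from `before` to `after`."""
    return (1 - after / before) * 100


page = percent_reduction(25_000, 2_700)    # single rendered page, in tokens
session = percent_reduction(42.21, 6.06)   # demo session cost, in dollars
print(f"{page:.1f}% {session:.1f}%")
```

Both the single-page example (about 89 percent) and the demo session (about 86 percent) sit above the reported 59 to 70 percent average.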
The approach has downsides. It's lossy, and exact strings like hashes can come back garbled when read from images. Processing is also slower because the model has to run the rendered images through a vision encoder instead of reading text directly.
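One conceivable mitigation for the garbled-string problem is to keep chunks that contain exact identifiers as plain text. The sketch below is a hypothetical heuristic, not something the source says pxpipe does. It flags long hexadecimal runs (MD5, SHA-1, and SHA-256 digest lengths) and UUIDs, and would miss short forms such as abbreviated Git hashes.

```python
import re

# Hypothetical heuristic: 32+ hex characters covers MD5, SHA-1 and SHA-256 digests.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{32,}\b")
# Canonical 8-4-4-4-12 UUID layout.
UUID = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")


def needs_exact_text(chunk: str) -> bool:
    """True if the chunk contains strings that must survive character for character."""
    return bool(HEX_RUN.search(chunk) or UUID.search(chunk))


sha256_of_empty = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
print(needs_exact_text(f"checksum: {sha256_of_empty}"))
print(needs_exact_text("The quick brown fox jumps over the lazy dog"))
```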
By default, pxpipe supports Claude Fable 5 and GPT 5.6. Benchmarks and evaluations are documented in the repository. Fable 5 hits 100 percent accuracy in benchmarks on math problems with fresh random numbers the model can't have memorized. According to Chong, Opus 4.7 and 4.8 misread about 7 percent of the rendered images, and GPT 5.5 also does worse with image context. These models are off by default and can only be enabled manually.
Feeding text to AI models as compressed images isn't a new idea. DeepSeek built an OCR system that processes text documents as images and, according to its technical paper, compresses them by up to a factor of ten while keeping 97 percent of the information.