DeepSeek has developed a novel approach to AI memory that stores information as images rather than text tokens, potentially solving the “context rot” problem that causes AI models to forget earlier parts of long conversations. The Chinese AI company’s optical character recognition (OCR) model demonstrates how visual tokens can pack more information into AI systems while using significantly fewer computational resources.
What you should know: DeepSeek’s OCR model performs comparably to leading systems on key benchmarks, but its real innovation lies in how it processes and stores information.
- Instead of breaking text into thousands of small tokens like traditional language models, the system converts written information into image form—essentially taking pictures of text pages.
- This visual approach allows the model to retain nearly the same information while using far fewer tokens, reducing computational costs and memory requirements.
- The model incorporates tiered compression similar to human memory, where older or less critical content is stored in slightly blurred form to save space while remaining accessible.
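The two ideas above, fewer tokens per page and tiered compression of older content, can be illustrated with back-of-the-envelope arithmetic. The following minimal Python sketch uses hypothetical placeholder numbers (characters per text token, vision tokens per page, and the tier schedule are all assumptions, not figures from DeepSeek's paper):

```python
# Illustrative sketch only: rough arithmetic for visual-token compression.
# All constants below are hypothetical placeholders, not published figures.

PAGE_CHARS = 3000             # a typical dense text page (assumption)
CHARS_PER_TEXT_TOKEN = 4      # common rule of thumb for subword tokenizers
VISION_TOKENS_PER_PAGE = 100  # assumed compressed visual representation

def tokens_for_page(age_in_turns: int, base: int = VISION_TOKENS_PER_PAGE) -> int:
    """Toy tiered compression: halve the token budget per tier of age,
    so older pages are stored in progressively 'blurrier' form."""
    tier = min(age_in_turns // 10, 3)  # one tier per 10 turns, capped
    return max(base >> tier, 16)       # floor so old pages stay accessible

text_tokens = PAGE_CHARS // CHARS_PER_TEXT_TOKEN
print(f"text tokens per page:   {text_tokens}")
print(f"vision tokens per page: {VISION_TOKENS_PER_PAGE}")
print(f"compression ratio:      {text_tokens / VISION_TOKENS_PER_PAGE:.1f}x")
print([tokens_for_page(a) for a in (0, 10, 25, 40)])  # [100, 50, 25, 16]
```

Under these assumed numbers, a page that would cost 750 text tokens fits in 100 vision tokens, and pages further back in the conversation shrink toward a small floor rather than being dropped entirely.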
Why this matters: The breakthrough could significantly reduce AI’s computational demands and carbon footprint while solving persistent memory problems in long conversations.
- Current AI models suffer from “context rot”—forgetting earlier information as conversations grow longer due to expensive token storage and processing.
- DeepSeek’s method could enable more efficient AI agents that remember context better and assist users more effectively over extended interactions.
- The technique also addresses the severe shortage of quality training data: the system can generate over 200,000 pages of it daily on a single GPU.
What experts are saying: Industry leaders are taking notice of this unconventional approach to AI architecture.
- “Images may ultimately be better than text as inputs for LLMs,” wrote Andrej Karpathy, former Tesla AI chief and OpenAI founding member, calling text tokens “wasteful and just terrible at the input.”
- “While the idea of using image-based tokens for context storage isn’t entirely new, this is the first study I’ve seen that takes it this far and shows it might actually work,” said Manling Li, assistant professor of computer science at Northwestern University.
The big picture: DeepSeek continues to challenge Western AI dominance with resource-efficient innovations, following its earlier release of DeepSeek-R1, an open-source reasoning model that rivaled leading systems despite using far fewer computing resources.
What’s next: Researchers see potential for visual tokens to extend beyond memory storage into reasoning capabilities, with future work exploring more dynamic memory systems that mirror human recall patterns—remembering life-changing moments while forgetting mundane details like yesterday’s lunch.