Yi-Coder is a Small But Mighty Open-Source LLM for Coding

Yi-Coder, a new series of open-source code language models, has emerged as a powerful tool for developers, offering state-of-the-art coding performance with fewer than 10 billion parameters.

Model overview and key features: Yi-Coder is available in two sizes—1.5B and 9B parameters—with both base and chat versions designed for efficient inference and flexible training.

  • The models are trained on 2.4 trillion high-quality tokens sourced from GitHub repositories and code-related data filtered from CommonCrawl.
  • Yi-Coder supports a maximum context window of 128K tokens, enabling project-level code comprehension and generation.
  • The 9B-parameter version outperforms similarly sized models and rivals much larger ones, such as DeepSeek-Coder 33B, on certain tasks.

Performance in coding benchmarks: Yi-Coder demonstrates impressive results across various coding challenges and benchmarks.

  • In the LiveCodeBench evaluation, Yi-Coder-9B-Chat achieved a 23.4% pass rate, surpassing models with significantly more parameters.
  • The model excelled in popular benchmarks like HumanEval (85.4% pass rate), MBPP (73.8% pass rate), and CRUXEval-O (over 50% accuracy).
  • Yi-Coder-9B consistently outperformed larger models in code editing tasks across debugging, translation, language switching, and code polishing.
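The pass rates above follow the standard HumanEval-style pass@k methodology: sample n solutions per problem, count how many pass the unit tests, and estimate the probability that at least one of k samples succeeds. A minimal sketch of the usual unbiased estimator (general evaluation math, not Yi-Coder-specific code):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn from n generations of which c pass, is correct."""
    if n - c < k:
        return 1.0  # too few failures for k samples to all fail
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 generations for one problem, 40 pass the unit tests.
print(round(pass_at_k(200, 40, 1), 3))  # 0.2 (for k=1 this equals c/n)
```

For k=1 the estimator reduces to the plain fraction of passing samples, which is what single-sample benchmark numbers like 85.4% on HumanEval report.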

Code completion and long-context modeling: Yi-Coder shows strong capabilities in cross-file code completion and long-context understanding.

  • The model outperformed similar-scale competitors in both retrieval and non-retrieval scenarios for Python and Java in the CrossCodeEval benchmark.
  • Yi-Coder-9B successfully completed the “Needle in the code” task, demonstrating its ability to extract key information from sequences up to 128K tokens long.
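A "Needle in the code" probe plants a small marker function inside a very long stretch of filler code and asks the model to recall it. A rough, hypothetical sketch of how such a haystack might be constructed (the function names and filler are invented for illustration; this is not the benchmark's actual harness):

```python
import random

# The "needle": a distinctive function buried in the filler code.
NEEDLE = "def secret_checksum():\n    return 0x5EED\n"

def build_haystack(target_chars: int = 200_000, seed: int = 0) -> str:
    """Concatenate trivial filler functions up to ~target_chars
    characters, planting the needle at a random position."""
    rng = random.Random(seed)
    chunks, size, i = [], 0, 0
    while size < target_chars:
        filler = f"def helper_{i}(x):\n    return x + {i}\n\n"
        chunks.append(filler)
        size += len(filler)
        i += 1
    chunks.insert(rng.randrange(len(chunks)), NEEDLE + "\n")
    return "".join(chunks)

haystack = build_haystack()
prompt = haystack + "\n# Question: what does secret_checksum() return?\n"
```

The model passes if it answers from the buried definition; scaling `target_chars` toward the 128K-token limit tests retrieval across the full context window.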

Mathematical reasoning prowess: Yi-Coder exhibits enhanced mathematical problem-solving abilities through programming.

  • When evaluated on seven math reasoning benchmarks using program-aided settings, Yi-Coder-9B achieved an impressive 70.3% average accuracy.
  • This performance surpassed the larger DeepSeek-Coder-33B model, which scored 65.8% in the same evaluation.
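In a program-aided setting, the model writes a short program for each math problem and the program's execution result is taken as its answer. A minimal sketch of what such a scoring loop looks like (the model call is replaced by a stub, and the harness shape is an assumption, not Yi-Coder's actual evaluation code):

```python
def run_generated_program(code: str) -> object:
    """Execute model-emitted code in an isolated namespace and return
    the value it stores in `answer`. Real harnesses sandbox this step."""
    namespace: dict = {}
    exec(code, namespace)
    return namespace.get("answer")

# Stub standing in for a model generation on one word problem.
generated = """
apples = 23 - 20 + 6   # started with 23, used 20, bought 6 more
answer = apples
"""

def score(generated_code: str, reference: object) -> bool:
    """A problem counts as solved if the program's answer matches."""
    return run_generated_program(generated_code) == reference

print(score(generated, 9))  # True: 23 - 20 + 6 == 9
```

Offloading arithmetic to executed code is what lets a strong code model beat larger general models on math benchmarks: the model only has to translate the problem correctly, not compute the result itself.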

Broader implications for AI-powered development: Yi-Coder’s impressive performance in various coding tasks, despite its relatively small size, signals a potential shift in the landscape of AI-assisted software development.

  • The model’s ability to handle long contexts and excel in cross-file code completion could significantly enhance developer productivity in real-world scenarios.
  • Yi-Coder’s strong performance in mathematical reasoning through programming highlights the growing synergy between coding and problem-solving in AI models.
  • As part of the open-source Yi family, Yi-Coder’s accessibility may accelerate the adoption of AI-powered coding tools across the development community.
