
Yi-Coder, a new series of open-source code language models, has emerged as a powerful tool for developers, offering state-of-the-art coding performance with fewer than 10 billion parameters.

Model overview and key features: Yi-Coder is available in two sizes (1.5B and 9B parameters), each with base and chat versions designed for efficient inference and flexible training.

  • The models were trained on 2.4 trillion high-quality tokens sourced from GitHub repositories and filtered code-related data from CommonCrawl.
  • Yi-Coder supports a maximum context window of 128K tokens, enabling project-level code comprehension and generation.
  • The 9B parameter version outperforms similar-sized models and even rivals some larger models like DeepSeek-Coder 33B in certain tasks.

Performance in coding benchmarks: Yi-Coder posts strong results across several coding benchmarks.

  • In the LiveCodeBench evaluation, Yi-Coder-9B-Chat achieved a 23.4% pass rate, surpassing models with significantly more parameters.
  • The model excelled in popular benchmarks like HumanEval (85.4% pass rate), MBPP (73.8% pass rate), and CRUXEval-O (over 50% accuracy).
  • Yi-Coder-9B consistently outperformed larger models in code editing tasks across debugging, translation, language switching, and code polishing.
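
The article does not say how these pass rates were computed. A common convention for benchmarks such as HumanEval is the unbiased pass@k estimator, so the sketch below assumes that convention. The function names and the per-problem counts are made up for illustration and are not Yi-Coder results.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: samples generated, c: samples that passed every test, k: sample budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n samples contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)

def benchmark_pass_rate(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)

# Hypothetical outcomes: (samples drawn, samples passing) for four problems.
toy_results = [(10, 10), (10, 5), (10, 0), (10, 2)]
print(f"pass@1 = {benchmark_pass_rate(toy_results, k=1):.3f}")
```

With k = 1 the estimator reduces to the plain fraction of passing samples per problem, which is the usual reading of a headline figure such as "85.4% pass rate".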

Code completion and long-context modeling: Yi-Coder performs well on cross-file code completion and long-context understanding.

  • The model outperformed similar-scale competitors in both retrieval and non-retrieval scenarios for Python and Java in the CrossCodeEval benchmark.
  • Yi-Coder-9B successfully completed the “Needle in the code” task, demonstrating its ability to extract key information from sequences up to 128K tokens long.
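
The article does not describe how the "Needle in the code" task is set up. As a rough sketch of the general idea, the harness below hides one short fact inside a long run of filler code and checks whether a model can return it. Every name here is hypothetical: `ask_model` stands in for a real LLM call, `stub_model` is a regex-based substitute so the example runs offline, and length is counted in filler functions rather than tokens.

```python
import random
import re
from typing import Callable

def build_haystack(needle: str, n_filler: int, depth: float, seed: int = 0) -> str:
    """Insert `needle` at a relative depth (0.0 = start, 1.0 = end) among filler functions."""
    rng = random.Random(seed)
    filler = [
        f"def helper_{i}(x):\n    return x + {rng.randint(1, 99)}\n"
        for i in range(n_filler)
    ]
    position = round(depth * n_filler)
    return "\n".join(filler[:position] + [needle] + filler[position:])

def run_needle_test(
    ask_model: Callable[[str], str],
    secret: str,
    n_filler: int = 2000,
    depths: tuple[float, ...] = (0.0, 0.5, 1.0),
) -> dict[float, bool]:
    """Return, for each insertion depth, whether the model recovered the secret."""
    needle = f'# NOTE: the deployment key is "{secret}"\n'
    question = "\n\n# Question: what is the deployment key? Answer with the key only."
    results = {}
    for depth in depths:
        prompt = build_haystack(needle, n_filler, depth) + question
        results[depth] = ask_model(prompt).strip() == secret
    return results

def stub_model(prompt: str) -> str:
    """Offline stand-in for a real model: finds the key with a regex."""
    match = re.search(r'deployment key is "([^"]+)"', prompt)
    return match.group(1) if match else ""

print(run_needle_test(stub_model, secret="yi-1234"))
```

A real evaluation would replace `stub_model` with a call to the model under test and scale the filler until the prompt approaches the context limit.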

Mathematical reasoning prowess: Yi-Coder exhibits enhanced mathematical problem-solving abilities through programming.

  • When evaluated on seven math reasoning benchmarks using program-aided settings, Yi-Coder-9B achieved 70.3% average accuracy.
  • This performance surpassed the larger DeepSeek-Coder-33B model, which scored 65.8% in the same evaluation.
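
In a program-aided setting, the model writes a short program and the graded answer is whatever that program computes, rather than a number stated in prose. The sketch below shows one way such grading could work. It assumes the convention that the generated code defines a `solution()` function, and `generated_program` is a hand-written stand-in, not actual Yi-Coder output.

```python
def run_generated_program(source: str) -> object:
    """Execute model-written code and return the result of its solution().

    Warning: exec() on untrusted model output is unsafe; a real harness
    would run this inside a sandbox with a timeout.
    """
    namespace: dict = {}
    exec(source, namespace)
    return namespace["solution"]()

def is_correct(source: str, expected: float, tol: float = 1e-6) -> bool:
    """Grade one problem by comparing the program's output to the reference answer."""
    try:
        answer = float(run_generated_program(source))
    except Exception:
        return False  # crashes and a missing solution() both count as wrong
    return abs(answer - expected) <= tol

# Hand-written stand-in for a model response to a word problem.
generated_program = '''
def solution():
    # A train travels at 60 km/h for 2.5 hours, then at 80 km/h for 1.5 hours.
    return 60 * 2.5 + 80 * 1.5
'''

print(is_correct(generated_program, expected=270.0))
```

Average accuracy over a benchmark is then the fraction of problems for which `is_correct` returns `True`.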

Broader implications for AI-powered development: Yi-Coder’s strong performance across coding tasks, despite its relatively small size, signals a potential shift in the landscape of AI-assisted software development.

  • The model’s ability to handle long contexts and excel in cross-file code completion could significantly enhance developer productivity in real-world scenarios.
  • Yi-Coder’s strong performance in mathematical reasoning through programming highlights the growing synergy between coding and problem-solving in AI models.
  • As part of the open-source Yi family, Yi-Coder’s accessibility may accelerate the adoption of AI-powered coding tools across the development community.
Meet Yi-Coder: A Small but Mighty LLM for Code
