Yi-Coder is a Small But Mighty Open-Source LLM for Coding

Yi-Coder, a new series of open-source code language models, has emerged as a powerful tool for developers, offering state-of-the-art coding performance with fewer than 10 billion parameters.

Model overview and key features: Yi-Coder is available in two sizes—1.5B and 9B parameters—with both base and chat versions designed for efficient inference and flexible training.

  • The models are trained on 2.4 trillion high-quality tokens drawn from GitHub repositories and code-related data filtered from CommonCrawl.
  • Yi-Coder supports a maximum context window of 128K tokens, enabling project-level code comprehension and generation.
  • The 9B parameter version outperforms similar-sized models and even rivals some larger models like DeepSeek-Coder 33B in certain tasks.

Performance in coding benchmarks: Yi-Coder demonstrates impressive results across various coding challenges and benchmarks.

  • In the LiveCodeBench evaluation, Yi-Coder-9B-Chat achieved a 23.4% pass rate, surpassing models with significantly more parameters.
  • The model excelled in popular benchmarks like HumanEval (85.4% pass rate), MBPP (73.8% pass rate), and CRUXEval-O (over 50% accuracy).
  • Yi-Coder-9B consistently outperformed larger models in code editing tasks across debugging, translation, language switching, and code polishing.
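Pass rates on benchmarks like HumanEval and MBPP are typically reported via the pass@k metric: sample n completions per problem, count how many pass the unit tests, and estimate the probability that at least one of k samples passes. A minimal sketch of the standard unbiased estimator (the function name is illustrative; the source doesn't specify Yi-Coder's exact sampling setup):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: n samples per problem, c of them correct."""
    if n - c < k:
        # Too few incorrect samples to fill a size-k subset: success is certain.
        return 1.0
    # Probability that a random size-k subset contains no correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples per problem, 4 pass the tests -> estimated pass@1
print(pass_at_k(10, 4, 1))  # 0.4
```

With k=1 this reduces to the plain fraction of passing samples, which is why single-sample benchmark scores are often quoted simply as "pass rate."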

Code completion and long-context modeling: Yi-Coder shows strong capabilities in cross-file code completion and long-context understanding.

  • The model outperformed similar-scale competitors in both retrieval and non-retrieval scenarios for Python and Java in the CrossCodeEval benchmark.
  • Yi-Coder-9B successfully completed the “Needle in the code” task, demonstrating its ability to extract key information from sequences up to 128K tokens long.
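The article doesn't detail the harness, but a "needle in the code" evaluation is typically built by burying one distinctive function inside a long stretch of filler code and asking the model to recall it. A minimal sketch of constructing such a prompt (function names and sizes are illustrative):

```python
import random

def build_haystack(needle: str, n_filler: int = 3000, seed: int = 0) -> str:
    """Bury `needle` at a random depth among trivial filler functions."""
    rng = random.Random(seed)
    filler = [f"def filler_{i}(x):\n    return x + {i}\n" for i in range(n_filler)]
    filler.insert(rng.randrange(len(filler)), needle)
    return "\n".join(filler)

needle = "def secret_checksum(data):\n    return sum(data) % 9973\n"
context = build_haystack(needle)
print(needle in context)       # True: the needle is embedded verbatim
print(len(context) > 100_000)  # True: roughly project-scale context
```

The model is then prompted with the full haystack plus a question such as "what does secret_checksum return?", testing retrieval rather than generation.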

Mathematical reasoning prowess: Yi-Coder exhibits enhanced mathematical problem-solving abilities through programming.

  • When evaluated on seven math reasoning benchmarks using program-aided settings, Yi-Coder-9B achieved an impressive 70.3% average accuracy.
  • This performance surpassed the larger DeepSeek-Coder-33B model, which scored 65.8% in the same evaluation.
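In a program-aided setting, the model answers a math word problem by emitting code rather than prose; the harness executes that code and compares the result to the reference answer. A minimal sketch of the execution side, with a hand-written stand-in for a model completion (the harness details are an assumption, not Yi-Coder's published setup):

```python
def run_generated_program(code: str) -> object:
    """Execute model-generated code in a scratch namespace, return `answer`."""
    namespace: dict = {}
    exec(code, namespace)  # real harnesses sandbox and time-limit this step
    return namespace.get("answer")

# Stand-in completion for: "A train travels 60 km/h for 2.5 hours. How far?"
generated = """
speed_kmh = 60
hours = 2.5
answer = speed_kmh * hours
"""

print(run_generated_program(generated))  # 150.0
```

Offloading the arithmetic to an interpreter is what lets a strong coding model outscore larger general models on math benchmarks.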

Broader implications for AI-powered development: Yi-Coder’s impressive performance in various coding tasks, despite its relatively small size, signals a potential shift in the landscape of AI-assisted software development.

  • The model’s ability to handle long contexts and excel in cross-file code completion could significantly enhance developer productivity in real-world scenarios.
  • Yi-Coder’s strong performance in mathematical reasoning through programming highlights the growing synergy between coding and problem-solving in AI models.
  • As part of the open-source Yi family, Yi-Coder’s accessibility may accelerate the adoption of AI-powered coding tools across the development community.
