Yi-Coder is a Small But Mighty Open-Source LLM for Coding

Yi-Coder, a new series of open-source code language models from 01.AI, has emerged as a powerful tool for developers, offering state-of-the-art coding performance with fewer than 10 billion parameters.

Model overview and key features: Yi-Coder comes in two sizes, 1.5B and 9B parameters, with both base and chat versions designed for efficient inference and flexible training; a minimal usage sketch follows this list.

  • The models are trained on 2.4 trillion high-quality tokens sourced from GitHub repositories and code-related data filtered from CommonCrawl.
  • Yi-Coder supports a maximum context window of 128K tokens, enabling project-level code comprehension and generation.
  • The 9B-parameter version outperforms similarly sized models and even rivals some larger models, such as DeepSeek-Coder 33B, on certain tasks.
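
The chat variants follow the standard Hugging Face transformers workflow. Below is a minimal sketch of running Yi-Coder-9B-Chat locally; the 01-ai/Yi-Coder-9B-Chat model ID and chat-template support are assumptions based on how the Yi family is typically published on the Hugging Face Hub.

```python
# Minimal sketch: running Yi-Coder-9B-Chat via Hugging Face transformers.
# The model ID below is an assumption about how the checkpoint is hosted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "01-ai/Yi-Coder-9B-Chat"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Write a Python function that checks whether a string is a palindrome."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```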

Performance in coding benchmarks: Yi-Coder demonstrates impressive results across various coding challenges and benchmarks.

  • In the LiveCodeBench evaluation, Yi-Coder-9B-Chat achieved a 23.4% pass rate, surpassing models with significantly more parameters.
  • The model also excelled on popular benchmarks such as HumanEval (85.4% pass rate), MBPP (73.8% pass rate), and CRUXEval-O (over 50% accuracy); a toy sketch of how such pass rates are scored appears after this list.
  • Yi-Coder-9B consistently outperformed larger models in code editing tasks across debugging, translation, language switching, and code polishing.
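
For context on what those pass rates measure: HumanEval-style benchmarks count a problem as solved only if the model's completion passes the task's unit tests. The toy sketch below is greatly simplified; real harnesses sandbox execution and aggregate pass@k over many sampled completions.

```python
# Toy sketch of a HumanEval-style pass check. A completion "passes" only
# if the assembled program runs the task's unit tests without raising.
# Real harnesses sandbox this execution; exec() here is unsafe in general.
def passes_tests(prompt: str, completion: str, test_code: str) -> bool:
    program = prompt + completion + "\n" + test_code
    try:
        exec(program, {"__name__": "__main__"})
        return True
    except Exception:
        return False

def pass_rate(results: list[bool]) -> float:
    """Percentage of problems whose completion passed its tests."""
    return 100.0 * sum(results) / len(results)
```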

Code completion and long-context modeling: Yi-Coder shows strong capabilities in cross-file code completion and long-context understanding.

  • The model outperformed similar-scale competitors in both retrieval and non-retrieval scenarios for Python and Java in the CrossCodeEval benchmark; a sketch of a typical cross-file prompt follows this list.
  • Yi-Coder-9B successfully completed the “Needle in the code” task, demonstrating its ability to extract key information from sequences up to 128K tokens long.
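
Cross-file completion benchmarks such as CrossCodeEval generally work by packing retrieved snippets from other files into the prompt ahead of the cursor position. The format below is purely illustrative, not Yi-Coder's documented prompt convention; it only shows the general shape of such an input.

```python
# Illustrative only: the general shape of a cross-file completion prompt.
# Yi-Coder's exact prompt format is not documented here; this just shows
# how retrieved context from sibling files can be packed before the cursor.
def build_crossfile_prompt(context_files: dict[str, str],
                           current_path: str,
                           code_before_cursor: str) -> str:
    parts = []
    for path, snippet in context_files.items():
        parts.append(f"# File: {path}\n{snippet}\n")  # retrieved context
    parts.append(f"# File: {current_path}\n{code_before_cursor}")
    return "\n".join(parts)

prompt = build_crossfile_prompt(
    {"utils/math_ops.py": "def gcd(a, b):\n    while b:\n        a, b = b, a % b\n    return a"},
    "main.py",
    "from utils.math_ops import gcd\n\ndef lcm(a, b):\n    return ",
)
# The model is then asked to continue generating from the cursor position.
```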

Mathematical reasoning prowess: Yi-Coder exhibits enhanced mathematical problem-solving abilities through programming.

  • When evaluated on seven math reasoning benchmarks in program-aided settings, where the model writes a program whose execution produces the answer (sketched after this list), Yi-Coder-9B achieved an impressive 70.3% average accuracy.
  • This performance surpassed the larger DeepSeek-Coder-33B model, which scored 65.8% in the same evaluation.
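
In a program-aided setting the model is not asked for the final number directly: it writes a short program, and the harness executes it and scores the printed output against the reference answer. A hedged sketch of that loop, where generate_code is a placeholder for whatever inference API serves the model:

```python
import contextlib
import io
from typing import Callable

# Sketch of one program-aided (PAL-style) evaluation step: the model emits
# a program, the harness runs it, and the printed output is the answer.
# `generate_code` is a placeholder, not a real Yi-Coder API.
def solve_with_program(question: str,
                       generate_code: Callable[[str], str]) -> str:
    prompt = (
        "Write a Python program that prints only the final answer.\n"
        f"Question: {question}\n"
    )
    program = generate_code(prompt)  # model-written Python source
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(program, {})  # unsafe outside a sandbox
    return buffer.getvalue().strip()
```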

Broader implications for AI-powered development: Yi-Coder’s impressive performance in various coding tasks, despite its relatively small size, signals a potential shift in the landscape of AI-assisted software development.

  • The model’s ability to handle long contexts and excel in cross-file code completion could significantly enhance developer productivity in real-world scenarios.
  • Yi-Coder’s strong performance in mathematical reasoning through programming highlights the growing synergy between coding and problem-solving in AI models.
  • As part of the open-source Yi family, Yi-Coder’s accessibility may accelerate the adoption of AI-powered coding tools across the development community.
