
What does it do?

  • Code Generation
  • Programming Assistance
  • Large Language Models
  • GitHub Integration
  • Technical Assistance

Details & Features

  • Made By
    Hugging Face
  • Released On
    2016-07-10

StarCoder and StarCoderBase are Large Language Models for Code (Code LLMs) trained on permissively licensed data from GitHub. The training data covers more than 80 programming languages, along with Git commits, GitHub issues, and Jupyter notebooks. Following an approach similar to LLaMA's, StarCoderBase is a model of approximately 15 billion parameters trained on 1 trillion tokens. StarCoderBase was then fine-tuned on 35 billion Python tokens, producing the model known as StarCoder.

According to the developers' evaluation, StarCoderBase outperformed the open Code LLMs available at its release on popular programming benchmarks. It also matched or surpassed closed models such as OpenAI's code-cushman-001, the original Codex model used in early versions of GitHub Copilot.

The StarCoder models have several key features:
- They can process a context of more than 8,000 tokens, which at release was longer than that of any other open LLM.
- They support a range of applications. For example, when prompted with a series of dialogues, they can act as a technical assistant.
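
The two features above can be combined in a small sketch: assembling a dialogue-style prompt and dropping the oldest turns when it would exceed the context budget. The `Human:` / `Assistant:` labels, the function names, and the whitespace-based token count are all assumptions made for illustration. A real setup would count tokens with the model's own tokenizer and might use a different dialogue format.

```python
from typing import List, Tuple

# Assumed budget; the description above cites a context length of over 8,000 tokens.
CONTEXT_LIMIT = 8000


def approx_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_prompt(
    system: str,
    turns: List[Tuple[str, str]],
    question: str,
    budget: int = CONTEXT_LIMIT,
) -> str:
    """Assemble a dialogue prompt, dropping the oldest turns until it fits the budget."""
    kept = list(turns)
    while True:
        parts = [system]
        for human, assistant in kept:
            parts.append(f"Human: {human}")
            parts.append(f"Assistant: {assistant}")
        # End with an open assistant turn for the model to complete.
        parts.append(f"Human: {question}")
        parts.append("Assistant:")
        prompt = "\n\n".join(parts)
        # Stop once the prompt fits, or when there is no history left to drop.
        if approx_tokens(prompt) <= budget or not kept:
            return prompt
        kept.pop(0)
```

Calling `build_prompt` with a short introduction, a list of earlier question and answer pairs, and a new question returns one string ending in `Assistant:`, which would then be passed to the model as its prompt.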

Developers seeking a powerful LLM for code generation may find StarCoder to be a useful tool.
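
As a rough illustration of code generation through the Hugging Face ecosystem, the sketch below loads a checkpoint with the `transformers` library and completes a prompt. The checkpoint name `bigcode/starcoder`, the function name `complete_code`, and the default of 64 new tokens are assumptions, not details from this page. Access to the checkpoint may require accepting its license on the Hugging Face Hub, and a model of this size needs substantial memory.

```python
def complete_code(
    prompt: str,
    checkpoint: str = "bigcode/starcoder",  # assumed Hub checkpoint name
    max_new_tokens: int = 64,
) -> str:
    """Return the prompt followed by the model's generated continuation."""
    # Imported lazily so the heavy dependency is only needed when generating.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    # Encode the prompt as PyTorch tensors and let the model continue it.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `complete_code("def fibonacci(n):")` would download the checkpoint on first use and return the prompt extended with generated code.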

  • Supported ecosystems
    Hugging Face

Alternatives

GitHub Copilot generates code suggestions in real-time to enhance developer productivity.
Augment is an AI-powered coding assistant that enhances software development efficiency and quality.
Amazon Q Developer is an AI-powered coding assistant that enhances software development and infrastructure management for AWS developers.
Gemini Code Assist is an AI coding assistant that boosts developer productivity with code completions and generation.
PolyCoder generates code snippets and assists with code understanding across 12 programming languages.
CodeT5 and CodeT5+ are open-source language models that automate coding tasks for developers.
Bloop converts legacy COBOL code into modern, readable Java code with identical behavior using AI.
Cody is an AI coding assistant that enhances developer productivity by providing advanced code search, understanding, and generation capabilities.
CodeLlama-34b-Instruct-hf is an AI model for code synthesis and understanding, ideal for developers and researchers.
EasyCode is an AI-powered coding assistant that provides context-aware suggestions to enhance developer productivity.