AMD releases AMD-135M, its first open-source small language model

AMD’s Foray into Small Language Models: AMD has unveiled its first small language model (SLM), AMD-135M, marking a significant step in the company’s artificial intelligence initiatives.

  • AMD-135M is part of the Llama family of models and was trained from scratch on AMD Instinct™ MI250 accelerators.
  • The model comes in two variants: AMD-Llama-135M for general use and AMD-Llama-135M-code, fine-tuned for code-related tasks (a minimal loading sketch follows this list).
  • This release aligns with AMD’s commitment to an open approach to AI, aiming to foster inclusive, ethical, and innovative technological progress.
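
For readers who want to try the two variants, here is a minimal loading sketch using Hugging Face transformers. The repository IDs "amd/AMD-Llama-135M" and "amd/AMD-Llama-135M-code" are assumptions based on the Hugging Face model card referenced later in this article.

```python
# Minimal sketch: load one of the published variants and generate text.
# The repository ID below is an assumption, not confirmed by this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amd/AMD-Llama-135M"  # or "amd/AMD-Llama-135M-code" for the code variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Small language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```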

Training Process and Specifications: The development of AMD-135M involved substantial computational resources and time investment to create a capable small language model.

  • AMD-Llama-135M was trained on 670 billion tokens of general data over six days using four MI250 nodes (a back-of-envelope throughput calculation follows this list).
  • The code-specific variant, AMD-Llama-135M-code, underwent additional fine-tuning with 20 billion tokens of code data, taking four days on the same hardware.
  • AMD has open-sourced the training code, dataset, and weights, enabling developers to reproduce the model and contribute to future SLM and LLM development.
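
As a sanity check on those figures, the implied training throughput can be estimated from the numbers quoted above alone; this is a rough back-of-envelope calculation, not an AMD-published metric.

```python
# Implied throughput from the stated figures: 670B tokens, 6 days, 4 MI250 nodes.
tokens = 670e9
seconds = 6 * 24 * 3600
nodes = 4

tokens_per_second = tokens / seconds                     # ~1.29 million tokens/s overall
tokens_per_second_per_node = tokens_per_second / nodes   # ~323,000 tokens/s per node
print(f"{tokens_per_second:,.0f} tokens/s total, "
      f"{tokens_per_second_per_node:,.0f} tokens/s per node")
```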

Innovative Optimization Techniques: AMD has implemented speculative decoding to enhance the performance of its small language model, addressing key limitations in traditional language model inference.

  • Speculative decoding uses a small draft model to generate candidate tokens, which are then verified by the larger target model.
  • Because the target model can verify all of the drafted tokens in a single forward pass, the approach yields multiple tokens per pass, reducing memory-access overhead and improving inference speed (a simplified sketch follows this list).
  • This addresses a core limitation of conventional autoregressive decoding in large language models, which produces only one token per forward pass.
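
The draft-and-verify loop can be illustrated with a short sketch. This is a simplified greedy version of the idea (production implementations typically use probabilistic acceptance of draft tokens), not AMD's implementation; it assumes `draft` and `target` are two Hugging Face causal language models that share a tokenizer.

```python
# Simplified greedy speculative decoding: the draft model proposes k tokens,
# the target verifies them all in one forward pass, and the agreeing prefix is kept.
import torch

@torch.no_grad()
def speculative_step(draft, target, input_ids, k=4):
    # 1. The cheap draft model proposes k candidate tokens autoregressively.
    candidate = input_ids
    for _ in range(k):
        logits = draft(candidate).logits[:, -1, :]
        candidate = torch.cat([candidate, logits.argmax(-1, keepdim=True)], dim=-1)

    # 2. The target model scores all k candidates in a single forward pass.
    target_logits = target(candidate).logits
    prompt_len = input_ids.shape[1]
    preds = target_logits[:, prompt_len - 1:-1, :].argmax(-1)  # target's greedy choices
    drafted = candidate[:, prompt_len:]

    # 3. Keep the longest prefix where draft and target agree, then append the
    #    target's own next token, so every step yields at least one new token.
    n_accept = 0
    for i in range(k):
        if preds[0, i] != drafted[0, i]:
            break
        n_accept += 1
    if n_accept < k:
        correction = preds[:, n_accept:n_accept + 1]
    else:
        correction = target_logits[:, -1, :].argmax(-1, keepdim=True)
    return torch.cat([input_ids, drafted[:, :n_accept], correction], dim=-1)
```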

Performance Improvements: Initial tests of AMD-135M demonstrate notable performance gains when used in conjunction with larger models.

  • AMD-Llama-135M-code was used as a draft model for CodeLlama-7b to compare inference performance with and without speculative decoding (a usage sketch follows this list).
  • Tests were conducted on both the MI250 accelerator for data centers and the Ryzen™ AI processor (with NPU) for AI PCs.
  • Significant speedups were observed on the Instinct MI250 accelerator, Ryzen AI CPU, and Ryzen AI NPU compared to inference without speculative decoding.
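
The same pairing can be sketched with the off-the-shelf assisted-generation path in Hugging Face transformers (the assistant_model argument to generate), which implements draft-and-verify decoding. The checkpoint IDs are assumptions, and the draft and target must share a compatible tokenizer for this to work; this illustrates the setup rather than reproducing AMD's benchmark code.

```python
# Hypothetical draft/target pairing via transformers assisted generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "codellama/CodeLlama-7b-hf"  # assumed target checkpoint
draft_id = "amd/AMD-Llama-135M-code"     # assumed draft checkpoint

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id, torch_dtype=torch.float16, device_map="auto")
draft = AutoModelForCausalLM.from_pretrained(draft_id, torch_dtype=torch.float16, device_map="auto")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(target.device)

# Baseline: ordinary autoregressive decoding, one token per forward pass.
baseline = target.generate(**inputs, max_new_tokens=128)

# Speculative decoding: the 135M model drafts, the 7B model verifies.
assisted = target.generate(**inputs, max_new_tokens=128, assistant_model=draft)
print(tokenizer.decode(assisted[0], skip_special_tokens=True))
```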

Implications for AI Development: The release of AMD-135M represents more than just a new model; it signifies AMD’s growing role in the AI ecosystem and its potential impact on future developments.

  • The model establishes an end-to-end workflow for both training and inferencing on select AMD platforms.
  • By open-sourcing the implementation, AMD is encouraging innovation and collaboration within the AI community.
  • This approach could lead to more rapid advancements in AI technology and potentially more diverse applications of small language models.

Future Outlook and Resources: AMD’s release of AMD-135M is accompanied by a suite of resources and opportunities for developers and researchers to engage with the technology.

  • A full technical blog post is available for those seeking more in-depth information about AMD-135M.
  • AMD has provided access to the code through its GitHub repository and to the model files via a Hugging Face model card.
  • Developers can apply for access to Instinct accelerator cards on the AMD Developer Cloud, facilitating further experimentation and development.

Analyzing Deeper: While AMD’s entry into the small language model space is promising, its long-term impact remains to be seen. If AMD-135M is widely adopted and performs well in real-world applications, it could challenge the dominance of larger tech companies in the AI model landscape. As AI technology continues to evolve rapidly, AMD will also need to maintain a consistent pace of innovation to stay competitive in this fast-moving field.

