xAI Doubles Grok-2 Speed with Innovative Code Rewrite

Grok-2 and Grok-2 mini receive significant speed boost: xAI, Elon Musk’s artificial intelligence company, has dramatically improved the performance of its large language models through a complete rewrite of the inference code stack.

  • Two xAI developers, Lianmin Zheng and Saeed Maleki, rewrote the inference stack in just three days using SGLang, an open-source system for executing complex language model programs.
  • The update resulted in Grok-2 mini becoming twice as fast as its previous version, while also enabling the larger Grok-2 model to run at a reasonable speed across multiple hosts.
  • Both models experienced slight improvements in accuracy alongside their speed enhancements.

Technical details of the upgrade: The developers leveraged SGLang, a highly efficient system for language model applications, to achieve the performance boost.

  • SGLang, developed by researchers from several universities, offers up to 6.4 times higher throughput compared to existing systems.
  • The system supports various models, including Llama, Mistral, and LLaVA, and is compatible with both open-weight and API-based models like OpenAI’s GPT-4.
  • SGLang optimizes execution through automatic cache reuse and parallelism within a single program, making it particularly useful for large-scale language models.
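
The cache-reuse idea in the last bullet can be illustrated with a toy example. The sketch below is not SGLang code and does not use its API. `ToyPrefixCache` and its `process` method are hypothetical names invented here, and whitespace-split words stand in for real tokens. It only shows the general principle: when two requests share a prefix (for example a common system prompt), the work for that prefix can be done once and reused.

```python
from typing import List, Set, Tuple


class ToyPrefixCache:
    """Toy stand-in for a prefix-keyed KV cache (hypothetical, not SGLang's implementation)."""

    def __init__(self) -> None:
        # Every token prefix that has already been "computed".
        self._seen: Set[Tuple[str, ...]] = set()

    def process(self, tokens: List[str]) -> Tuple[int, int]:
        """Return (reused, computed) token counts for one request."""
        reused = 0
        # Find the longest prefix of this request that is already cached.
        for end in range(len(tokens), 0, -1):
            if tuple(tokens[:end]) in self._seen:
                reused = end
                break
        # "Compute" the remaining tokens and cache each new prefix.
        for end in range(reused + 1, len(tokens) + 1):
            self._seen.add(tuple(tokens[:end]))
        return reused, len(tokens) - reused


cache = ToyPrefixCache()
system = "You are a helpful assistant .".split()  # shared 6-token prefix

first = cache.process(system + "What is 2 + 2 ?".split())
second = cache.process(system + "Name a prime number .".split())

print(first)   # (0, 12): nothing cached yet, all 12 tokens computed
print(second)  # (6, 5): the 6 shared tokens are reused, 5 new ones computed
```

A production system would store attention key/value tensors rather than a set of tuples and would need an eviction policy; the toy version keeps only the bookkeeping needed to show the reuse.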

Performance rankings and capabilities: The recent upgrades have significantly improved Grok-2’s standing in third-party performance evaluations.

  • Grok-2 has secured the second position on the LMSYS Chatbot Arena leaderboard with an Arena Score of 1293, based on 6,686 votes.
  • This places Grok-2 in a tie with Google’s Gemini-1.5 Pro model, just behind the latest version of OpenAI’s ChatGPT-4o.
  • Grok-2 mini has also climbed to the fifth position with an Arena Score of 1268 from 7,266 votes.
  • The main Grok-2 model particularly excels in mathematical tasks, where it ranks first, and performs strongly in categories such as Hard Prompts, Coding, and Instruction-following.
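
To give a rough sense of what a 25-point gap between the two Arena Scores means, the sketch below converts a score difference into an expected head-to-head win rate. It assumes the scores sit on an Elo-style logistic scale (base 10, scale 400); that assumption and the helper name `expected_win_rate` are introduced here for illustration and do not come from xAI or the leaderboard figures quoted above.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Expected probability that model A is preferred over model B.

    Assumes an Elo-style logistic scale (base 10, scale 400); this is an
    illustrative assumption, not a detail stated in the article.
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Arena Scores quoted in the article: Grok-2 = 1293, Grok-2 mini = 1268.
grok2_vs_mini = expected_win_rate(1293, 1268)
print(f"{grok2_vs_mini:.3f}")  # 0.536
```

Under that assumption, the larger model would be preferred in roughly 54% of direct comparisons with Grok-2 mini, which is a modest edge rather than a decisive one.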

Accessibility and future developments: xAI is making these advanced AI models available to users through a subscription model while promising further improvements.

  • Both Grok-2 and Grok-2 mini are accessible through an $8 USD monthly subscription on the social network X.
  • xAI developer Igor Babuschkin has indicated that the main advantage of Grok-2 mini over the full Grok-2 model is its enhanced speed.
  • The company has pledged to continue improving the processing speed of Grok-2 mini, potentially making it an even more attractive option for users seeking high performance with lower computational requirements.

Implications for the AI landscape: The quick turnaround on Grok-2 and Grok-2 mini highlights the intense competition and fast pace of innovation in the AI field.

  • The success of these models demonstrates xAI’s ability to quickly iterate and improve its technology, potentially challenging more established players in the AI space.
  • The use of open-source tools like SGLang in developing proprietary models showcases the importance of collaborative efforts in advancing AI capabilities.
  • As xAI continues to refine its models, the AI community can expect further advancements in both speed and accuracy, potentially reshaping the competitive landscape of large language models.
Grok-2 gets a speed bump after developers rewrite code in three days
