Sakana’s new AI model framework could be key to unlocking multi-agent systems

Sakana AI has introduced CycleQD, a framework that uses evolutionary computing techniques to create specialized language models efficiently, positioned as a more sustainable alternative to training ever-larger models.

The innovation in brief: CycleQD employs evolutionary algorithms to combine skills from different language models without requiring expensive training processes.

  • The framework creates “swarms” of task-specific AI models that can specialize in different skills while using fewer computational resources
  • This approach marks a shift from the conventional method of training increasingly larger models to handle multiple tasks
  • The technique draws inspiration from quality diversity (QD), an evolutionary computing paradigm that evolves an initial population into a collection of solutions that are both diverse and high-performing
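
To make the quality diversity idea concrete, here is a minimal sketch of a QD loop in the style of MAP-Elites, using only the Python standard library. It is a toy and not Sakana AI's implementation: the `evaluate` function, the two-number "solutions", the bin count and the mutation noise are all assumptions chosen for illustration. In CycleQD the solutions would be language models and the scores would come from skill benchmarks.

```python
import random


def evaluate(solution):
    """Toy stand-in for benchmarking a model (assumption, not CycleQD's metric).

    Returns (quality, behavior): quality is higher when the solution is close
    to (0.5, 0.5); behavior is a descriptor in [0, 1] used to place the
    solution in the archive.
    """
    quality = -((solution[0] - 0.5) ** 2) - ((solution[1] - 0.5) ** 2)
    behavior = solution[0]
    return quality, behavior


def bin_index(behavior, n_bins):
    """Map a behavior value in [0, 1] to one of n_bins archive cells."""
    return min(int(behavior * n_bins), n_bins - 1)


def quality_diversity(n_bins=5, iterations=500, seed=0):
    """Keep the best solution found so far for each behavior cell."""
    rng = random.Random(seed)
    archive = {}  # cell index -> (quality, solution)

    def try_insert(solution):
        quality, behavior = evaluate(solution)
        cell = bin_index(behavior, n_bins)
        # A newcomer replaces the current elite only if it scores higher.
        if cell not in archive or quality > archive[cell][0]:
            archive[cell] = (quality, solution)

    # Seed the archive with a small random initial population.
    for _ in range(10):
        try_insert([rng.random(), rng.random()])

    # Repeatedly pick an elite, perturb it, and try to place the child.
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))
        child = [min(1.0, max(0.0, v + rng.gauss(0.0, 0.1))) for v in parent]
        try_insert(child)

    return archive


if __name__ == "__main__":
    for cell, (quality, solution) in sorted(quality_diversity().items()):
        print(cell, round(quality, 4), [round(v, 3) for v in solution])
```

The point of the archive is that a solution only competes with others that behave similarly, so the search ends with one strong candidate per behavior cell rather than a single overall winner.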

Technical implementation: CycleQD integrates evolutionary principles into the post-training pipeline of Large Language Models (LLMs) to develop new skill combinations.

  • The system uses “crossover” operations to merge characteristics from different parent models
  • “Mutation” operations, based on singular value decomposition (SVD), help explore new capabilities beyond the original models
  • Each skill is treated as a behavior characteristic, and successive generations of models are optimized to perform well on it
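
The crossover and mutation steps above can be sketched on a single weight matrix with NumPy. This is an illustrative approximation under stated assumptions, not Sakana AI's code: the function names, the mixing-weight distribution, the `strength` parameter and the choice to perturb singular values of the difference from a shared base model are all hypothetical choices made for the example.

```python
import numpy as np


def crossover(base, parent_a, parent_b, rng, mean=0.5, std=0.1):
    """Merge two parents by mixing their differences from a shared base.

    Assumption: the mixing weights are drawn from a normal distribution.
    """
    w_a, w_b = rng.normal(loc=mean, scale=std, size=2)
    return base + w_a * (parent_a - base) + w_b * (parent_b - base)


def svd_mutation(base, child, rng, strength=0.1):
    """Perturb the child's difference from the base along its SVD components.

    Each singular value is rescaled by a random factor near 1, so the
    mutation explores directions the merged weights already use instead of
    adding unstructured noise to every entry.
    """
    delta = child - base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    factors = 1.0 + strength * rng.standard_normal(s.shape)
    # (u * values) scales the columns of u, which equals u @ diag(values).
    return base + (u * (s * factors)) @ vt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.standard_normal((6, 4))                   # stand-in for base weights
    coder = base + 0.1 * rng.standard_normal((6, 4))     # hypothetical coding expert
    db_expert = base + 0.1 * rng.standard_normal((6, 4)) # hypothetical database expert

    child = crossover(base, coder, db_expert, rng)
    mutated = svd_mutation(base, child, rng)
    print("change from mutation:", float(np.linalg.norm(mutated - child)))
```

In a full pipeline each child would then be scored on every skill and placed in a quality diversity archive; that evaluation step is omitted here.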

Performance metrics: In tests on Llama 3-8B expert models, CycleQD outperformed traditional fine-tuning methods.

  • The framework successfully combined skills across coding, database operations, and operating system tasks
  • Models created through CycleQD showed better performance than those developed through conventional fine-tuning
  • The process proved more efficient and cost-effective than traditional training methods

Future applications: Sakana AI researchers envision broader implications for AI development and deployment.

  • CycleQD could enable continuous learning and adaptation in AI systems without the need for complete retraining
  • The technology shows promise for developing collaborative multi-agent systems
  • The framework could help create specialized AI agents that work together to solve complex problems

Critical considerations: While CycleQD shows promising results, several important questions remain about its scalability and real-world implementation.

  • The long-term effectiveness of evolutionary approaches compared to traditional training methods needs further study
  • The practical limitations of managing swarms of specialized models versus single large models require additional research
  • The trade-offs between model specialization and generalization capabilities warrant deeper investigation
