Nous Research trains AI model with global distributed computing

The development of distributed AI training methods marks a significant shift in how large language models can be created, potentially democratizing access to AI development beyond major tech companies and specialized data centers.

Key breakthrough: Nous Research is pre-training a 15-billion parameter large language model using machines distributed across the internet, departing from traditional centralized data center approaches.

  • The training process is being livestreamed on distro.nousresearch.com, showing real-time evaluation benchmarks and hardware locations across the U.S. and Europe
  • The project utilizes Nous DisTrO (Distributed Training Over-the-Internet), reducing inter-GPU communication bandwidth requirements by up to 10,000x
  • The system can operate on relatively modest internet connections of 100 Mbps download and 10 Mbps upload, as the rough arithmetic below illustrates
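
For a sense of scale, here is a back-of-the-envelope check using only the figures reported in this article (the 74.4 GB to 86.8 MB compression from the Llama 2 test described in the next section); these are illustrative numbers, not official DisTrO specifications. It shows why a 10 Mbps uplink is plausible: the compressed payload takes roughly a minute to upload.

```python
# Rough arithmetic from the figures quoted in this article
# (74.4 GB -> 86.8 MB per exchange in the Llama 2 test).
# Illustrative only, not official DisTrO throughput numbers.

full_gb = 74.4         # uncompressed data exchanged per round, in GB
compressed_mb = 86.8   # compressed exchange per round, in MB
uplink_mbps = 10       # the modest upload speed quoted above

ratio = (full_gb * 1000) / compressed_mb        # GB -> MB, then ratio
upload_s = (compressed_mb * 8) / uplink_mbps    # MB -> megabits, / Mbps

print(f"compression ratio in this test: ~{ratio:.0f}x")  # ~857x
print(f"upload time per exchange: ~{upload_s:.0f} s")    # ~69 s
```

At that rate, a single synchronization round fits comfortably within the cadence of a training step spread across the internet, which is the point of the bandwidth reduction.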

Technical innovation: Nous DisTrO’s efficiency gains represent a fundamental advancement in distributed AI training methods.

  • The technology compressed the data exchanged between GPUs from 74.4 gigabytes to just 86.8 megabytes in tests using the Llama 2 architecture
  • DisTrO builds upon Decoupled Momentum Optimization (DeMo), an open-source algorithm designed to maintain training performance while reducing inter-GPU communication; a simplified sketch follows this list
  • The pre-training process involves hardware contributions from partners including Oracle, Lambda Labs, Northern Data Group, Crusoe Cloud, and the Andromeda Cluster
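
To make the mechanism concrete, here is a heavily simplified sketch of a DeMo-style update step. It illustrates the general idea only: plain top-k magnitude selection stands in for the paper's DCT-based extraction of fast-moving momentum components, it assumes an already-initialized torch.distributed process group, and it exchanges a dense tensor where a real implementation would send only the selected indices and values.

```python
import torch
import torch.distributed as dist

def demo_style_step(param, grad, momentum, lr=3e-4, beta=0.9, k_frac=0.001):
    """Simplified DeMo-style update for one parameter tensor (illustrative).

    Deviations from the published DeMo algorithm: top-k magnitude
    selection replaces its DCT-based component extraction, and a dense
    all-reduce replaces a compressed index/value exchange.
    """
    # 1. Fold the fresh gradient into the worker-local momentum buffer.
    momentum.mul_(beta).add_(grad)

    # 2. Pick the k largest-magnitude momentum entries; only this small
    #    subset is shared, which is where the bandwidth saving comes from.
    flat = momentum.view(-1)
    k = max(1, int(k_frac * flat.numel()))
    _, idx = flat.abs().topk(k)
    shared = torch.zeros_like(flat)
    shared[idx] = flat[idx]

    # 3. "Decouple": zero out the transmitted components locally, so the
    #    slow-moving residual keeps accumulating on this worker.
    flat[idx] = 0.0

    # 4. Average the shared components across workers.
    dist.all_reduce(shared, op=dist.ReduceOp.SUM)
    shared.div_(dist.get_world_size())

    # 5. Apply the synchronized update to the parameters.
    param.add_(shared.view_as(param), alpha=-lr)
```

The decoupling in step 3 is the key design choice: components that change quickly are synchronized across workers, while the slowly varying remainder never needs to cross the network.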

Industry significance: This development could fundamentally alter the landscape of AI model development.

  • The technology enables training of frontier-class LLMs without requiring expensive supercomputer clusters or low-latency interconnects
  • Smaller institutions and independent researchers with consumer-grade internet access could potentially train large models
  • Notable AI researcher Diederik P. Kingma, co-inventor of the Adam optimizer, has joined as a collaborator on the project

Current status and implementation: The pre-training process has demonstrated promising initial results.

  • As of publication, the training run was over 75% complete with approximately 57 hours remaining
  • The project follows Nous Research’s earlier release of Hermes 3, a Meta Llama 3.1 variant
  • While currently using high-end Nvidia H100 GPUs, future applications could extend to less specialized hardware

Future implications: The democratization of AI training could reshape the power dynamics in artificial intelligence development.

  • The technology opens possibilities for decentralized federated learning and for training other kinds of AI models, including image generators
  • Questions remain about scalability to less specialized hardware and potential applications beyond language models
  • The success of this project could shift AI development away from corporate control toward a more distributed, collaborative ecosystem
