Arc Institute, Stanford launch largest ever publicly available biomolecular AI model

Genomic research has historically been limited by computing power and by the difficulty of processing long genetic sequences. Evo 2, a new AI foundation model, lets scientists analyze genetic code across diverse species and over much longer sequences.

The breakthrough: Arc Institute and Stanford University have released Evo 2, the largest publicly available AI model for genomic data, built using NVIDIA’s DGX Cloud platform on AWS.

  • The model was trained on nearly 9 trillion nucleotides, the fundamental building blocks of DNA and RNA
  • Evo 2 is accessible through NVIDIA’s BioNeMo platform and can be deployed as a NIM microservice
  • The model can process genetic sequences up to 1 million tokens in length, providing a comprehensive view of genomic data

Technical capabilities: Evo 2 represents a significant advancement in computational biology, offering powerful tools for genetic research and biomolecular applications.

  • Scientists can use the model to predict protein form and function based on genetic sequences
  • The system can identify novel molecules for healthcare and industrial applications
  • Researchers can evaluate how gene mutations affect biological function; in tests on the BRCA1 gene, which is associated with breast cancer, the model reached 90% accuracy
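
To make the mutation-evaluation idea concrete, here is a minimal sketch of one common way sequence models are used to score a variant: compare how likely the model finds the mutated sequence versus the reference. The article does not describe how Evo 2 does this internally, so everything below is an assumption for illustration. The tiny k-mer counting model stands in for a real genomic model, and the function names (`train_kmer_model`, `log_likelihood`, `variant_score`) are hypothetical, not part of Evo 2 or BioNeMo.

```python
import math
from collections import Counter


def train_kmer_model(sequences, k=3):
    """Count how often each base follows each k-length context.

    A toy stand-in for a genomic language model (assumption, not Evo 2).
    """
    counts = {}
    for seq in sequences:
        for i in range(k, len(seq)):
            context, next_base = seq[i - k:i], seq[i]
            counts.setdefault(context, Counter())[next_base] += 1
    return counts


def log_likelihood(model, seq, k=3, alpha=1.0):
    """Sum of log-probabilities of each base given the k bases before it.

    Uses add-alpha smoothing over the four bases so unseen events
    get a small nonzero probability.
    """
    total = 0.0
    for i in range(k, len(seq)):
        context_counts = model.get(seq[i - k:i], Counter())
        numerator = context_counts[seq[i]] + alpha
        denominator = sum(context_counts.values()) + 4 * alpha
        total += math.log(numerator / denominator)
    return total


def variant_score(model, reference, position, alt_base, k=3):
    """Log-likelihood of the variant minus that of the reference.

    A negative score means the model finds the mutated sequence less
    plausible than the reference.
    """
    variant = reference[:position] + alt_base + reference[position + 1:]
    return log_likelihood(model, variant, k) - log_likelihood(model, reference, k)


if __name__ == "__main__":
    # Hypothetical training data: a simple repeating pattern.
    model = train_kmer_model(["ACGT" * 8])
    reference = "ACGTACGTACGT"
    # Substitute the base at index 5 (a C) with T and score the change.
    print(variant_score(model, reference, 5, "T"))
```

In this sketch a substitution that breaks the pattern seen in training receives a negative score, while substituting a base with itself scores exactly zero. A real model would replace the k-mer counts with learned probabilities over far longer contexts.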

Infrastructure and development: The project leveraged substantial computing resources and institutional support to achieve its goals.

  • The model was trained using 2,000 NVIDIA H100 GPUs via NVIDIA DGX Cloud on AWS
  • Arc Institute, established in 2021 with $650 million in funding, provided the research environment
  • Arc Institute works in partnership with Stanford University, UC Berkeley, and UC San Francisco

Practical applications: Evo 2’s capabilities extend across multiple scientific domains with potential real-world impact.

  • Healthcare researchers can use the model to understand disease-related gene variants and design targeted treatments
  • Agricultural scientists can develop more resilient and nutrient-dense crops
  • Environmental applications include the design of biofuels and proteins that can break down pollutants like oil and plastic

Looking beyond the horizon: While Evo 2’s immediate applications are promising, its full potential remains to be discovered as researchers begin exploring its capabilities in various fields.

  • The model’s ability to process longer sequences could reveal previously unknown connections in genetic code
  • Its broad training across multiple species enables cross-domain insights
  • The open availability of the model could accelerate scientific discoveries across multiple disciplines

The true significance of Evo 2 may lie not just in its current capabilities but in how it opens access to advanced genomic research tools, which could accelerate scientific discovery in ways that are difficult to predict.

Massive Foundation Model for Biomolecular Sciences Now Available via NVIDIA BioNeMo
