A Startup Spun out of Meta Has a Massive AI Model that Speaks the Language of Proteins

A startup spun out of Meta has unveiled a massive AI model that speaks the language of proteins, creating new fluorescent molecules in an impressive proof-of-principle demonstration.

EvolutionaryScale debuts protein language model ESM3: EvolutionaryScale, launched by former Meta scientists, announced its new protein language model ESM3 this month alongside $142 million in new funding to apply the model to drug development, sustainability, and other areas:

  • ESM3 was trained on over 2.7 billion protein sequences and structures, as well as data on protein functions, allowing it to design proteins to user specifications.
  • The model is seen as a frontier model in the rapidly growing field of applying advanced machine learning to biological data.
  • A computational biologist at the University of Wisconsin-Madison says ESM3 will be one of the key AI models in biology that everyone pays attention to.

Designing new fluorescent proteins from scratch: To showcase ESM3’s capabilities, the EvolutionaryScale team tasked it with redesigning green fluorescent protein (GFP), a biotechnology workhorse used to label and visualize proteins:

  • Starting with key amino acids found in GFP’s core, ESM3 generated 88 candidate designs, one of which glowed weakly.
  • Using that molecule as a starting point, further iterations by ESM3 produced several proteins that fluoresced as brightly as natural GFPs.
  • The top ESM3-designed protein, esmGFP, has less than 60% sequence similarity to known fluorescent proteins, equivalent to over 500 million years of natural evolution according to the researchers.
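For readers unfamiliar with how a figure like "less than 60% sequence similarity" is computed, here is a minimal sketch of percent identity between two already-aligned protein sequences. The function name, the toy sequences, and the choice to skip gap columns are assumptions made for illustration; the researchers' actual alignment method and identity definition are not described here, and other conventions (for example, dividing by the full alignment length) give different numbers.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared columns in which both residues are identical.

    Assumes the two sequences are already aligned (equal length, with '-'
    marking gaps). Columns containing a gap in either sequence are skipped,
    which is one common convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap column: not counted under this convention
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy, made-up fragments (not real GFP or esmGFP sequences).
toy_a = "MSKGEELFTG"
toy_b = "MSKGAELF-G"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

In the toy pair, nine columns are compared (one is a gap) and eight match, so the result is a little under 89%. A value below 60% for a full-length protein means that more than four in ten compared positions differ from the closest known relative.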

Frontier model raises some concerns: As one of the largest biological models to date, ESM3’s development had to be disclosed to the US government due to the computing power required during training:

  • EvolutionaryScale says it has been in contact with the White House Office of Science and Technology Policy (OSTP) about ESM3, as required under a 2023 AI executive order.
  • The full model is not public, and certain potentially concerning sequences were excluded from the open-source version’s training data.
  • Academics are excited to experiment with ESM3 but note that its full version would be prohibitively expensive to replicate independently.

Broader implications: While the fluorescent protein demonstration is impressive, EvolutionaryScale has much bigger ambitions for applying ESM3 to challenges like sustainability and drug development:

  • The company envisions using the model to design enzymes for breaking down plastics and to develop new therapeutic antibodies and proteins.
  • Other researchers are eager to see how ESM3 performs on tasks like designing gene-editing proteins compared to other models.
  • ESM3 is the latest major advance in applying large language models to understanding and designing proteins.

However, comparing ESM3's protein designs to hundreds of millions of years of natural evolution may be hype that overstates the model's current capabilities. More rigorous testing across diverse applications will be needed to establish ESM3 as a transformative tool for synthetic biology and other fields.

Ex-Meta scientists debut gigantic AI protein design model
