In 2018, during her chemistry Nobel Prize lecture, Frances Arnold noted that scientists had arrived at a point where they could read, write, and edit any sequence of DNA. But composing whole genes or even whole genomes from scratch — that was something only evolution could do.

A few years later, not long after helping to launch the Arc Institute, a nonprofit research center in the Bay Area, molecular engineer Patrick Hsu wondered if it was possible to imitate the forces of evolution that Arnold had been referring to. DNA is a language, after all, and with all the advances in generative AI — chatbots that could hold eerily lifelike conversations if trained on enough text — maybe recreating all the cellular complexity contained in a genome wasn’t that far behind.

Working with Brian Hie, a computational biologist at Stanford University and a fellow Arc Institute member, Hsu, who is also an assistant professor at the University of California, Berkeley, began assembling a team of scientists to train an AI model on vast troves of biological data — 300 billion DNA letters, including long sequences from 80,000 genomes of bacteria and archaea.
