
AI-designed phages

This isn't just another incremental step in artificial intelligence; it is the moment generative models moved from writing code to writing life. Niko McCarty reports on a breakthrough where AI didn't just simulate a genome but successfully designed one that boots up, infects bacteria, and outcompetes its natural ancestors. For busy leaders tracking the frontier of biotechnology, this shifts the conversation from theoretical possibility to operational reality.

The Leap from Simulation to Reality

McCarty opens by contrasting the recent hype around AI-designed proteins with the much harder challenge of designing entire genomes. While proteins are "self-contained entities," he notes, "even the simplest genomes are composed of multiple genes and regulatory elements that must collaborate to build a functioning, living organism." The stakes here are high because a single mutation can render a designed organism defunct. Yet, the Arc Institute and Stanford researchers have crossed this chasm. They used fine-tuned versions of the Evo 1 and Evo 2 models to create 16 bacteriophages—viruses that infect bacteria—based on the well-studied ΦX174.

The results were not merely functional; some designs matched or beat the natural virus. McCarty writes, "Some of these AI-generated phages work just as well or better at infecting E. coli cells compared to wild ΦX174." This is the critical evidence: the AI did not just mimic nature; in some cases it improved on it. The authors describe this as "a blueprint for the design of diverse synthetic bacteriophages," laying a "foundation for the generative design of useful living systems at the genome scale." This framing is powerful because it suggests we are moving past the era of tweaking existing biological parts toward the era of inventing entirely new ones.

Designing the Impossible

The choice of the ΦX174 virus as a testbed was strategic. It is a "proving ground" with a tiny, overlapping genome that has been sequenced and synthesized before, making it the perfect control for a radical experiment. McCarty explains that while previous synthetic biology efforts involved "decompressing" and extending natural genomes, this new work allowed researchers to go "off script." The AI generated sequences so distinct from any known bacteriophage that, in evolutionary terms, they would be classified as new species.

The methodology reveals the AI's unique strength. The models were prompted with a "fixed" portion of the genome and asked to "fill in" the rest. Initially, the base models struggled, with only about a third of their outputs resembling viable viral DNA. However, after fine-tuning on a dataset of nearly 15,000 related genomes and applying computational filters, the success rate climbed. Of 302 candidates built in the lab, 16 "booted up" to form living, infectious phages. One variant, Evo-Φ36, swapped a critical gene with one from a distant virus—a move that should have crippled the organism. Instead, McCarty notes, "the AI model rewired the rest of the Evo-Φ36 genome to make this swap work."
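To put the funnel described above in numbers, here is a minimal Python sketch. The counts (302 candidates, 16 viable phages) come from the text; the filter function and its length and GC-content thresholds are hypothetical stand-ins for the "computational filters" the article mentions, not the criteria the researchers actually used.

```python
# Figures quoted in the text above.
CANDIDATES_BUILT = 302
VIABLE_PHAGES = 16

# Share of lab-built candidates that produced infectious phages.
hit_rate = VIABLE_PHAGES / CANDIDATES_BUILT


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def passes_filters(seq: str, min_len: int = 4000, max_len: int = 6000,
                   gc_min: float = 0.35, gc_max: float = 0.55) -> bool:
    """Hypothetical screen: keep sequences of plausible length and GC content.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    return min_len <= len(seq) <= max_len and gc_min <= gc_fraction(seq) <= gc_max


print(f"Lab hit rate: {hit_rate:.1%}")   # 16 of 302 candidates
print(passes_filters("ACGT" * 1250))     # 5,000 bases at 50% GC
print(passes_filters("ACGT"))            # far too short to be a genome
```

Roughly one in nineteen synthesized designs booted up, which is why cheap computational screening before synthesis matters: every candidate that fails in the lab still has to be paid for.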

This highlights a fundamental shift in design logic. Human engineers often struggle with context-dependent compatibility, but the model, fine-tuned on nearly 15,000 related genomes, produced a genome in which the swap worked. As McCarty puts it, "This kind of context-dependent compatibility is hard for human designers to anticipate, but it emerges naturally from AI models that can integrate patterns from thousands of related genomes." The evidence suggests that for complex, non-modular systems like living cells, rational human design is hitting a wall, while data-driven AI is finding paths we cannot see.

Critics might note that focusing on a virus that infects non-pathogenic bacteria minimizes the immediate risks. However, the speed at which the AI generated viable, novel species suggests that the barrier to creating more dangerous pathogens is lower than previously assumed, even if the current application is benign.

The Biosecurity Paradox and Practical Limits

The article does not shy away from the dual-use dilemma. McCarty points out that while the Evo 2 model excludes human viruses from its training data, the code and parameters are fully open. He warns that "a sufficiently motivated person could, in principle, sculpt these models to design human viruses." The HIV genome is only about 10,000 bases and a coronavirus genome about 30,000, sizes that are becoming tractable for these models. Yet McCarty offers a sobering counterbalance: the real barriers are "data and atoms." Synthesizing a bacterial genome costs millions of dollars and takes years, a hurdle that currently keeps the technology in the realm of elite research labs rather than rogue actors.

Furthermore, the practical utility of whole-genome design is still being debated. McCarty asks whether "wholesale, bottom-up design of an entire phage genome may be more of a technical milestone than a practical one." In many therapeutic applications, tweaking a single gene or regulatory region is more efficient than redesigning the entire organism. The current success is a proof of concept, but the next challenge is to build a "training feedback loop" that can coax these models to create phages with specific, pre-specified behaviors rather than just viable ones.

"Cells are not modular gadgets, but instead have layer upon layer of feedback loops and emergent properties."

Bottom Line

McCarty's coverage effectively demonstrates that AI has graduated from designing parts to designing systems, a shift that fundamentally alters the trajectory of synthetic biology. The strongest part of the argument is the empirical proof that AI can navigate the complex, overlapping constraints of a living genome better than human rational design. The biggest vulnerability remains the open-source nature of the models, which lowers the barrier to entry for bad actors, even if the physical cost of synthesis remains high. The next critical watchpoint is whether these models can be directed to create novel functions, not just novel life forms.

Sources

AI-designed phages

by Niko McCarty

A few months ago, Arc Institute released a new language model, called Evo 2, that can design entire genomes. In that original paper, though, the model’s designs — for a yeast chromosome and some small bacterial genomes — were entirely confined to a computer. The AI-generated genomes were not assembled or tested in the laboratory.

Although AI models are exceptionally good at designing proteins (including, recently, highly dynamic enzymes), there was little evidence that AI models could design viable genomes. Proteins are self-contained entities, made from a single strand of amino acids. But even the simplest genomes are composed of multiple genes and regulatory elements that must collaborate to build a functioning, living organism. A single mutation in a genome is often enough to render it entirely defunct.

But today, Arc Institute and Stanford University researchers have validated their designs in the real world, reporting the first viable genomes created using generative AI. They used fine-tuned versions of both Evo 1 and Evo 2 to create 16 bacteriophages modeled on ΦX174, a virus that infects E. coli bacteria.1 Some of these AI-generated phages work just as well or better at infecting E. coli cells compared to wild ΦX174. All of the fine-tuned models used in this work are also freely available on HuggingFace. The paper offers “a blueprint for the design of diverse synthetic bacteriophages,” the authors write, “and, more broadly, lays a foundation for the generative design of useful living systems at the genome scale.”

Choosing the Phage.

Of the 13,000 known bacteriophage types, ΦX174 is the most widely studied. First discovered in the Paris sewers in 1935, its genome includes only 5,000 bases of single-stranded DNA, with eleven genes and at least seven regulatory elements, or short stretches of DNA that regulate which genes switch on at which times. So many genes fit in such a small sequence because they physically overlap one another, with some genes tucked in the middle of other genes.
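The point about overlapping genes can be made concrete with a small Python sketch. The gene names and coordinates below are hypothetical, chosen only to illustrate the idea; they are not real ΦX174 annotations.

```python
from itertools import combinations

# Hypothetical half-open (start, end) coordinates, NOT real ΦX174 annotations.
genes = {
    "geneA": (0, 1500),
    "geneB": (1000, 1400),  # tucked entirely inside geneA
    "geneC": (1450, 1700),  # overlaps the tail of geneA
    "geneD": (1800, 2300),  # stands alone
}


def overlap(a, b):
    """Number of bases shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


# Every pair of genes that shares at least one base, with the shared length.
overlapping_pairs = {
    (g1, g2): overlap(genes[g1], genes[g2])
    for g1, g2 in combinations(genes, 2)
    if overlap(genes[g1], genes[g2]) > 0
}

# Total gene length versus the DNA actually occupied once overlaps are merged.
summed_lengths = sum(end - start for start, end in genes.values())
covered_bases = len({pos for start, end in genes.values() for pos in range(start, end)})

print(overlapping_pairs)
print(summed_lengths, covered_bases)
```

In this toy layout the genes add up to more bases than the DNA they occupy, which is the compression trick the paragraph describes. It also hints at why design is hard: a change to a base in a shared region alters two genes at once.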

ΦX174 is often used as a model organism in molecular biology because it is easy to work with. It infects a nonpathogenic strain of E. coli, which itself divides quickly and can be readily grown in the laboratory using nutrient-laden broth and a warm incubator. These phages are also structurally simple, even by bacteriophage standards — they are made from little more than a capsid, packed with the small genome, and some proteins.

It ...