BioByte 143: Predicting structure of large protein complexes, restoring naive t-cell production,…

This week's scientific roundup goes beyond incremental gains: it points to a shift in how we model the machinery of life and aging. Varun Agarwal presents a narrative in which computational speed and biological reprogramming converge on problems previously considered computationally intractable or biologically irreversible. The most striking claim is not just that protein shapes can be predicted faster, but that large symmetric assemblies such as viral capsids can now be modeled with near-experimental accuracy in tens of seconds.

The Geometry of Speed

Agarwal opens with a breakthrough in structural biology that challenges the dominance of current AI models. While AlphaFold revolutionized the field, Agarwal notes its critical bottleneck: "the transformers underpinning AlphaFold2/3 require memory that quadratically scales with sequence length (O(L²)), meaning that large proteins and protein complexes do not fit within a single AlphaFold prediction window." This limitation has left a vast class of symmetric protein complexes, essential for viral defense and cellular communication, largely unmodelable at scale.
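To make the quadratic scaling concrete, here is a minimal sketch that estimates the size of a single L x L pairwise tensor. The function name, the 128-channel width, and the 4-byte float size are illustrative assumptions, not figures from the roundup, and a real model holds many more tensors than this one.

```python
def pair_memory_gib(seq_len: int, channels: int = 128, bytes_per_value: int = 4) -> float:
    """Size in GiB of one seq_len x seq_len x channels tensor.

    channels and bytes_per_value are assumed values for illustration only.
    """
    return seq_len * seq_len * channels * bytes_per_value / 2**30


# Hypothetical sequence lengths, chosen only to show the trend.
for length in (500, 2_000, 8_000):
    print(f"L={length:>5}: {pair_memory_gib(length):7.2f} GiB")
```

Whatever constants are chosen, the ratio is what matters: quadrupling the sequence length multiplies this tensor's footprint by 16.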

Enter EndoFold's Cosmohedra. Agarwal explains that the team leveraged a fundamental biological shortcut: symmetry. "Many large complexes are just symmetric assemblies of smaller monomeric proteins!" he writes. By focusing on the geometry of these assemblies rather than brute-forcing the entire sequence, they achieved a runtime dependence of "O(n log n), which is a huge upgrade over O(n²) for large proteins." The result is staggering: a model that is "10³-10⁵x faster - predicting large complexes in tens of seconds."
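The symmetry shortcut can be illustrated with a toy example. The sketch below is not Cosmohedra's method: it swaps in simple cyclic symmetry (equally spaced rotations about one axis) in place of the icosahedral symmetry used for capsids, and the coordinates and function names are hypothetical. It only shows the core idea that one monomer structure plus a set of symmetry operations yields the whole assembly.

```python
import math

Point = tuple[float, float, float]


def rotate_z(point: Point, angle: float) -> Point:
    """Rotate a 3D point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def assemble_cyclic(monomer: list[Point], n_copies: int) -> list[list[Point]]:
    """Place n_copies of the monomer at equal angular steps around the z axis."""
    step = 2.0 * math.pi / n_copies
    return [[rotate_z(p, k * step) for p in monomer] for k in range(n_copies)]


# Hypothetical three-atom "monomer"; a real input would be a predicted structure.
monomer = [(10.0, 0.0, 0.0), (11.0, 1.0, 0.5), (12.0, 0.0, 1.0)]
ring = assemble_cyclic(monomer, n_copies=6)
print(len(ring), "copies of", len(ring[0]), "atoms each")
```

In this toy version the monomer is predicted once and the assembly step is just geometry, so its cost grows linearly with the number of copies. A realistic pipeline would also have to find the correct placement of the monomer relative to the symmetry axes, which this sketch takes as given.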

This isn't just a speed run; it's a paradigm shift in accuracy. Agarwal points out that the predicted structures are so precise that "it's actually difficult to tell whether the Cosmohedra structures or the original cryoEM structures are more accurate to the true underlying protein structure!" This precision opens doors to drug discovery that were previously locked. For instance, the team is already "testing this paradigm in a campaign to inhibit Bufavirus 1 capsid formation," using a predicted binder designed against a single monomer.

By turning high-quality monomer predictions into near-experimental complex structures at scale, Cosmohedra effectively acts as a synthetic data generator for large protein assemblies.

Critics might argue that relying on monomer accuracy creates a fragile foundation; as Agarwal admits, if a monomer is misfolded, "no amount of downstream assembly can fully recover the true complex." Yet, the speed of iteration allows for a level of testing that was impossible before, turning a limitation into a workflow advantage.

Rewiring the Aging Immune System

The second major thread Agarwal explores is the potential to reverse the biological clock of the immune system. The focus here is the thymus, the organ responsible for T cell production, which famously shrinks with age—a process known as involution. "As humans and most other mammals age, the thymus shrinks," Agarwal writes, leading to a "decreased naive T cell output, less TCR repertoire diversity, and generally weakened primary responses."

Traditional attempts to fix this have been clumsy, often relying on hormones or small molecules with "limited effect size, toxic side effects, and generally suboptimal clinical feasibility." Agarwal highlights a new approach from Friedrich et al. that bypasses the damaged organ entirely. Instead of trying to repair the thymus, the researchers chose to "[reconstitute] the identified signalling pathways ectopically in the liver."

The mechanism is elegant. The team identified three key signaling pathways—Notch (DLL1), FLT3 ligand (FLT3L), and Interleukin-7 (IL-7)—that decline with age. They packaged mRNA for these proteins into lipid nanoparticles (LNPs) and delivered them to the liver. The liver, unlike the thymus, retains its protein-synthesis capabilities throughout life. The results were profound: the therapy "reverses the characteristic 'memory bias' of aging by restoring naive T cell counts and de novo thymopoiesis."

The functional results in aged mice are striking. In vaccination experiments, the treatment "doubled the frequency of antigen-specific CD8+ T cells," effectively making the immune response of aged mice indistinguishable from that of young adults. Even more compelling is the cancer data: when combined with checkpoint blockade, the therapy drove "complete tumor rejection in 40% of cases" in aggressive melanoma models.

The liver-based mRNA delivery ensured that these trophic signals were produced transiently, avoiding the systemic toxicity and inflammation often associated with recombinant protein therapies.

One might wonder if this approach is too specific to mice to translate to humans, or if the transient nature of the treatment requires repeated dosing that could trigger immune reactions. However, the use of LNPs, a delivery system now familiar from the pandemic era, suggests a clear path to clinical translation.

Decoding Neural Circuits at Scale

The final piece of Agarwal's puzzle moves from the molecular to the macroscopic: the brain. For years, neuroscience has struggled to bridge the gap between single-cell activity and whole-brain dynamics. Agarwal notes that while all-optical neuroscience has matured, "key questions about how distributed neural ensembles shape downstream dynamics requires simultaneous, cellular-resolution control and dense population readout across millimeter-scale tissue."

The solution presented by Drinnenberg et al. is a hybrid experimental design that allows for the stimulation of "~1,000 targeted neurons while reading out activity from ~10,000 neurons." This leap in scale revealed a hidden layer of brain organization. The researchers discovered a specific subpopulation of cells, dubbed "GER (general ensemble response) cells," which act as a broad feedback mechanism.

These cells, predominantly somatostatin-expressing interneurons, "integrate over long spatial ranges" and normalize distributed excitatory activity. "Their recruitment is inconsistent with naive spatial random-connectivity models," Agarwal writes, suggesting a highly organized, previously invisible network that monitors and balances brain activity. This discovery was only possible because the new tools allowed for "systematic, causal perturbation studies that can reveal circuit motifs only visible at scale."

This leap turns all-optical experiments from constrained, small-N manipulations into systematic, causal perturbation studies that can reveal circuit motifs only visible at scale.

A counterpoint to this excitement is the complexity of interpreting such massive datasets. With activity read out from roughly 10,000 neurons at once, distinguishing signal from noise becomes a monumental computational challenge. Yet the identification of a cell type defined by its functional role rather than just its molecular markers suggests a new era of functional neuroanatomy.

Bottom Line

Varun Agarwal's curation highlights a rare moment where computational efficiency and biological insight are unlocking doors that have been shut for decades. The strongest argument here is that the bottleneck in science is no longer just data collection, but the ability to synthesize that data into predictive models that work at scale. The biggest vulnerability remains the dependency on the accuracy of the underlying components—whether a misfolded protein monomer or a flawed neural model—but the speed of iteration offered by these new tools makes correcting those errors faster than ever before. The reader should watch how quickly these theoretical models move from the lab bench to clinical trials, particularly in the realm of immune rejuvenation.

by Varun Agarwal

Welcome to Decoding Bio’s BioByte: each week our writing collective highlights notable news, from the latest scientific papers to the latest funding rounds, and everything in between. All in one place.

A Note To Our Readers: This is our last post for 2025 as we will be taking a short break from posting for the holidays. We’ll be back to our regular posting the first full week of January. Thank you from all of us for your support this year and stay tuned for more coming soon, Decoders! Happy holidays!

What we read.

Papers.

Scalable prediction of symmetric protein complex structures [Yu et al., bioRxiv, November 2025]

Why it matters: Researchers from the startup EndoFold built Cosmohedra, a physics-based model that assembles predicted structures of large, symmetric proteins in a fraction of the time previously required. This model unlocks biological inquiry and drug discovery for this class of previously expensive-to-model proteins, including building new drugs that disrupt viral complexes.

AlphaFold has been a breakthrough in protein structure prediction and downstream drug discovery, but it has its constraints. Chief among these is the large memory requirement: the transformers underpinning AlphaFold2/3 require memory that quadratically scales with sequence length (O(L²)), meaning that large proteins and protein complexes do not fit within a single AlphaFold prediction window without significant loss of accuracy or feasibility.

To tackle the problem of structure prediction for large proteins and complexes, the team at EndoFold takes advantage of a convenient pattern in biology: protein symmetry. Many large complexes are just symmetric assemblies of smaller monomeric proteins! Take, for example, the Drosophila dArc2 capsid, a retrovirus-like capsid protein co-opted for neuronal cell-to-cell mRNA-based communication. The whole capsid is an assembly of 240 identical Arc2 proteins, each 193 residues long. (Sidenote: Why does nature do this? One explanation is that it is genomically cheaper to encode a single monomer of 579 nucleotides, rather than explicitly encoding a full complex totaling 138,960 nucleotides.)
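The sidenote's arithmetic can be checked in a few lines. This is a minimal sketch that assumes three nucleotides per residue and ignores stop codons and any non-coding sequence; the constant names are illustrative.

```python
RESIDUES_PER_MONOMER = 193  # from the text
COPIES_IN_CAPSID = 240      # from the text
NUCLEOTIDES_PER_CODON = 3   # assumption: coding sequence only, stop codon ignored

monomer_nt = RESIDUES_PER_MONOMER * NUCLEOTIDES_PER_CODON
full_complex_nt = monomer_nt * COPIES_IN_CAPSID

print(monomer_nt)       # 579
print(full_complex_nt)  # 138960
```

Encoding the monomer once and reusing it 240 times therefore costs 240 times less coding sequence than spelling out every copy.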

While AlphaFold can’t model the full complex, it generally can model the monomers of these large, symmetric proteins well! EndoFold builds on this foundation by constructing a physics-based assembly model, Cosmohedra, that uses true and predicted monomer structures and assembles them into symmetric complexes based on their symmetry class. To build the structure of the dArc2 capsid, for example, they assemble the predicted monomer of Arc2 with icosahedral symmetry. To do this, they built their ...
