
BioByte 146: How Pleiades predicts Alzheimer's, new EDEN models enable drug design via evolutionary…

This week's scientific landscape is defined by a singular, unsettling realization: the most powerful tools for understanding and treating neurodegeneration are not new drugs, but the ability to read the hidden language of our own biology. Gia-Bao Dam's coverage cuts through the noise of standard clinical trials to highlight a paradigm shift where artificial intelligence is no longer just analyzing data, but actively generating testable biological hypotheses from the chaos of genetic sequences.

The Black Box Cracked Open

The most striking development discussed is the application of mechanistic interpretability to Alzheimer's prediction. Gia-Bao Dam writes, "Rather than relying on post-hoc feature attribution on raw inputs, they decomposed the internal embedding space using linear probes and sparse autoencoders to recover what qualities the embeddings contained." This is a crucial distinction. For years, foundation models have been treated as opaque oracles; we fed them data, they gave us answers, and we had to trust the math. Dam highlights how researchers at Goodfire AI reversed this, peering inside the "black box" of the Pleiades model to find that the strongest predictor of Alzheimer's was surprisingly simple: the length of cell-free DNA (cfDNA) fragments in the blood.


The implications here are profound. As Dam notes, "With a simple classifier that takes in fragment length features alone, they achieved an AUROC of approximately 0.78 on an independent cohort." This suggests that the large foundation model was, to a considerable degree, latching onto a simple biological signal that had been overlooked. It bridges the gap between high-performance prediction and mechanistic understanding. Critics might argue that fragment length is a non-specific marker of cell death and could be confounded by other inflammatory conditions. Combining it with methylation- and cell-type-related signals pushed performance to around 0.84 AUROC, which supports the approach but does not by itself rule out that confounding. The real story isn't just that the model works; it's that the interpretability work showed what the model relies on, turning a statistical correlation into a biological hypothesis.

This work demonstrates how mechanistic interpretability can turn opaque foundation-model representations into concrete, testable biological hypotheses, helping bridge the gap between high-performing black-box predictors and mechanistic understanding of the disease signals they exploit.

Turning Cancer into a Cure?

Perhaps the most counterintuitive finding covered is the link between peripheral cancer and the clearance of Alzheimer's plaques. Gia-Bao Dam explains that while the inverse correlation between cancer and Alzheimer's has long been noted, "this work provides a concrete mechanistic explanation linking peripheral cancer to active amyloid clearance in the brain." The mechanism is startling: tumors secrete a protein called cystatin-C, which crosses the blood-brain barrier and activates the brain's immune cells, specifically microglia, to eat the amyloid plaques.

Dam points out that "exogenous administration of recombinant Cyst-C is sufficient to drive plaque reduction even in aged mice with established pathology." This reframes the therapeutic landscape. Instead of trying to stop the production of amyloid, a strategy that has yielded diminishing returns, the focus shifts to enhancing the brain's innate ability to clear it. The dependency on the TREM2 receptor is critical here. As Dam writes, "plaque clearance is lost in TREM2 knockout mice, in mice carrying the AD-associated TREM2 R47H mutation." The R47H mutation has long been known to increase Alzheimer's risk, and this study offers a functional explanation: a failure to activate the cleanup crew. The potential for a drug that mimics this tumor-secreted protein is considerable, though the findings so far come from mice, and the oncological side of the story remains a complex hurdle to navigate.

Designing Life from Scratch

The final pillar of Dam's analysis concerns the EDEN family of models, which aims to solve the "profound disparity between information content within biological systems and the processing bandwidth of…engineering tools." By training on a dataset of over one million species and nearly 10 billion novel genes, these models are moving beyond prediction into generation. Dam highlights the success in designing antimicrobial peptides, noting that "97% of tested AMPs showed some activity, with the top candidates having low micromolar potency against multidrug-resistant pathogens."

This is not just about finding new drugs; it is about designing entirely new biological ecosystems. Dam describes how the team generated a "fully synthetic microbiome" with "99% of the identified taxonomic units being consistent with known other species found in the human gut." This capability mirrors the statistical design approaches seen in synthetic ecology, where researchers like Oliveira et al. use high-throughput screening to distill complex community interactions into generative rules. The ability to design a synthetic microbiome that suppresses pathogens like Klebsiella pneumoniae without relying on detailed mechanistic priors suggests a future where we can engineer our internal environments with the same precision we currently apply to software. However, the leap from a petri dish to a human gut is vast, and the risk of unintended ecological consequences in a living host remains a significant counterpoint to this optimism.

The inability of current models to reliably execute complex tasks necessary for multi-modality therapeutic design may arise from a fundamental deficit in the scale and diversity of the available training data.

Bottom Line

The strongest argument in this coverage is the shift from passive observation to active, interpretable design; we are no longer just watching biology, we are reverse-engineering its logic to solve problems that have stalled for decades. The biggest vulnerability remains the translational gap between these elegant mouse models and the messy reality of human physiology, where immune systems and microbiomes are far less cooperative than in a controlled lab. Readers should watch for the first human trials of TREM2-activating therapies, as the success of the cystatin-C mechanism will likely determine the next decade of Alzheimer's research funding.

Sources

BioByte 146: How Pleiades predicts Alzheimer's, new EDEN models enable drug design via evolutionary…

by Gia-Bao Dam

Welcome to Decoding Bio’s BioByte: each week our writing collective highlights notable news, from the latest scientific papers to the latest funding rounds and everything in between. All in one place.

What we read.

Blogs.

Using Interpretability to Identify a Novel Class of Alzheimer’s Biomarkers [Wang et al., Goodfire, January 2026]

Why it matters: Researchers from Goodfire AI utilize mechanistic interpretability methods to uncover the features that Pleiades, an epigenetics foundation model, uses to successfully predict Alzheimer’s Disease (AD) from cell-free DNA (cfDNA) samples.

Pleiades is an epigenomic foundation model developed by Prima Mente that captures methylated and unmethylated human DNA from full genomes and cfDNA. It uses a hierarchical set-attention architecture that lets it extend attention across a large swath of DNA, such as pools of cfDNA. In the original preprint, one application of Pleiades was predicting whether a patient had AD from their cfDNA sequences, a task on which it reached an AUROC of 0.82.

The team at Goodfire applied mechanistic interpretability techniques to the frozen, cfDNA-fine-tuned Pleiades model, probing its final sample-level representations to identify what features enabled it to effectively predict AD status. Rather than relying on post-hoc feature attribution on raw inputs, they decomposed the internal embedding space using linear probes and sparse autoencoders to recover what qualities the embeddings contained.
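To make the idea of a linear probe concrete, here is a minimal sketch in Python. It is not Goodfire's code and does not use Pleiades: the embeddings are synthetic, and the function names, the 8-dimensional embedding size, and the choice of dimension 3 as the "signal" direction are all assumptions made for illustration. The sparse-autoencoder half of the analysis is not shown.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_embeddings(n=200, dim=8, signal_dim=3, seed=0):
    # Hypothetical stand-in for frozen sample-level embeddings: Gaussian noise
    # with one direction (signal_dim) shifted for positive samples.
    rng = random.Random(seed)
    embeddings, labels = [], []
    for i in range(n):
        label = i % 2
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[signal_dim] += 3.0 * label
        embeddings.append(vec)
        labels.append(label)
    return embeddings, labels


def train_linear_probe(embeddings, labels, lr=0.5, epochs=300):
    # Logistic regression fit by full-batch gradient descent. The embedding
    # model is never updated; only the probe's weights and bias are learned.
    n, dim = len(embeddings), len(embeddings[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, embeddings, labels):
    # Fraction of samples whose label the probe recovers from the embedding.
    hits = 0
    for x, y in zip(embeddings, labels):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        hits += int(pred == y)
    return hits / len(labels)


if __name__ == "__main__":
    X, y = make_synthetic_embeddings()
    w, b = train_linear_probe(X, y)
    print("probe accuracy:", probe_accuracy(w, b, X, y))
    print("largest-weight dimension:", max(range(len(w)), key=lambda j: abs(w[j])))
```

If a probe this simple can read a property off the embeddings, that property is linearly encoded in them, and the probe's weights point to where. In this toy setup the largest weight lands on the dimension that was deliberately shifted.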

Surprisingly, the strongest features utilized by the model were all correlated with the length of the input cfDNA sequences. With a simple classifier that takes in fragment length features alone, they achieved an AUROC of approximately 0.78 on an independent cohort. By combining fragment length with other biologically meaningful signals identified through their probing, such as methylation- and cell-type–related features, they achieved performance around 0.84 AUROC, comparable to the full Pleiades cfDNA-only performance of ~0.82–0.83. More broadly, this work demonstrates how mechanistic interpretability can turn opaque foundation-model representations into concrete, testable biological hypotheses, helping bridge the gap between high-performing black-box predictors and mechanistic understanding of the disease signals they exploit.
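For readers less familiar with the metric, the sketch below shows how a single fragment-length feature can be scored with AUROC. Everything in it is an assumption for illustration: the data are simulated, the 150 bp cutoff, the 5 bp shift between groups, and even the direction of that shift are invented, and the helper names are hypothetical. It does not reproduce the classifier or the numbers reported above.

```python
import random


def auroc(scores, labels):
    # Probability that a randomly chosen positive sample scores higher than a
    # randomly chosen negative one; ties count as half a win.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def simulate_sample(rng, is_case, n_fragments=200):
    # Entirely synthetic fragment lengths in base pairs. The 5 bp shift for
    # cases, and its direction, are invented for illustration only.
    centre = rng.gauss(162.0 if is_case else 167.0, 5.0)
    return [rng.gauss(centre, 20.0) for _ in range(n_fragments)]


def short_fragment_fraction(lengths, cutoff=150.0):
    # One hypothetical fragment-length feature: share of fragments under cutoff.
    return sum(1 for x in lengths if x < cutoff) / len(lengths)


def run_demo(n_per_class=100, seed=0):
    # Score every simulated sample with the single feature and compute AUROC.
    rng = random.Random(seed)
    labels = [0] * n_per_class + [1] * n_per_class
    scores = [short_fragment_fraction(simulate_sample(rng, y == 1)) for y in labels]
    return auroc(scores, labels)


if __name__ == "__main__":
    print("AUROC on synthetic data:", round(run_demo(), 3))
```

An AUROC of 0.5 means the feature ranks cases and controls no better than chance, and 1.0 means it separates them perfectly, which is the scale on which the 0.78 and 0.84 figures should be read.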

Papers.

Peripheral cancer attenuates amyloid pathology in Alzheimer’s disease via cystatin-c activation of TREM2 [Li et al., Cell, January 2026]

Why it matters: Alzheimer’s disease therapies have largely focused on limiting the formation of amyloid plaques. However, once plaques are established, the brain has limited capacity to remove them, and most interventions show diminishing returns in symptomatic disease. At the same time, epidemiological studies have long ...