This piece exposes a chilling paradox at the heart of modern healthcare: an algorithm designed to save the most lives is systematically denying them to the young. Arvind Narayanan, writing for AI Snake Oil, dissects the UK's liver transplant matching system to reveal how a well-intentioned utilitarian calculation has devolved into a mechanism that effectively writes off patients under 45, regardless of how critically ill they are. For busy leaders in health policy, this is not just a technical glitch; it is a stark warning about the hidden costs of outsourcing life-and-death decisions to opaque models.
The Illusion of Optimization
The core of Narayanan's argument rests on the distinction between what an algorithm is told to predict and what we actually want it to achieve. The UK system, implemented in 2018, calculates a Transplant Benefit Score (TBS) by comparing a patient's predicted survival with a transplant against their predicted survival without one. On paper, this should favor younger patients who have more potential life years to gain. Narayanan writes, "Given this description, one would expect that the algorithm would favor younger patients, as they will potentially gain many more decades of life through a transplant compared to older patients." Yet, the reality is the exact opposite.
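To make that description concrete, here is a minimal sketch of the benefit calculation as summarized above; the function name and the toy survival figures are invented for illustration and are not taken from the actual TBS models:

```python
def transplant_benefit(years_with_transplant, years_without_transplant):
    """Simplified benefit score: predicted survival with a transplant
    minus predicted survival without one."""
    return years_with_transplant - years_without_transplant

# On this description, a young patient who stands to gain decades should
# easily outscore an older patient who can gain only a few years.
print(transplant_benefit(50.0, 1.0))  # hypothetical 20-year-old: 49.0
print(transplant_benefit(8.0, 1.0))   # hypothetical 80-year-old: 7.0
```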
The failure lies in a specific design choice: the model caps survival predictions at five years. Narayanan explains that this was likely a pragmatic decision due to data availability, but the consequence is catastrophic for fairness. "Capping survival at five years in effect diminishes the benefits for younger patients as it underestimates the gain in life years by predicting lifetime gain over 5 years, as opposed to the total lifetime gain." By truncating the timeline, the algorithm ignores the decades of life a 20-year-old stands to lose, while an 80-year-old's remaining years fit neatly within the five-year window. This is a classic "target-construct mismatch," where the metric being optimized diverges from the moral goal of saving the most life-years.
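Continuing the toy sketch above, a capped version shows how truncation at five years erases the younger patient's advantage; the numbers remain hypothetical and serve only to illustrate the truncation effect:

```python
def capped_benefit(years_with, years_without, horizon=5.0):
    """Benefit with survival on both sides truncated at a fixed horizon,
    mirroring the flaw the article describes."""
    return min(years_with, horizon) - min(years_without, horizon)

# Uncapped, the 20-year-old's advantage over the 80-year-old is enormous;
# with the five-year cap, both patients score an identical benefit of 4.0.
print(capped_benefit(50.0, 1.0))  # young patient: min(50, 5) - 1 = 4.0
print(capped_benefit(8.0, 1.0))   # older patient: min(8, 5)  - 1 = 4.0
```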
"The choice of a 5-year period seems to be because of data availability... In our experience, there is almost always some difficulty that prevents accurately measuring the true construct of interest, which is why this is one of the recurring flaws we identify in the Against Predictive Optimization paper."
Critics might argue that without long-term data, the system had to rely on what was available, and that any system is better than the chaos of human bias. However, Narayanan counters that the alternative was not chaos, but a simpler, transparent formula that did not systematically penalize youth. The complexity here is not a feature; it is a shield.
The Black Box and the Human Cost
The article moves from technical failure to human tragedy, centering on the experience of Sarah Meredith, a 31-year-old patient who discovered her fate was being decided by a score she couldn't see. Narayanan highlights the toxic combination of medical paternalism and algorithmic mystique. "Meredith was repeatedly told she wouldn't understand," he notes, illustrating how the system's opacity prevents accountability. The lack of a physician override or an appeals process means that a mathematical error becomes a death sentence.
The investigation reveals a disturbing consensus among medical professionals that the system is broken. Narayanan quotes Palak Trivedi, a consultant hepatologist, who states, "If you're below 45 years, no matter how ill, it is impossible for you to score high enough to be given priority scores on the list." This is not a theoretical risk; it is a documented exclusion. The algorithm effectively treats a young person's death as less tragic than an older person's, not because of the value of the life, but because of the arbitrary five-year horizon.
The piece also uncovers a second, bizarre flaw: the algorithm initially predicted that patients with cancer would survive longer without a transplant than comparable patients without cancer. Because a longer predicted untreated survival translates into a lower calculated benefit, cancer patients were rarely allocated livers. Narayanan describes this as "algorithmic absurdity," noting that the model learned from data in which cancer patients received better care, not that cancer itself was benign. "This finding is reminiscent of a well-known failure from a few decades ago wherein a model predicted that patients with asthma were at lower risk of developing complications from pneumonia," he writes, drawing a parallel to historical failures where correlation was mistaken for causation.
"The fact that there are two different sets of models may also explain why it went undetected for so long — the problem is not obvious from the regression coefficients and can only be detected by simulating a patient population."
This section underscores a critical vulnerability in predictive AI: the inability of stakeholders to interrogate the logic. While the model uses interpretable regression, the separation of patient groups into different models created a blind spot that only a simulation could reveal. Narayanan argues that this obscurity is dangerous, noting that the failure was buried in the comment-and-response section of an academic paper, "the academic equivalent of Wikipedia's Talk pages."
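A hedged sketch of the kind of check Narayanan describes: generate a synthetic patient population, run it through two separately fitted models, and compare the predictions across groups. The models, coefficients, and features below are invented for illustration; only the detection strategy (simulate, then compare) reflects the article:

```python
import random

# Hypothetical stand-ins for two separately fitted survival models
# (one per patient group); the coefficients are invented for illustration.
def predicted_survival_cancer(age, severity):
    return max(0.5, 6.0 - 0.02 * age - 0.5 * severity)

def predicted_survival_no_cancer(age, severity):
    return max(0.5, 5.0 - 0.03 * age - 0.8 * severity)

random.seed(0)
cancer_preds, non_cancer_preds = [], []
for _ in range(10_000):
    # Simulate a patient population with a plausible spread of covariates.
    age = random.uniform(20, 80)
    severity = random.uniform(0, 4)
    cancer_preds.append(predicted_survival_cancer(age, severity))
    non_cancer_preds.append(predicted_survival_no_cancer(age, severity))

# Neither model's coefficients look obviously wrong on their own, but
# comparing the simulated populations exposes the anomaly: the cancer
# model predicts longer average survival than the non-cancer model.
print(sum(cancer_preds) / len(cancer_preds))
print(sum(non_cancer_preds) / len(non_cancer_preds))
```

The design point is that neither toy model looks wrong in isolation; the absurdity only appears when the two sets of predictions are compared over the same simulated population.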
The Ethical Vacuum
Finally, Narayanan addresses the philosophical underpinnings of the system. The algorithm enforces a rigid utilitarianism that ignores concepts of deservingness or the societal impact of losing a young person. "Predictive logic bakes in a utilitarian worldview — the most good for the greatest number," he observes. This framework struggles to account for the intuition that a child or young adult, whose disease was not self-inflicted, deserves a different consideration than an older adult with lifestyle-related conditions.
The article suggests that the drive for efficiency has overshadowed the need for ethical nuance. "It seems at least plausible that this complexity is justified in this context because health outcomes are much more predictable than who will commit a crime," Narayanan concedes, but the evidence suggests the complexity is actually a liability. The system saves more lives overall than the previous regional system, but it does so by sacrificing the most vulnerable cohorts. This trade-off was never explicitly debated; it was simply baked into the code.
"The objective of the matching system is to identify the recipient whose life expectancy would be increased the most through the transplant. The obvious way to do this is to predict each patient's expected survival time with and without the transplant. This is almost what the algorithm does, but not quite."
Bottom Line
Arvind Narayanan delivers a devastating critique of the UK's liver allocation algorithm, showing that predictive models can be mathematically sound yet morally bankrupt when the target variable is misaligned with human values. The strongest part of the argument is the exposure of the "five-year cap" as the root cause of age discrimination, a flaw that persisted for years due to a lack of transparency and oversight. The biggest vulnerability in the current system is that data availability, rather than the construct that actually matters, dictated what the model optimizes, a trap that health systems globally must avoid. As governments and health agencies increasingly turn to AI for resource allocation, the lesson is clear: if you cannot explain why a young person is being denied a life-saving organ, the algorithm is broken, not the patient.