
AI existential risk probabilities are too unreliable to inform policy

Arvind Narayanan delivers a stunningly clear rebuke to the most common justification for sweeping AI regulation: the idea that we can reliably calculate the odds of human extinction. In an era where policymakers are being urged to act on probability estimates ranging from 0.25% to 12%, Narayanan argues that these numbers are not just uncertain—they are fundamentally illegitimate as a basis for democratic governance. For busy leaders trying to separate signal from noise, this essay cuts through the fog of speculation to ask a question rarely heard in the halls of power: on what empirical ground do we actually stand?

The Illusion of Precision

Narayanan begins by dismantling the authority of the numbers themselves. He writes, "Probabilities are usually derived from some grounded method, so we have a strong cognitive bias to view quantified risk estimates as more valid than qualitative ones." This observation is crucial because it exposes a psychological trap: the mere presence of a percentage sign tricks the brain into assuming rigor where none exists. The author points out that while a forecaster can casually guess odds for a horse race, a government cannot limit civil liberties based on a guess.


The core of the argument rests on the principle of democratic legitimacy. Narayanan asserts, "A core principle of liberal democracy is that the state should not limit people's freedom based on controversial beliefs that reasonable people can reject." When the executive branch considers restricting the open release of AI models—a move that would stifle innovation and concentrate power in the hands of a few large corporations—it must be able to justify that sacrifice to the public. As Narayanan puts it, "Justification is essential to legitimacy of government and the exercise of power."

Critics might argue that waiting for perfect evidence before acting on existential threats is a form of paralysis, akin to ignoring a fire because we can't calculate the exact probability of it spreading. However, Narayanan's point is not that we should ignore the risk, but that we cannot use unreliable numbers to justify costly, freedom-restricting policies. The distinction is vital for any administrator weighing the trade-offs between safety and openness.


The Failure of Inductive Reasoning

The essay then systematically dismantles the three ways forecasters claim to derive these probabilities, starting with induction. Inductive reasoning relies on a "reference class"—a group of similar past events used to predict the future. Narayanan explains that while insurers can predict car accidents by looking at past data for similar drivers, "For existential risk from AI, there is no reference class, as it is an event like no other."
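The mechanics of reference-class forecasting can be made concrete with a short sketch. The numbers below are hypothetical, invented only to illustrate the method: a probability estimate of this kind is nothing more than an observed frequency within a pool of comparable past cases, which is exactly the input that does not exist for AI extinction.

```python
# Illustrative sketch of reference-class (base-rate) forecasting.
# All figures are made up for illustration; they are not from the essay.

def base_rate(events: int, population: int) -> float:
    """Estimated probability = observed frequency within the reference class."""
    return events / population

# Hypothetical reference class: drivers with a similar age and driving record.
accidents_last_year = 1_200
similar_drivers = 100_000
p_accident = base_rate(accidents_last_year, similar_drivers)
print(f"P(accident) = {p_accident:.3f}")  # 0.012

# For AI extinction there is no comparable pool of past events:
# events = ?, population = ?. The method has no inputs,
# so it cannot produce a number.
```

The point of the sketch is structural: the insurer's probability is legitimate because both inputs are observable; remove the reference class and the formula has nothing to compute.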

He acknowledges that some experts try to force a fit, comparing AI extinction to animal extinction or the Industrial Revolution. But Narayanan dismisses these comparisons as futile, noting, "None of those tell us anything about the possibility of developing superintelligent AI or losing control over such AI." This is where the historical context of Black Swan theory becomes relevant. As Nassim Taleb's work on rare, high-impact events suggests, the very nature of a Black Swan is that it lies outside the realm of normal expectations. Narayanan effectively argues that trying to model AI risk using inductive methods is like trying to predict the collapse of the USSR using data from 19th-century monarchies; the similarity is too superficial to yield a useful number.

The author's critique of the "reference class" problem is particularly sharp because it highlights the arbitrariness of the current debate. If the choice of reference class comes down to an analyst's intuition, then the resulting probability is just a number dressed up in math. Narayanan writes, "The accuracy of the forecasts depends on the degree of similarity between the process that generates the event being forecast and the process that generated the events in the reference class." Since no such similarity exists for AI extinction, the math collapses.

The Limits of Deduction and the Trap of Subjectivity

Moving to deductive reasoning, Narayanan contrasts AI risk with asteroid impacts. We can calculate the probability of an extinction-level asteroid because we have a physical model of the solar system and a clear relationship between size and impact frequency. "With AI, the unknowns relate to technological progress and governance rather than a physical system, so it isn't clear how to model it mathematically," he notes.

When neither induction nor deduction works, forecasters resort to subjective probability—essentially, educated guesses. Narayanan exposes the chaos this creates by citing the Existential Risk Persuasion Tournament. The results were staggering: AI experts estimated a median extinction risk of 3%, while superforecasters estimated 0.38%. "The 75th percentile AI expert forecast and the 25th percentile superforecaster forecast differ by at least a factor of 100," Narayanan writes. This massive divergence proves that these are not scientific measurements but rather reflections of individual bias.
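The scale of that disagreement is easy to check from the medians the essay cites. Even before reaching the factor-of-100 gap between the 75th percentile expert and the 25th percentile superforecaster, the two groups' central estimates differ by roughly eightfold:

```python
# Median extinction-risk estimates from the Existential Risk Persuasion
# Tournament, as cited in the essay.
ai_expert_median = 0.03          # 3%
superforecaster_median = 0.0038  # 0.38%

ratio = ai_expert_median / superforecaster_median
print(f"Medians differ by a factor of {ratio:.1f}")  # ~7.9
```

For a physically grounded estimate like asteroid risk, trained forecasters would not disagree by nearly an order of magnitude at the median, let alone two orders at the quartiles.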

He argues that without a model, these forecasts are just "feelings dressed up as numbers." Even the most skilled forecasters, trained to minimize bias, cannot conjure accuracy out of thin air. Narayanan quotes philosopher Nick Bostrom to reinforce the point: "The uncertainty and error-proneness of our first-order assessments of risk is itself something we must factor into our all-things-considered probability assignments." This creates a paradox where acknowledging uncertainty makes the probability estimate itself a guess, rendering it useless for policy.

When forecasters aren't drawing on any special knowledge, evidence, or models, their hunches are no more credible than anyone else's.

Bottom Line

Narayanan's most powerful contribution is his insistence that speculation cannot drive policy. The strongest part of this argument is its grounding in democratic theory: you cannot restrict rights based on a number that varies by a factor of 100 depending on who you ask. The biggest vulnerability, however, is that this logic can be weaponized by those who wish to ignore the risk entirely; the numbers may be unreliable, but the underlying fear of loss of control is not necessarily baseless. The reader should watch how governments and regulatory bodies respond to this challenge: will they continue to cite shaky probabilities, or will they pivot to a policy framework built on concrete safety standards and governance mechanisms rather than speculative odds? The future of AI regulation depends on answering that question honestly.


Sources

AI existential risk probabilities are too unreliable to inform policy

by Arvind Narayanan · AI Snake Oil

How seriously should governments take the threat of existential risk from AI, given the lack of consensus among researchers? On the one hand, existential risks (x-risks) are necessarily somewhat speculative: by the time there is concrete evidence, it may be too late. On the other hand, governments must prioritize — after all, they don’t worry too much about x-risk from alien invasions.

This is the first in a series of essays laying out an evidence-based approach for policymakers concerned about AI x-risk, an approach that stays grounded in reality while acknowledging that there are “unknown unknowns”. 

In this first essay, we look at one type of evidence: probability estimates. The AI safety community relies heavily on forecasting the probability of human extinction due to AI (in a given timeframe) in order to inform decision making and policy. An estimate of 10% over a few decades, for example, would obviously be high enough for the issue to be a top priority for society. 

Our central claim is that AI x-risk forecasts are far too unreliable to be useful for policy, and in fact highly misleading.

Look behind the curtain.

If the two of us predicted an 80% probability of aliens landing on earth in the next ten years, would you take this possibility seriously? Of course not. You would ask to see our evidence. As obvious as this may seem, it seems to have been forgotten in the AI x-risk debate that probabilities carry no authority by themselves. Probabilities are usually derived from some grounded method, so we have a strong cognitive bias to view quantified risk estimates as more valid than qualitative ones. But it is possible for probabilities to be nothing more than guesses. Keep this in mind throughout this essay (and more broadly in the AI x-risk debate).

If we predicted odds for the Kentucky Derby, we wouldn't have to give you a reason — you could take it or leave it. But if a policymaker takes actions based on probabilities put forth by a forecaster, they had better be able to explain those probabilities to the public (and that explanation must in turn come from the forecaster). Justification is essential to legitimacy of government and the exercise of power. A core principle of liberal democracy is that the state should not limit people's freedom based on controversial beliefs that reasonable people can reject.

Explanation is especially important ...