
Three mistakes in “three mistakes in the moral mathematics of existential risk”

Most arguments against long-term thinking try to shrink the future until it looks manageable. Bentham's Bulldog flips the script, arguing that the only way to dismiss existential risk reduction is to assume the probability of a vast future is exactly zero, an assumption the author treats as indefensible and one that, once relaxed, collapses the critics' case. This isn't just a tweak to the numbers; it's a fundamental challenge to how we weigh the present against the infinite.

The Trap of Cumulative Risk

The piece begins by tackling the most intuitive objection: that over a billion years, the odds of survival become vanishingly small, rendering today's safety efforts futile. Bentham's Bulldog acknowledges David Thorstad's point that "actions that lower short-term existential risks by a little bit aren't hugely significant because the odds of them influencing whether humans survive for a very long time are low." However, the author dismantles this by introducing a crucial variable: the possibility of reaching a state where extinction is no longer a threat.


The argument hinges on a specific, non-trivial chance that humanity could master its survival. Bentham's Bulldog writes, "you should assign non-trivial credence to humans reaching a state where the odds of extinction per century are approximately zero." Even if this hope seems slim, the sheer scale of the potential future makes it mathematically decisive. The author notes that even if we reduce the value of risk reduction by a factor of 100 due to background risks, "it still easily swamps short-term interventions in expectation."

This reasoning is powerful because it refuses to let the average case dictate the outcome. It forces the reader to confront the idea that a tiny probability of an infinite future outweighs a high probability of a mediocre one. Critics might argue that assuming we can ever eliminate all risk is dangerously optimistic, but the author counters that assuming the probability is zero is even more unjustified. As Bentham's Bulldog puts it, "to get the mathematical model to cancel out the very large EV at stake, you have to have probability drop off proportional to value."
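To see the scale at work, here is a minimal back-of-the-envelope sketch in Python. Apart from the 10^52 figure cited in the article, every number below (the size of the risk reduction, the 100x background-risk discount, the short-term benchmark) is an illustrative stand-in rather than an estimate from either author.

```python
# Illustrative expected-value comparison; all inputs except the 10^52 figure
# are made-up stand-ins, not estimates from Bentham's Bulldog or Thorstad.

future_people = 1e52          # potential future lives cited in the article
risk_reduction = 1e-9         # hypothetical drop in extinction probability
background_discount = 100     # Thorstad-style penalty for background risk

ev_longterm = risk_reduction * future_people / background_discount
ev_shortterm = 1e7            # hypothetical lives saved by a large short-term program

print(f"Long-term EV after 100x discount: {ev_longterm:.1e}")   # 1.0e+41
print(f"Short-term EV:                    {ev_shortterm:.1e}")  # 1.0e+07

# To make the long-term term stop dominating, the probability factor would have
# to shrink by dozens of orders of magnitude (roughly in proportion to the value
# factor), which is the "probability drop off proportional to value" condition
# quoted above.
```

The same arithmetic underlies the demographic point in the next section: a 1% chance of the large-population scenario only removes two orders of magnitude from a number like 10^52.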

Even if you thought that a space civilization would inevitably kill itself after 10,000 years, the case for existential risk reduction would still be significant.

The Demographic Fallacy

The commentary then shifts to population dynamics, addressing the claim that future generations will naturally shrink as societies develop. Bentham's Bulldog concedes that standard demographic models predict decline but argues this doesn't matter for the core calculation. "Sure, maybe we won't try to create as many people as possible," the author admits, but immediately pivots to the magnitude of the numbers involved.

The logic here is stark: even if the odds of a massive population boom are only 1%, the sheer size of that potential boom renders the 99% scenario irrelevant in terms of expected value. Bentham's Bulldog writes, "if the odds are 1% that we do, then this only decreases the EV of existential risk reduction by two orders of magnitude. Next to numbers like 10^52, two orders of magnitude are nothing!" This is a crucial distinction. It suggests that policy should not be driven by the most likely demographic trend, but by the most impactful one.

The author remains skeptical that we will actually maximize population, noting, "I'm pretty skeptical that we will use space resources to maximize the number of happy minds." Yet, they insist that this skepticism doesn't change the math. The potential for immense value remains, even if the most probable future is smaller. This framing effectively neutralizes the demographic argument without needing to prove that a population explosion is inevitable.

The Stakes of the St. Petersburg Paradox

Perhaps the most striking section recontextualizes the debate using the St. Petersburg paradox, a classic problem in probability theory in which a game whose payout doubles with every coin flip has an infinite expected value, even though every possible payout is finite. Bentham's Bulldog draws a parallel to the far future, suggesting that even a minuscule chance of an incomprehensibly large future dominates all calculations.

The author illustrates this with a hypothetical: "maybe we come up with some system that produces exponential growth with respect to happy minds relative to resources input." Even if the odds of this are "1 in a trillion," the expected value skyrockets. Bentham's Bulldog writes, "if the odds of this are non-zero, the expected number of future happy minds is infinity." This connects directly to historical debates on the paradox, where philosophers like Michael Huemer have argued that probabilities must drop off faster than payouts to avoid infinite values. Bentham's Bulldog rejects this, arguing that assuming a probability of zero for such scenarios is a logical error.
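The structure of that claim is easiest to see in the textbook version of the paradox. The sketch below is just the standard toy game, not a model from the article: the pot doubles every time a fair coin lands heads, so each possible round contributes the same fixed amount to the expectation and the sum grows without bound. Probabilities that merely halve while payouts double never "drop off faster than payouts."

```python
# Textbook St. Petersburg game: the pot starts at 1, doubles on each heads,
# and you collect at the first tails. Round k happens with probability 1/2^k
# and pays 2^(k-1).

def truncated_expected_value(rounds: int) -> float:
    """Expected payout if the game is cut off after `rounds` flips."""
    return sum((0.5 ** k) * (2.0 ** (k - 1)) for k in range(1, rounds + 1))

for rounds in (10, 100, 1000):
    print(rounds, truncated_expected_value(rounds))
# 10 5.0, 100 50.0, 1000 500.0 -- every extra round adds 0.5, so the full sum diverges
```

Bentham's Bulldog's move is to put the burden on the critic to show that the probabilities of ever-larger futures fall off faster than this, rather than simply assuming a hard zero.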

The piece argues that "there is some chance of bringing about incomprehensible quantities of value—next to which 10^52 happy minds looks like nothing." This is a bold claim. It suggests that our current estimates of the future are not just low, but potentially irrelevant if we fail to account for the tail risks of the distribution. The author concludes that "Longtermist considerations remain decisive," because the potential upside is too vast to ignore.

The core intuition is that there's some chance of bringing about incomprehensible quantities of value—next to which 10^52 happy minds looks like nothing.

Bottom Line

Bentham's Bulldog delivers a rigorous defense of long-termism by exposing the mathematical fragility of its critics' objections. The strongest part of the argument is the refusal to treat low-probability, high-impact scenarios as negligible; instead, it treats them as the defining factor in moral calculus. However, the piece's biggest vulnerability lies in its reliance on the assumption that the probability of an infinite future is non-zero—a premise that, if false, collapses the entire edifice. Readers should watch for how this debate evolves as we gain more data on technological trajectories and demographic trends, but for now, the math suggests we cannot afford to be complacent.

Deep Dives

Explore these related deep dives:

  • Pascal's mugging

    The article grapples with extremely large expected values and tiny probabilities, the core structure of Pascal's mugging. Understanding this thought experiment illuminates the philosophical tensions in longtermist arguments about existential risk.

  • Fermi paradox

    The article discusses humanity's potential to expand into space and create 10^52 people. The Fermi paradox directly challenges whether such cosmic-scale civilizations are achievable, providing essential context for evaluating longtermist population estimates.

  • St. Petersburg paradox

    Referenced directly in the article, this classic paradox about infinite expected values is foundational to understanding why some reject fanaticism about large numbers in moral mathematics.

Sources

Three mistakes in “three mistakes in the moral mathematics of existential risk”

by Bentham's Bulldog

David Thorstad is one of the more interesting critics of effective altruism. Unlike some, his objections are consistently thoughtful and interesting, and he’s against malaria, rather than against efforts to do something about it. Thorstad wrote a paper titled Three mistakes in the moral mathematics of existential risk, in which he argues that when one corrects for a few errors, the case for existential threat reduction becomes a lot shakier. I disagree, and so I thought it would be worth responding to his paper.

The basic argument that Thorstad is addressing is pretty simple. The future could have a very large number of people living awesome lives (e.g. 10^52 according to one estimate). However, if we go extinct then it won’t have any people. Thus, reducing risks of extinction by even a small amount increases the number of well-off people in the far future by a staggering amount, swamping all other values in terms of utility.

1 Cumulative and background risk.

The first error Thorstad discusses is neglect for cumulative risk. The standard explanation of why the future will have extremely huge numbers of people is that it could last a very long time. But it lasting a very long time means that there are many more opportunities for it to be destroyed.

In a billion years, there are a million 1,000-year time slices. So even if the odds of going extinct per thousand years were .001%, the odds we’d survive for a billion years would only be 0.004539765%. Actions that lower short-term existential risks by a little bit aren’t hugely significant because the odds of them influencing whether humans survive for a very long time are low.
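As a quick check (reading "0.001%" as an extinction probability of 10^-5 per 1,000-year slice), the quoted survival figure can be reproduced directly:

```python
# Reproducing the cumulative-risk arithmetic quoted above.
per_slice_extinction = 1e-5      # 0.001% chance of extinction per 1,000-year slice
slices = 1_000_000               # a billion years in 1,000-year slices

survival = (1 - per_slice_extinction) ** slices
print(f"{survival:.6%}")         # 0.004540% -- i.e. the ~0.0045% figure quoted above
```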

Thorstad similarly claims that models of existential risk reduction don’t take into account background risk odds—the odds that we’d go extinct from something else if not this. If you think existential risks are significant this century, which you have to in order to think that it’s worth working on them, then probably they’re significant in other centuries, so we’re doomed anyways!

I think there are two crucial problems with this.

The first and biggest one: you should assign non-trivial credence to humans reaching a state where the odds of extinction per century are approximately zero. The odds are not trivial that if we get very advanced AI, we’ll basically eliminate any possibility of human extinction for billions of years. You shouldn’t just ...