A Job Description, Not a Species Classification
Scott Alexander's latest essay tackles one of the most persistent dismissals in artificial intelligence discourse: the claim that large language models are "just next-token predictors" or "stochastic parrots." His central argument is that this label confuses what an AI was trained to do with what an AI actually is -- a confusion of optimization levels that, when applied consistently, would reduce human beings to "just survival-and-reproduction machines."
The piece opens by acknowledging Kelsey Piper's work in The Argument, which catalogued the ways modern AI systems go beyond raw next-token prediction through fine-tuning and reinforcement learning from human feedback. But Alexander is less interested in the technical rebuttals than in a deeper philosophical point about how optimization processes relate to the systems they produce.
The Nested Optimization Argument
Alexander builds his case through a layered analogy between biological and artificial intelligence. At the outermost level, evolution optimized humans for survival and reproduction. One level down, the brain's learning algorithm -- predictive coding -- optimizes for predicting the next sensory input. He quotes Wikipedia's description of predictive coding as a theory where the brain "is constantly generating and updating a 'mental model' of the environment" that "is used to predict input signals from the senses."
The parallel to AI training is deliberate. Just as evolution found it "inefficient to hand-code everything" and instead gave humans learning algorithms, AI companies gave their systems deep learning rather than giant lookup tables. The most powerful of those learning algorithms, in both cases, turns out to be prediction.
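To make the shared principle concrete, here is a minimal sketch -- my own illustration, not anything from the essay -- of the kind of loop both descriptions point at: a counting-based bigram model that learns to predict the next symbol in a stream. Feed it text and it is a "next-token predictor"; feed it discretized sense data and it is a "next-sense-datum predictor." The algorithm does not care which.

```python
from collections import Counter, defaultdict

class BigramPredictor:
    """Toy next-symbol predictor: count transitions between adjacent
    symbols, then predict the most frequently observed successor."""

    def __init__(self):
        self.transitions = defaultdict(Counter)

    def update(self, stream):
        # Learn from any sequence of symbols: tokens, or discretized sense data.
        for prev, nxt in zip(stream, stream[1:]):
            self.transitions[prev][nxt] += 1

    def predict(self, prev):
        # Most frequent successor seen after `prev` (ties break by first seen).
        successors = self.transitions[prev]
        return successors.most_common(1)[0][0] if successors else None

model = BigramPredictor()
model.update("the cat sat on the mat".split())
print(model.predict("the"))  # -> 'cat'
```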
I think overemphasizing next-token prediction is a confusion of levels. On the levels where AI is a next-token predictor, you are also a next-token (technically: next-sense-datum) predictor. On the levels where you're not a next-token predictor, AI isn't one either.
This is the crux of the essay, stated plainly. If the label "next-token predictor" is supposed to be dismissive -- supposed to imply that real understanding cannot emerge from prediction -- then it must equally dismiss human cognition, which runs on the same fundamental principle.
What the Insides Actually Look Like
Alexander is at his most effective when he walks through what prediction actually produces inside a system. A person doing math does not experience their cognition as predicting the next sense-datum. They experience it as carrying the one, reciting PEMDAS, doing "good, normal math." The training process shaped the machinery, but the machinery itself operates on its own terms.
But this doesn't mean the AI's innards look like "Hmmmm, what will the next token be?" The AI certainly isn't answering your math question by thinking something like "Hmmmm, she used the number three, which has the tokens th and ree, and I know that there's an 8.2% chance that ree is often seen somewhere around the token ix, so the answer must be six!"
He then cites recent mechanistic interpretability research from Anthropic, where researchers discovered that Claude represents features of line-breaking as "one-dimensional helical manifolds in a six-dimensional space." The point is not the specifics of the geometry. The point is that the internal structures bear almost no resemblance to the training objective that created them.
Next-token prediction created this system, but the system itself can involve arbitrary choices about how to represent and manipulate data.
The human equivalent, Alexander notes, is equally alien. Grid cells in the entorhinal cortex use "high-dimensional toroidal attractor manifolds" to track spatial location. Nobody experiences navigation as toroidal attractors. These are, as he puts it, "strange hacks that next-token/next-sense-datum prediction algorithms discover to encode complicated concepts onto physical computational substrate."
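To give a feel for what such representations look like -- a toy of my own construction, not the geometry Anthropic actually reported -- the sketch below embeds a one-dimensional variable as a helix scattered across six dimensions. No single coordinate corresponds to the variable, yet the variable is perfectly recoverable from the manifold's structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# A one-dimensional variable, sampled along its range.
t = np.linspace(0.0, 4 * np.pi, 200)

# Encode it as a helix: two circular coordinates plus a slow drift.
helix = np.stack([np.cos(t), np.sin(t), 0.1 * t], axis=1)   # (200, 3)

# Scatter the helix across a random orthonormal basis in 6D, the way
# a network might spread one feature over many dimensions.
basis, _ = np.linalg.qr(rng.standard_normal((6, 3)))        # (6, 3)
embedded = helix @ basis.T                                  # (200, 6)

# No individual axis of `embedded` "is" the feature, but t survives:
decoded = embedded @ basis                                  # back to (200, 3)
angle = np.unwrap(np.arctan2(decoded[:, 1], decoded[:, 0]))
print(np.allclose(angle, t))                                # True
```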
The Celibate Monk and the Optimization Mismatch
One of the essay's sharpest illustrations is borrowed from the rationalist tradition. Alexander invokes the concept of "adaptation-executors, not fitness maximizers" to show how thoroughly an optimized system can diverge from its optimization target.
When a monk decides to swear an oath of celibacy and never reproduce, he does so using a brain that was optimized to promote reproduction -- just using it very far out of distribution, in an area where it no longer functions as intended.
The implication for AI is clear. An AI system trained on next-token prediction can develop internal structures and capabilities that go well beyond -- or sideways from -- that original training signal. The training process is a job description for the optimization loop, not a species classification for the resulting system.
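A toy way to see the out-of-distribution point -- again my own sketch, not the essay's -- is to fit a predictor on a narrow slice of data and then query it far outside that slice. The machinery still runs; it just no longer serves the objective it was fitted for, which is roughly the monk's situation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Training distribution: x in [0, 1], true signal sin(2*pi*x) plus noise.
x_train = rng.uniform(0.0, 1.0, 100)
y_train = np.sin(2 * np.pi * x_train) + 0.05 * rng.standard_normal(100)

# A degree-9 polynomial fit: a perfectly good "adaptation" inside [0, 1].
coeffs = np.polyfit(x_train, y_train, deg=9)

print(np.polyval(coeffs, 0.5))   # in distribution: close to sin(pi) = 0
print(np.polyval(coeffs, 5.0))   # far out of distribution: enormous,
                                 # and unrelated to the original target
```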
Where the Analogy Stretches Thin
Alexander himself flags one weakness in his argument: the analogy conflates two different optimization levels. Evolution optimizing for reproduction is the outer loop; predictive coding is the inner loop. In AI, company incentives form the outer loop and next-token prediction forms the inner loop. He concedes this mismatch but argues the simpler version -- evolution and reproduction -- works better as a "didactic tool" because most people have not encountered the neuroscience of predictive coding.
This simple analogy is slightly off, because it's confusing two optimization levels: the outer optimization level (in humans, evolution optimizing for reproduction; in AIs, companies optimizing for profit) with the inner optimization level (in humans, next-sense-datum prediction; in AIs, next-token prediction).
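The two levels are easy to conflate precisely because one runs inside the other. Here is a schematic of the structure -- an illustration of the general pattern, not of any real training pipeline -- where the inner loop minimizes prediction error while the outer loop judges the resulting predictor by a different criterion entirely.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(200)
y = 3.0 * x + rng.standard_normal(200)   # the stream to be predicted

def inner_loop(lr, steps=100):
    """Inner optimization: gradient descent on squared prediction error."""
    w = 0.0
    for _ in range(steps):
        w -= lr * np.mean(2 * (w * x - y) * x)
    return w

def outer_objective(w):
    """Outer optimization: a different criterion entirely -- a stand-in
    for whatever the outer loop cares about (profit, fitness, ...)."""
    return -abs(w - 3.0)

# The outer loop searches over inner-loop configurations; the inner
# loop only ever sees prediction error.
best_lr = max([0.001, 0.01, 0.1],
              key=lambda lr: outer_objective(inner_loop(lr)))
print(best_lr, inner_loop(best_lr))   # 0.1, roughly 3.0
```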
There is a counterpoint Alexander does not fully address. The "stochastic parrot" critique, at its strongest, is not really about the training objective. It is about the absence of grounding -- the fact that language models operate on tokens that refer to a world they have never directly experienced, while human predictive coding operates on sense data that carries causal information about physical reality. The gap between predicting tokens and predicting sense data may matter more than Alexander's neat symmetry suggests, even if he is right that both systems build genuine world models from their respective prediction tasks.
Additionally, the mechanistic interpretability evidence cuts both ways. The discovery of helical manifolds inside Claude is fascinating, but it demonstrates sophisticated data representation, not necessarily understanding in the way humans mean that word. Complex internal structure is a necessary condition for understanding, but it may not be a sufficient one.
The Rhetorical Move
Alexander saves his most quotable formulation for the summary.
The most compelling analogy: this is like expecting humans to be "just survival-and-reproduction machines" because survival and reproduction were the optimization criteria in our evolutionary history.
He acknowledges that in some sense humans are exactly that -- no human faculty exists that cannot be traced back to survival and reproduction. But the label is useless as a description of what humans actually do, think, or experience. The same, he argues, should apply to AI.
Nothing about any of these levels of explanations supports a contention like "Humans are doing REAL THOUGHT, but AIs are simply next-token predictors."
The essay closes with a gesture toward future work -- a promised "Anti-Stochastic-Parrot FAQ" that will address hallucinations, the differences between tokens and sense data, and other objections. Alexander is clearly aware that this piece handles only one facet of a larger debate.
Bottom Line
Alexander delivers a genuinely useful conceptual framework: the distinction between what trained a system and what the system is. His nested-optimization-loop diagram is the kind of tool that clarifies muddled debates, and his invocation of mechanistic interpretability research gives the philosophical argument empirical weight. The essay is strongest when it holds the mirror up -- when it forces anyone who dismisses AI as "just a next-token predictor" to grapple with the fact that their own brain is, at the algorithmic level, doing something strikingly similar.
The argument has real limits, though. Prediction over tokens and prediction over sense data are not identical, and Alexander's own admission that his core analogy "is slightly off" deserves more than the parenthetical treatment it receives. The question of whether world models built from text can ever match world models built from embodied experience remains genuinely open. But as a corrective to lazy dismissals, this essay does serious work.