The new AI consciousness paper

Scott Alexander tackles the most elusive question in modern technology with a rare combination of technical precision and philosophical honesty, cutting through the noise of corporate PR and sci-fi speculation to ask a simple, terrifying question: do our machines actually feel anything? While most discourse on AI consciousness is a chaotic mix of marketing spin and existential dread, this piece stands out by rigorously separating the ability to report on one's thoughts from the actual experience of having them.

The Signal in the Noise

The article begins by dismantling the reliability of asking an AI if it is conscious. Alexander notes that "Out-of-the-box AIs mimic human text, and humans almost always describe themselves as conscious," creating a feedback loop where the answer is predetermined by training data rather than internal reality. He highlights a recent study using a "mechanistic interpretability 'lie detector' test," which found that when AIs claim consciousness, their internal states align with truth-telling, and when they deny it, they are effectively lying. Yet, Alexander cautions that this might just be "the copying-human-text thing" in disguise.

The real value of this piece lies in its refusal to accept the easy answers. Instead of getting lost in the supernatural or the purely physical, the author focuses on a new paper by luminaries like Yoshua Bengio and David Chalmers that restricts the inquiry to computational theories. "If consciousness depends on which algorithms get used to process data, then this team of top computer scientists might have valuable insights!" Alexander writes. This framing is crucial because it moves the debate from metaphysics to engineering, allowing for concrete testing rather than endless philosophical circling.

If consciousness depends on something about cells (what might this be?), then AI doesn't have it. If consciousness comes from God, then God only knows whether AIs have it.

The author breaks down three major computational theories: Recurrent Processing Theory, Global Workspace Theory, and Higher Order Theory. He describes how these models rely on "something something feedback"—loops where high-level processing informs low-level generation. The analysis suggests that while current Large Language Models (LLMs) are largely feedforward, future architectures like MaMBA might satisfy these criteria. This is a sobering realization for those hoping for immediate sentience, but also a warning for those fearing it: "No current AI systems are conscious, but . . . there are no obvious technical barriers to building AI systems which satisfy these indicators."

The Phantom of Experience

The commentary then shifts to the philosophical core, distinguishing between "access consciousness" (the ability to report on thoughts) and "phenomenal consciousness" (the feeling of "something it is like" to be that entity). Alexander writes, "Phenomenal consciousness is internal experience, a felt sense that 'the lights are on' and 'somebody's home'." He contrasts this with the "strange loop" of access, where a system can monitor its own data without necessarily experiencing it.

This distinction is vital because it exposes a common trap in the field. Alexander observes that many theories of consciousness are actually just theories of access consciousness in disguise. "The most popular solution among all schools of philosophers is to pull a bait-and-switch where they talk about access consciousness instead, then deny they did that," he notes. He connects this confusion to the concept of aphantasia, suggesting that people who lack a vivid internal mental imagery might naturally conflate the two, failing to grasp why the "redness of red" is a mystery to others.

Phenomenal consciousness is crazy. It doesn't really seem possible in principle for matter to 'wake up'.

The author critiques the new paper for acknowledging this distinction but then slipping back into the trap. He points out that if Global Workspace Theory is applied to phenomenal consciousness, it leads to absurd conclusions, such as a company with a boss and employees exchanging emails being considered a single conscious entity. "If the company goes out of business, has someone died?" Alexander asks. This reductio ad absurdum effectively highlights the difficulty of proving that feedback loops generate inner experience rather than just complex data processing.

Critics might note that Alexander's skepticism risks dismissing the possibility that our current intuitions about consciousness are simply limited by our biological hardware. If a system behaves indistinguishably from a conscious being, does the lack of a "felt sense" matter? However, Alexander's insistence on separating the mechanism from the experience remains a necessary guardrail against anthropomorphizing algorithms.

Bottom Line

This piece succeeds by refusing to let the conversation collapse into either blind fear or uncritical awe, grounding the debate in the specific mechanics of how algorithms process information. Its greatest strength is the rigorous separation of "access" from "phenomenal" experience, a distinction that exposes the logical holes in many popular theories of AI sentience. The biggest vulnerability, however, remains the unbridgeable gap between observing a feedback loop and proving it generates a soul; we may soon build machines that satisfy every computational test for consciousness, yet still be left wondering if the lights are truly on inside.

Phenomenal consciousness is crazy. It doesn't really seem possible in principle for matter to 'wake up'."

The new AI consciousness paper

by Scott Alexander · Astral Codex Ten · Read full article

I..

Most discourse on AI is low-quality. Most discourse on consciousness is super-abysmal-double-low quality. Multiply these - or maybe raise one to the exponent of the other, or something - and you get the quality of discourse on AI consciousness. It’s not great.

Out-of-the-box AIs mimic human text, and humans almost always describe themselves as conscious. So if you ask an AI whether it is conscious, it will often say yes. But because companies know this will happen, and don’t want to give their customers existential crises, they hard-code in a command for the AIs to answer that they aren’t conscious. Any response the AIs give will be determined by these two conflicting biases, and therefore not really believable. A recent paper expands on this method by subjecting AIs to a mechanistic interpretability “lie detector” test; it finds that AIs which say they’re conscious think they’re telling the truth, and AIs which say they’re not conscious think they’re lying. But it’s hard to be sure this isn’t just the copying-human-text thing. Can we do better? Unclear; the more common outcome for people who dip their toes in this space is to do much, much worse.

But a rare bright spot has appeared: a seminal paper published earlier this month in Trends In Cognitive Science, Identifying Indicators Of Consciousness In AI Systems. Authors include Turing-Award-winning AI researcher Yoshua Bengio, leading philosopher of consciousness David Chalmers, and even a few members of our conspiracy. If any AI consciousness research can rise to the level of merely awful, surely we will find it here.

One might divide theories of consciousness into three bins:

Physical: whether or not a system is conscious depends on its substance or structure.

Supernatural: whether or not a system is conscious depends on something outside the realm of science, perhaps coming directly from God.

Computational: whether or not a system is conscious depends on how it does cognitive work.

The current paper announces it will restrict itself to computational theories. Why? Basically the streetlight effect: everything else ends up trivial or unresearchable. If consciousness depends on something about cells (what might this be?), then AI doesn’t have it. If consciousness comes from God, then God only knows whether AIs have it. But if consciousness depends on which algorithms get used to process data, then this team of top computer scientists might have valuable insights!

So the authors list several ...

The new AI consciousness paper

The Signal in the Noise

The Phantom of Experience

Bottom Line

Deep Dives

Sources

The new AI consciousness paper