What if the biggest problem with artificial intelligence isn't how smart it is, but how hard it is to teach? Dwarkesh Patel spent 100 hours trying to build LLM tools for his post-production workflow and found something unexpected: current AI models are magical in their capabilities, but they can't learn from feedback the way humans do. In this piece, he makes a case that's been missing from most AI coverage, explaining why today's systems hit a fundamental wall that no amount of scaling can solve.

## The Magical Limitation
Dwarkesh Patel has spent countless hours experimenting with large language models, trying to build tools for his post-production setup. He uses them to rewrite autogenerated transcripts, identify clips from transcripts, and co-write essays passage by passage.
These are simple, self-contained tasks that should be central to an LLM's repertoire—and they're impressive. But here's the fundamental problem: LLMs don't get better over time the way a human would.
This isn't a minor inconvenience. It's a massive bottleneck. The baseline capability of these models might be higher than an average human on many tasks, but there's no way to give them high-quality feedback and let them improve. You're stuck with whatever abilities come out of the box. You can keep messing around with system prompts, but in practice, this doesn't produce anything close to the kind of learning and improvement that human employees experience.
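To make the "messing around with system prompts" point concrete, here is a minimal sketch of the only feedback channel available today: folding every past human correction into the next run's system prompt. The function name, base prompt, and note format are hypothetical illustrations, not any real API; the point is that the prompt grows but the model's weights never change.

```python
# Minimal sketch (hypothetical names): the only way to "teach" an LLM
# between runs is to accumulate human feedback into its system prompt.

BASE_PROMPT = "You rewrite autogenerated transcripts into clean prose."

def build_prompt(base: str, feedback_notes: list[str]) -> str:
    """Fold every past correction into the next run's system prompt."""
    if not feedback_notes:
        return base
    notes = "\n".join(f"- {note}" for note in feedback_notes)
    return f"{base}\n\nPast corrections to respect:\n{notes}"

# Each round of feedback just grows the prompt; nothing is truly
# "learned" by the model itself.
notes = []
notes.append("Keep filler words out of pull quotes.")
notes.append("Preserve the speaker's phrasing in technical passages.")
prompt = build_prompt(BASE_PROMPT, notes)
```

This is exactly the saxophone-instructions scheme Patel describes: each "student" (each fresh model invocation) reads the accumulated notes and starts over from scratch.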
## Why Humans Are Useful: It's Not Raw Intellect
The reason humans are so useful isn't mainly their raw intellect—it's their ability to build up context, interrogate their own failures, and pick up small improvements as they practice a task.
Patel uses a vivid analogy. How would you teach a kid to play the saxophone? You'd have them try to blow into it, see how it sounds, and adjust. Now imagine teaching saxophone differently: a student takes one attempt, and the moment they make a mistake, you send them away and write detailed instructions about what went wrong. You call in the next student who reads your notes and tries again.
This just wouldn't work. No matter how well-honed your instructions are, no kid is going to learn to play the saxophone from reading them.
But this is the only modality we have to teach LLMs anything.
Yes, there's reinforcement learning fine-tuning, but it's not a deliberate adaptive process in the way that human learning is. Patel notes that his editors got extremely good not because he built bespoke RL environments for their different subtasks—they simply noticed small things themselves and thought hard about what resonates with the audience.
## What About Computer Use Agents?
Researchers from Anthropic have suggested that reliable computer use agents should arrive by the end of next year: you could tell an AI to go do your taxes, and it would read through your emails, Amazon orders, and Slack messages, compile receipts, decide what counts as a business expense, ask for approval on edge cases, and submit Form 1040 to the IRS. Patel is skeptical, for three reasons.
1. As horizon lengths increase, rollouts have to become longer. The AI needs to do two hours' worth of agentic computer use before we can even see whether it did the task right. Computer use also requires processing images and video, which is more computationally intensive than text.
2. We don't have a large pre-training corpus of multimodal computer use data. There's no shortage of internet text for training language models, but making models reliable, competent agents in other domains requires kinds of data that simply aren't available.
3. Even algorithmic innovations that seem simple in retrospect took a long time to iron out. The RL procedure described in DeepSeek's R1 paper looks simple at a high level, yet it took two years to get from GPT-4 to the release of o1.
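The first reason can be illustrated with a toy calculation. The numbers below are made up for illustration; the structural point is that in RL the reward arrives only after a full rollout, so the compute spent per training signal scales with episode length (and per-step cost rises further once images and video are involved).

```python
# Toy illustration (made-up numbers) of why longer horizons make RL
# expensive: feedback arrives only after a full rollout, so the cost
# of a single training signal scales with episode length.

def cost_per_reward_signal(steps_per_episode: int, cost_per_step: float) -> float:
    """Compute spent before we learn whether the episode succeeded."""
    return steps_per_episode * cost_per_step

# A short self-contained task vs. a two-hour computer-use task where
# each step also processes screenshots or video frames.
short_task = cost_per_reward_signal(steps_per_episode=50, cost_per_step=1.0)
long_task = cost_per_reward_signal(steps_per_episode=7200, cost_per_step=3.0)

ratio = long_task / short_task
```

Under these assumed numbers, each reward signal for the long-horizon task costs hundreds of times more compute than for the short one, before counting retries on failed episodes.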
## His Timelines
Patel doesn't expect to see some OpenAI livestream where they announce that continual learning has been totally solved, and he's not convinced the end-of-next-year timeline for computer use agents is realistic.
He offers his own timeline estimates:
For an AI that can do taxes end-to-end for a small business as well as a competent general manager could in a week, including tracking down receipts on different websites, finding missing pieces, emailing back and forth with people, filling out forms, and sending it all to the IRS: he estimates around 2028.
For AI that can learn on the job as easily, organically, and seamlessly as humans do, across any white-collar work (for example, an AI video editor that, six months after being hired, has as much actionable, deep understanding of his preferences, his channel, and what works for the audience as a human would): he estimates 2032.
## The Discontinuity That Could Change Everything
While this makes Patel bearish about transformative AI in the next few years, it makes him especially bullish on AI over the next decades. When we do solve continual learning, we'll see a huge discontinuity in the value of these models.
Even if there isn't a software-only singularity in which these models rapidly build ever-smarter successor systems, we might still get something that looks like a broadly deployed intelligence explosion: AIs deployed across the economy, doing different kinds of jobs and learning while doing them, the way humans can.
However, unlike humans, these models can amalgamate their learnings across all their copies. So one AI is basically learning how to do every single job in the economy.
An AI capable of this kind of online learning might rapidly become a superintelligence even if there's no further algorithmic progress.
> "One AI that could learn like this would basically be learning how to do every single job in the economy."
Critics might note that this view underestimates how quickly RL fine-tuning and other techniques have advanced. Some researchers argue these methods are already producing meaningful improvements in sample efficiency, possibly reducing the data hunger he worries about. Others contend that computer use demonstrations show genuine progress even if end-to-end tax completion isn't fully reliable yet.
## Bottom Line
Patel's strongest argument is that continual learning—the ability to build context and improve from feedback—is a fundamental wall that current AI architectures can't easily cross. His biggest vulnerability is that he may be underestimating how quickly reinforcement learning fine-tuning and other techniques are advancing. The technology landscape shifts fast, and the bottleneck he's worried about might prove more solvable than he expects.
What readers should watch for: any breakthrough in giving AI systems persistent memory across sessions, or new approaches to high-quality feedback loops that let models improve between tasks. The gap between current capabilities and human-like learning is real; whether it takes seven years or two to close is where the debate lives.