What if the biggest problem with artificial intelligence isn't how smart it is, but how hard it is to teach? Dwarkesh Patel spent 100 hours trying to build LLM tools for his post-production workflow and found something unexpected: current AI models are magical in their capabilities, but they can't learn from feedback the way humans do. In this piece, he makes a case that's been missing from most AI coverage, explaining why today's systems hit a fundamental wall that no amount of scaling can solve.

## The Magical Limitation
Dwarkesh Patel has spent countless hours experimenting with large language models, trying to build tools for his post-production setup. He uses them to rewrite autogenerated transcripts, identify clips from transcripts, and co-write essays passage by passage.
These are simple, self-contained tasks that should be central to an LLM's repertoire—and they're impressive. But here's the fundamental problem: LLMs don't get better over time the way a human would.
This isn't a minor inconvenience. It's a massive bottleneck. The baseline capability of these models might be higher than an average human on many tasks, but there's no way to give them high-quality feedback and let them improve. You're stuck with whatever abilities come out of the box. You can keep messing around with system prompts, but in practice, this doesn't produce anything close to the kind of learning and improvement that human employees experience.
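To make the "messing around with system prompts" point concrete, here is a minimal sketch of the only feedback channel available today: folding every past human correction into the next run's system prompt. The function name, base prompt, and note format are hypothetical illustrations, not any real API; the point is that the prompt grows but the model's weights never change.

```python
# Minimal sketch (hypothetical names): the only way to "teach" an LLM
# between runs is to accumulate human feedback into its system prompt.

BASE_PROMPT = "You rewrite autogenerated transcripts into clean prose."

def build_prompt(base: str, feedback_notes: list[str]) -> str:
    """Fold every past correction into the next run's system prompt."""
    if not feedback_notes:
        return base
    notes = "\n".join(f"- {note}" for note in feedback_notes)
    return f"{base}\n\nPast corrections to respect:\n{notes}"

# Each round of feedback just grows the prompt; nothing is truly
# "learned" by the model itself.
notes = []
notes.append("Keep filler words out of pull quotes.")
notes.append("Preserve the speaker's phrasing in technical passages.")
prompt = build_prompt(BASE_PROMPT, notes)
```

This is exactly the saxophone-instructions scheme Patel describes: each "student" (each fresh model invocation) reads the accumulated notes and starts over from scratch.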
## Why Humans Are Useful: It's Not Raw Intellect
The reason humans are so useful isn't mainly their raw intellect—it's their ability to build up context, interrogate their own failures, and pick up small improvements as they practice a task.
Patel uses a vivid analogy. How would you teach a kid to play the saxophone? You'd have them try to blow into it, see how it sounds, and adjust. Now imagine teaching saxophone differently: a student takes one attempt, and the moment they make a mistake, you send them away and write detailed instructions about what went wrong. You call in the next student who reads your notes and tries again.
This just wouldn't work. No matter how well-honed your instructions are, no kid is going to learn to play the saxophone from reading them.
But this is the only modality we have to teach LLMs anything.
Yes, there's reinforcement learning fine-tuning, but it's not a deliberate adaptive process in the way that human learning is. Patel notes that his editors got extremely good not because he built bespoke RL environments for their different subtasks—they simply noticed small things themselves and thought hard about what resonates with the audience.
## What About Computer Use Agents?
Researchers from Anthropic have suggested that reliable computer use agents should arrive by the end of next year: you could tell an AI to go do your taxes, and it would read through your emails, Amazon orders, and Slack messages, compile receipts, decide what counts as a business expense, ask for approval on edge cases, and submit Form 1040 to the IRS. Patel is skeptical, for three reasons.
1. As horizon lengths increase, rollouts have to become longer. The AI needs to do two hours' worth of agentic computer use before we can even see whether it did the task right. Computer use also requires processing images and video, which is more computationally intensive than text.
2. We don't have a large pre-training corpus of multimodal computer use data. There's no shortage of internet text for training language models, but making models reliable, competent agents in other domains requires kinds of data that simply aren't available.
3. Even algorithmic innovations that seem simple in retrospect took a long time to iron out. The RL procedure described in DeepSeek's R1 paper looks simple at a high level, yet it took two years to get from GPT-4 to the release of o1.
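The first reason can be illustrated with a toy calculation. The numbers below are made up for illustration; the structural point is that in RL the reward arrives only after a full rollout, so the compute spent per training signal scales with episode length (and per-step cost rises further once images and video are involved).

```python
# Toy illustration (made-up numbers) of why longer horizons make RL
# expensive: feedback arrives only after a full rollout, so the cost
# of a single training signal scales with episode length.

def cost_per_reward_signal(steps_per_episode: int, cost_per_step: float) -> float:
    """Compute spent before we learn whether the episode succeeded."""
    return steps_per_episode * cost_per_step

# A short self-contained task vs. a two-hour computer-use task where
# each step also processes screenshots or video frames.
short_task = cost_per_reward_signal(steps_per_episode=50, cost_per_step=1.0)
long_task = cost_per_reward_signal(steps_per_episode=7200, cost_per_step=3.0)

ratio = long_task / short_task
```

Under these assumed numbers, each reward signal for the long-horizon task costs hundreds of times more compute than for the short one, before counting retries on failed episodes.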
## His Timelines
Patel doesn't expect to see some OpenAI livestream where they announce that continual learning has been totally solved, and he's not convinced the end-of-next-year timeline for computer use agents is realistic.
He offers his own timeline estimates:
For an AI that can do taxes end-to-end for a small business as well as a competent general manager could in a week, including tracking down receipts on different websites, finding missing pieces, emailing back and forth with people, filling out forms, and sending it all to the IRS: he estimates around 2028.
For AI that can learn on the job as easily, organically, and seamlessly as humans do, across any white-collar work (for example, an AI video editor that, six months after being hired, has as much actionable, deep understanding of his preferences, his channel, and what works for the audience as a human would): he estimates 2032.
## The Discontinuity That Could Change Everything
While this makes Patel bearish about transformative AI in the next few years, it makes him especially bullish on AI over the next decades. When we do solve continual learning, we'll see a huge discontinuity in the value of these models.
Even if there isn't a software-only singularity in which these models rapidly build ever-smarter successor systems, we might still get something that looks like a broadly deployed intelligence explosion: AIs deployed across the economy, doing different kinds of jobs and learning while doing them, the way humans can.
However, unlike humans, these models can amalgamate their learnings across all their copies. So one AI is basically learning how to do every single job in the economy.
An AI capable of this kind of online learning might rapidly become a superintelligence even if there's no further algorithmic progress.
> "One AI that could learn like this would basically be learning how to do every single job in the economy."
Critics might note that this view underestimates how quickly RL fine-tuning and other techniques have advanced. Some researchers argue these methods are already producing meaningful improvements in sample efficiency, possibly reducing the data hunger he worries about. Others contend that computer use demonstrations show genuine progress even if end-to-end tax completion isn't fully reliable yet.
## Bottom Line
Patel's strongest argument is that continual learning—the ability to build context and improve from feedback—is a fundamental wall that current AI architectures can't easily cross. His biggest vulnerability is that he may be underestimating how quickly reinforcement learning fine-tuning and other techniques are advancing. The technology landscape shifts fast, and the bottleneck he's worried about might prove more solvable than he expects.
What readers should watch for: any breakthrough in giving AI systems persistent memory across sessions, or new approaches to high-quality feedback loops that let models improve between tasks. The gap between current capabilities and human-like learning is real; whether it takes seven years or two to close is where the debate lives.