Sora 2 is here, but not everyone is impressed. Some call it revolutionary; others see it as incremental. What's actually worth knowing about this release?
OpenAI has rolled out two versions of its new video generation model: standard Sora 2 and Sora 2 Pro — a higher-quality tier available to paying customers initially through sora.com and eventually in the app. The most viral demos floating around social media? Most appear to be from that premium Pro version, which costs significantly more to run. What regular users will access is considerably less impressive.
The rollout itself is deliberately constrained. The invitation system OpenAI described as "a bit jank" was actually designed to slow adoption — a safety-focused strategy. Access is currently limited to US and Canada residents on iOS, with premium tiers and limits that will relax over time as new users join. No API yet, though one is promised soon.
OpenAI claims Sora 2 is intelligent in ways never before seen in video models. They call it the best world model they've built. But here's what's missing: image-to-video and video-to-video aren't permitted. These capabilities are coming later.
How It Actually Compares
The honest answer? We don't know definitively whether we're testing Sora 2 Pro, standard Sora 2, or Google's Veo 3 preview — because all three have different quality levels, speed options, and versions. The same goes for Claude 4.5 Sonnet versus competing coding models.
One thing is clear: models are fundamentally dependent on their training data. When one model outperforms another on a specific prompt — say, a gymnast performing — that doesn't mean it's better overall. It might just have more training videos of that particular activity.
Sora 2 can generate anime noticeably better than Veo 3. But again: think training data. The Cyberpunk example? They likely fed tutorials from that game into their training set.
One claim worth challenging: "Sora 2 mastering physics" is overstated. That viral video showing physics understanding looks more like a video game simulation than reality. Watch how the ball bounces off the hoop — it feels uncanny, not authentic.
The New Social Media App
OpenAI's Sora app is entering a crowded space. Meta's competing product, Vibes, was widely panned at launch. But OpenAI is making genuine differentiators: no infinite scroll for users under 18, nudges toward creation over consumption, visible and invisible watermarks on all outputs, strict opt-ins for likeness usage, input classification with potential blocking, and output reasoning models that decide whether content should be blocked.
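The input/output moderation described above can be pictured as a two-stage pipeline: screen the prompt before generation, then have a second model review what was actually generated. The sketch below is purely illustrative — the function names, the keyword screen, and the string-based output check are stand-ins for OpenAI's actual (unpublished) classifiers:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModerationResult:
    allowed: bool
    reason: str = ""

def classify_input(prompt: str) -> ModerationResult:
    """First gate: screen the prompt before any generation happens.
    (A keyword list stands in for a learned classifier.)"""
    blocked_terms = ("nonconsensual", "real person's face")
    lowered = prompt.lower()
    for term in blocked_terms:
        if term in lowered:
            return ModerationResult(False, f"input blocked: {term!r}")
    return ModerationResult(True)

def review_output(clip_summary: str) -> ModerationResult:
    """Second gate: a stand-in for a reasoning model that inspects the
    generated clip (here, a text summary of it) before release."""
    if "unverified likeness" in clip_summary.lower():
        return ModerationResult(False, "output blocked: likeness policy")
    return ModerationResult(True)

def moderated_generate(prompt: str, generate: Callable[[str], str]) -> ModerationResult:
    """Run input check -> generation -> output check."""
    verdict = classify_input(prompt)
    if not verdict.allowed:
        return verdict
    clip_summary = generate(prompt)
    return review_output(clip_summary)

# Usage with a dummy generator standing in for the video model:
fake_generate = lambda p: f"clip of: {p}"
print(moderated_generate("a cat surfing a wave", fake_generate).allowed)             # True
print(moderated_generate("a real person's face swapped in", fake_generate).allowed)  # False
```

The point of the two gates is that the output check catches content the prompt screen could not have predicted, which is why a reasoning model on the output side matters.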
The Cameo feature stands out. You can't simply upload a video of yourself — you must record specific phrases OpenAI asks, proving you're who you say you are before inserting your likeness into new videos. This prevents unauthorized deepfakes. It's a genuine safeguard in an era where the deepfake bar remains low.
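The challenge-phrase idea behind Cameo is a classic liveness check: the server issues an unpredictable phrase, and a pre-recorded deepfake can't contain it. A minimal sketch, assuming a hypothetical phrase bank and a transcript check (the real system presumably also verifies face and voice match, which this omits):

```python
import secrets

# Hypothetical phrase bank; a real system would generate fresh,
# unpredictable prompts server-side rather than reuse a fixed list.
PHRASE_BANK = [
    "my voice confirms this cameo",
    "purple sevens ride the midnight train",
    "calm rivers echo under glass",
]

def issue_challenge() -> str:
    """Server picks an unpredictable phrase the user must speak on camera."""
    return secrets.choice(PHRASE_BANK)

def verify_recording(transcript: str, challenge: str) -> bool:
    """Accept only if the spoken transcript contains the challenge phrase.
    A production pipeline would also check face/voice match and recency."""
    return challenge.lower() in transcript.lower()

challenge = issue_challenge()
print(verify_recording(f"Okay, here goes: {challenge}", challenge))  # True
print(verify_recording("some old unrelated video", challenge))       # False
```

Because the phrase isn't known in advance, old footage of a victim can't pass — which is exactly why this raises the deepfake bar.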
OpenAI's blog post from 18 hours ago revealed more: periodic mood checks on how Sora affects users' wellbeing, and a centerpiece promise — that a majority of users should feel their lives are better for having used Sora over the past six months. If not, significant changes follow. If they can't fix it, they'll discontinue the service.
Given OpenAI's track record with promises around AI regulation — where compliance was promised but lobbying followed — this commitment seems likely to quietly fade.
The Bigger Picture
Sora 2 feels like a side quest rather than direct progress toward AGI. What actually moves toward generalist agents? Physical science breakthroughs.
Periodic Labs, backed by $300 million in funding, aims to automate science by running experiments autonomously. Their approach is fundamentally different: physical, real-world, and not immediately available. The vision involves deep learning systems predicting experiment outcomes, humanoid robots conducting those experiments autonomously, and LLMs working with existing experimental data in accessible formats. An AI model optimized for literature review could identify the most promising experiments from thousands of papers that would otherwise never be read.
This contrasts sharply with Sora 2's entertainment focus. One is building simulators; the other is automating discovery.
The Benchmark Problem
OpenAI and Anthropic both claim their models are best-in-class. But different benchmarks show different winners. Where's the definitive proof that any single model dominates across all metrics? This remains unresolved — a genuine challenge for anyone claiming objective "best" status.
Testing Claude 4.5 Sonnet on simple benchmarks revealed a 54% improvement with thinking enabled; it feels comparable to Opus 4.1 while costing roughly five times less in actual use. Price drops come fast after each new breakthrough.
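The "five times cheaper" figure lines up with the list prices published at the time of writing — Sonnet 4.5 at roughly $3/$15 per million input/output tokens versus Opus 4.1 at $15/$75 (check Anthropic's pricing page, as these change). A quick back-of-envelope:

```python
# Approximate list prices (USD per million tokens) at time of writing;
# verify against the provider's current pricing page before relying on them.
PRICES = {
    "claude-sonnet-4.5": {"input": 3.00, "output": 15.00},
    "claude-opus-4.1":   {"input": 15.00, "output": 75.00},
}

def job_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost of a job given millions of input and output tokens."""
    p = PRICES[model]
    return p["input"] * input_mtok + p["output"] * output_mtok

# Example workload: 2M input tokens, 0.5M output tokens.
sonnet = job_cost("claude-sonnet-4.5", 2.0, 0.5)  # 3*2 + 15*0.5 = 13.5
opus = job_cost("claude-opus-4.1", 2.0, 0.5)      # 15*2 + 75*0.5 = 67.5
print(f"Opus/Sonnet cost ratio: {opus / sonnet:.1f}x")  # 5.0x
```

Because both input and output rates differ by the same factor here, the ratio holds regardless of the input/output mix of the workload.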
And Chinese competitors will likely release video generation models matching Sora 2 quality within three to six months — at significantly lower cost.
Where This Is Heading
The visual Turing Test is approaching: the point where viewers can't distinguish real footage from generated content. We're getting closer to that threshold. Soon, a button on your TV remote could insert your stored face as a selected character in any show you're watching, whether that's two or four years away. Netflix becomes personal.
Counterarguments
Critics might note that OpenAI's promise of net-positive impact is impossible to verify independently. Without transparent metrics and third-party auditing, users have only the company's word. Others argue that focusing on Sora 2 as a "side quest" underestimates how quickly video generation could reshape entertainment, education, and communication — regardless of whether it advances AGI directly.
The visual Turing Test is approaching: soon viewers won't know whether what they're watching actually happened.
Bottom Line
Sora 2 represents incremental progress, not the leap toward generalist AI that OpenAI markets. Its strongest elements are the safety features around likeness protection and the deliberate rollout — genuine safeguards against misuse. But the claims of mastering physics are overstated, and the "net positive" promise carries little weight given past commitments quietly dropped. Watch for three things: whether Chinese competitors match quality at lower cost within six months, how OpenAI's promised wellbeing metrics get enforced, and when video generation passes the visual Turing Test entirely — because that day approaches fast.