
Import AI 437: Co-improving AI; rl dreams; AI labels might be annoying

Jack Clark's latest dispatch from the AI frontier cuts through the hype to reveal a profound tension: the industry is simultaneously racing toward autonomous self-improvement while desperately drafting blueprints for a safer, collaborative alternative. This isn't just technical speculation; it's a candid admission of anxiety from the very engineers building these systems, paired with a sobering look at how well-intentioned regulations can accidentally strangle innovation.

The Co-Improvement Paradox

Clark highlights a striking shift in thinking from Facebook AI Research, where researchers are arguing against the traditional path of self-improving AI. Instead, they propose a model where humans and machines evolve together. "Facebook researchers have said that building self-improving AI which eventually reaches superintelligence is 'fraught with danger for humankind - from misuse through to misalignment' and it'd instead be better to co-develop superintelligence," Clark writes. The proposal is to create a research agenda "targeting improving AI systems' ability to work with human researchers to conduct AI research together, from ideation to experimentation, in order to both accelerate AI research and to generally endow both AIs and humans with safer superintelligence through their symbiosis."


The core of this argument rests on the belief that human oversight isn't a bottleneck, but a necessary stabilizer. Clark notes the researchers' thesis that "co-improvement can provide: (i) faster progress to find important paradigm shifts; (ii) more transparency and steerability than direct self-improvement in making this progress; (iii) more focus on human-centered safe AI." This framing is compelling because it reframes safety not as a brake on progress, but as a mechanism to ensure the progress is actually useful and controllable. However, Clark offers a sharp critique of the feasibility, likening the paper to a scene from The Wire where a criminal kingpin tells a guard, "You want it to be one way, but it's the other way." He suggests that while the idea of human-AI collaboration sounds ideal, "AI researchers, staring at the likely imminent arrival of automated AI R&D, articulate how things would be better and saner if humans could co-operatively develop future AI... But are they just grasping for a world that is unlikely to exist?"

Critics might argue that the momentum of automated research is already too strong to be slowed by human co-development, rendering this paper more of a wish list than a practical roadmap. The tension between the desire for control and the inevitability of speed is the defining drama of this field.

"AI researchers, staring at the likely imminent arrival of automated AI R&D, articulate how things would be better and saner if humans could co-operatively develop future AI and write a position paper about it. But are they just grasping for a world that is unlikely to exist and articulating their anxiety in the form of a position?"

The Hidden Cost of Good Intentions

Moving from theory to policy, Clark turns his attention to the practical nightmare of AI labeling regulations. While the concept of labeling AI systems—listing ingredients, uses, and safety warnings—seems straightforward, Clark warns that "an iceberg of complication lurks beneath this simple idea." He points to the European Union's experience with similar schemes, noting how "well-intended and equally simple labeling schemes from Europe have caused companies like Ikea to have to invest thousands of hours into compliance as well as things like revamping how they produce labels for their goods."

This section serves as a vital reality check for policymakers who often view regulation as a clean, binary switch. Clark argues that "most people who work in AI policy are pretty unaware of how expensive AI policy, once implemented, is to comply with." He describes this ignorance as a "fatal error," noting that industry insiders look at policy proposals with "a mixture of puzzlement and horror at the pain we are about to inflict on them and ourselves." The argument here is that the administrative burden of compliance can stifle the very innovation regulators hope to guide. A counterargument worth considering is that without strict labeling, the risks of unchecked AI deployment could outweigh the costs of compliance, but Clark's point remains: the economic friction of regulation is rarely calculated before the laws are passed.

The Return of Reinforcement Learning

The newsletter then pivots to a technical resurgence: the return of reinforcement learning (RL) in high-fidelity simulation. Clark introduces SimWorld, a new simulator built on Unreal Engine 5, designed to train AI agents in rich, procedurally generated environments. The system allows agents to "perceive rich multimodal observations (e.g., visual scenes, abstract layouts, and action feedback) and respond with high-level language commands." For instance, an agent might issue a command like "sit on the nearest chair," and the simulator handles the complex physics of getting there.

What makes SimWorld significant is how it bridges the gap between the old RL paradigm and modern large language models. Clark explains that earlier attempts at RL failed to produce general intelligence because agents started from a "blank slate," resulting in "terrifically expensive" systems that were merely great at playing specific games. Now, the approach has flipped: "the agents being developed in environments like SimWorld will typically be built on an underlying world model from a frontier AI system, like Claude or Gemini or ChatGPT, and SimWorld will be used to create more data to finetune this system on to make it more capable." This synergy allows for the study of "complex systems and emergent behaviors in rich, dynamic, and controllable environments," moving beyond simple game-playing to tasks like running a business or managing a career trajectory within the simulation.
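The observe-command-execute loop described above can be sketched in a few lines. This is a minimal illustration, not SimWorld's actual API: all class and method names here are hypothetical, and the "agent" hard-codes one behavior where a real system would prompt a frontier model.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    visual_scene: str      # e.g. a description of the rendered frame
    abstract_layout: str   # e.g. a top-down map of nearby objects
    action_feedback: str   # result of the previous command

class LanguageAgent:
    """Stands in for a frontier model mapping observations to language commands."""
    def act(self, obs: Observation) -> str:
        # A real agent would prompt an LLM here; we hard-code one behavior.
        if "chair" in obs.abstract_layout:
            return "sit on the nearest chair"
        return "explore"

class Simulator:
    """Stands in for the engine that turns commands into low-level physics."""
    def __init__(self):
        self.obs = Observation("empty room", "chair at (2, 3)", "none")

    def step(self, command: str) -> Observation:
        # The engine handles pathfinding, collision, and animation;
        # the agent only ever sees the resulting observation.
        self.obs.action_feedback = f"executed: {command}"
        return self.obs

sim = Simulator()
agent = LanguageAgent()
obs = sim.obs
for _ in range(3):
    command = agent.act(obs)
    obs = sim.step(command)
print(obs.action_feedback)  # executed: sit on the nearest chair
```

The design point is the division of labor: the language model stays at the level of intent ("sit on the nearest chair") while the simulator owns the physics, which is what lets a general-purpose model act in a 3D world at all.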

From Virtual Games to Physical Robots

Finally, Clark examines DeepMind's SIMA 2, a project that embodies this new hybrid approach. By taking a Gemini-class model and fine-tuning it on gameplay data, DeepMind has created an agent capable of generalizing across unseen 3D worlds. The system's ability to self-improve is particularly notable; when faced with a new crafting menu in the game ASKA, a secondary model helped the agent bootstrap its own learning. "Through focused effort by the task setter, the agent was eventually able to acquire this skill," Clark writes, noting that the agent could eventually build a shelter in an hour, far outperforming its initial capabilities.

This is where the rubber meets the road. Clark posits that "research like SIMA 2 is the same sort of paradigm I expect people will use to teach robots to be able to do useful, open-ended things in our world." The logic is simple: train the AI in a safe, scalable virtual world, then transfer that competence to physical robots. While the authors admit the system still struggles with "very long-horizon, complex tasks" and has a "relatively short memory," the path forward is clear. "These results suggest a promising path toward using self-improvement to eventually bridge the virtual and physical worlds, enabling more capable physically-embodied agents in applications like robotics."
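The self-improvement loop described for SIMA 2 can be sketched as a simple bootstrap: a task-setter model proposes tasks at the edge of the agent's ability, the agent attempts them, and successful trajectories become fine-tuning data that raises its capability. Everything below is illustrative, not DeepMind's implementation; the task list, difficulty model, and skill counter are assumptions for the sketch.

```python
def task_setter(skill_level: int) -> str:
    """Propose a task slightly beyond the agent's current ability."""
    tasks = ["gather wood", "open crafting menu", "build shelter"]
    return tasks[min(skill_level, len(tasks) - 1)]

def attempt(task: str, skill_level: int) -> bool:
    """Stand-in for running the agent in the game; success becomes
    more likely as the agent is fine-tuned on earlier successes."""
    difficulty = {"gather wood": 0, "open crafting menu": 1, "build shelter": 2}
    return skill_level >= difficulty[task]

finetune_data = []
skill = 0
for episode in range(6):
    task = task_setter(skill)
    if attempt(task, skill):
        finetune_data.append(task)  # successful trajectory kept as training data
        skill += 1                  # fine-tuning on it raises capability

print(finetune_data)
# ['gather wood', 'open crafting menu', 'build shelter', ...]
```

The key property the sketch captures is that the curriculum is generated by a model, not a human: the task setter keeps proposing the next unmastered skill until, as with the shelter in ASKA, the agent acquires it.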

"By supporting advanced LLM/VLM-based agents and enabling large-scale, realistic agent–environment and agent–agent interactions, SimWorld expands the capabilities of modern agent-based simulation (ABS)."

Bottom Line

Clark's commentary succeeds in exposing the fragile balance between the ambition for superintelligence and the practical realities of safety, policy, and technical implementation. The strongest insight is the recognition that the industry's push for self-improvement is already outpacing its ability to govern it, making the proposed "co-improvement" model both essential and potentially obsolete before it begins. The biggest vulnerability remains the assumption that human oversight can be effectively integrated into a system designed to evolve faster than its creators can comprehend. Readers should watch closely as the gap between virtual simulation and physical robotics narrows, as that is where the theoretical debates of today will become the operational realities of tomorrow.

Sources

Import AI 437: Co-improving AI; rl dreams; AI labels might be annoying

by Jack Clark · Import AI


Facebook: Let’s not build self-improving AI, let’s build co-improving AI.
A sensible goal which may be hard to achieve. Facebook researchers have said that building self-improving AI which eventually reaches superintelligence is “fraught with danger for humankind - from misuse through to misalignment” and it’d instead be better to co-develop superintelligence. They’ve published their reasoning in a paper which reads both as aspirational and earnest. Ideally, humans and machines will work together to build a smarter-than-human system, and the researchers think we should develop a research agenda “targeting improving AI systems’ ability to work with human researchers to conduct AI research together, from ideation to experimentation, in order to both accelerate AI research and to generally endow both AIs and humans with safer superintelligence through their symbiosis.” The thesis here is that “co-improvement can provide: (i) faster progress to find important paradigm shifts; (ii) more transparency and steerability than direct self-improvement in making this progress; (iii) more focus on human-centered safe AI.”

What goes into a co-improving AI?

Collaborative brainstorming, problem, experiment, benchmark, and evaluation identification: Humans and AIs should jointly define goals, research approaches, the tests needed to measure progress against them, experiments to generate data, and methods to evaluate the results.

Joint development of safety and deployment: Humans and AIs should co-develop the methods to align the technology as well as the methods of deploying and communicating about the technology.

“Overall collaboration aims to enable increased intelligence in both humans & AI, including all manifested learnings from the research cycle, with the goal of achieving co-superintelligence,” they write.

Why this matters - a Rorschach for the psychology of (some) AI researchers: In seminal American show The Wire there’s a scene where an up-and-coming criminal kingpin says to a security guard trying to enforce the laws of society: “You want it to be one way, but it’s the other way.” This is how reading this paper feels: AI researchers, staring at the likely imminent arrival of automated AI R&D, articulate how things would be better and saner if humans could co-operatively develop future AI and write a position paper about it. But are they just grasping for a world that is unlikely to exist and articulating their anxiety in the form of a position?