
The $285B Sell-Off Was Just the Beginning — The Infrastructure Story Is Bigger.

This piece is about something much bigger than any single product or company. It's about an entirely new way for commerce and interaction to happen across the internet — and it's arriving faster than most expected.

The Convergence

Last Tuesday, three major announcements landed within hours of each other. Coinbase launched Agentic Wallets, crypto wallets designed not for people but for AI agents themselves. Cloudflare shipped Markdown for Agents, a feature that automatically converts any website into agent-readable markdown when an AI system requests it. And OpenAI published a developer blog post about skills and shell tools that let agents install software dependencies, run scripts, and write files inside hosted containers.

None of these companies coordinated their announcements. They didn't need to. They're all building toward the same future — an infrastructure layer forming under every agent that comes after it. A new kind of web where software reads websites as routinely as humans do.

The Money Layer

Agents can't do much on the web if they can't pay for things.

Coinbase's Agentic Wallet solved this on the crypto side using a protocol called x402, which has already processed over 50 million machine-to-machine transactions. That's not a typo: 50 million. The wallets come with programmable spending limits, session caps, and gasless trading on Coinbase's Base network. Developers can spin one up in under two minutes with a command-line tool.

The architecture is non-custodial, meaning even if the agent is compromised, the keys themselves sit in secure hardware that the agent cannot access. The agent can't leak those keys.

Within 24 hours of this launch, new AI agents registered wallets on Ethereum. That's not developer experimentation. That's an ecosystem of agents with wallets forming in real time.

Brian Armstrong's pitch was clear: "The next generation of agents won't just advise, they'll act." But what he didn't say explicitly is that the architecture implies agents with wallets will become real economic entities — they can earn, spend, and accumulate capital independently of the humans who created them. That's a category of software that has never existed before.

Stripe solved the same problem on the traditional payments side. Their Agentic Commerce suite, launched in December, lets businesses connect a product catalog and start selling through AI agents with a single integration. They built a new payment primitive called shared payment tokens: scoped, time-constrained credentials that let an agent initiate a purchase using a buyer's saved payment method without ever seeing the card number.
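To make the "scoped, time-constrained" idea concrete, here is a minimal sketch of what such a credential check might look like. The class name, fields, and function below are hypothetical illustrations, not Stripe's API:

```python
import time
from dataclasses import dataclass

@dataclass
class SharedPaymentToken:
    """Hypothetical shape of a scoped, time-constrained agent payment credential."""
    token_id: str
    merchant_id: str       # scope: only this merchant may charge it
    max_amount_cents: int  # scope: spending cap for this token
    expires_at: float      # unix timestamp; the token is useless afterward

def authorize_charge(token, merchant_id, amount_cents, now=None):
    """Reject any charge outside the token's scope or time window.
    The agent only ever holds the token, never the card number."""
    now = time.time() if now is None else now
    if now >= token.expires_at:
        return False
    if merchant_id != token.merchant_id:
        return False
    return amount_cents <= token.max_amount_cents
```

The point of the shape: the scope checks run server-side, so a compromised or misbehaving agent hits a hard wall the moment it strays outside merchant, amount, or time window.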

Stripe's fraud detection system, Radar, had to be retrained from scratch because the old signals were all calibrated for human shopping behavior. Think about what that means. Decades of fraud detection machine learning built on patterns like mouse movement variability, browsing time, session behavior, device fingerprinting — all of it became useless when the buyer is software. Agent traffic doesn't move a mouse. It doesn't browse. It doesn't exhibit the behavioral variability that distinguishes a legitimate shopper from a bot.

Yet now bots are purchasers. Brands including Urban, Etsy, Coach, Kate Spade, and Revolve are already onboarding.

Google launched its Agent Payments Protocol in September. PayPal and OpenAI partnered on Instant Checkout in ChatGPT. Visa introduced a Trusted Agent Protocol at NRF 2026. Google then announced the Universal Commerce Protocol, an open standard for agent-to-commerce interaction, and Stripe's ACS immediately auto-supported it, meaning merchants who integrated Stripe's agent tools are already compatible with Google's agent shopping infrastructure without writing another line of code.

The industry consensus, as a Decrypt analyst put it: "Agents that can't spend money are fundamentally limited." That's true, but once you solve payments, there's a whole lot more down the road.

The Content Layer

The web is made of HTML, and HTML is designed for human browsers, not language models. Pages are bloated with scripts, tracking pixels, navigation menus, and ads. When an agent needs to read a web page, it has to strip all that stuff out — the stuff humans like — and convert it into something useful. Usually that's markdown.

This is such a common step that an entire category of companies, Firecrawl and Exa among them, exists just to do that conversion.

Cloudflare's Markdown for Agents cuts out that middleman. When an AI agent requests a page on any Cloudflare-enabled site with the right Accept header, Cloudflare intercepts the request, fetches the HTML from the origin server, converts it to markdown on the fly, and serves it back. The response even includes an X-Markdown-Tokens header with an estimated token count, so the agent can manage its own context window. No more scraping, no conversion libraries, no wasted compute. The agent asks for markdown and gets markdown.
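On the client side this is just content negotiation plus a budgeting decision. A minimal sketch, assuming an Accept value of text/markdown and the X-Markdown-Tokens header described above (treat both as illustrative, not a verified spec):

```python
import urllib.request

def markdown_request(url):
    """Build a request asking for the markdown form of a page.
    The Accept value here is an assumption about the negotiation."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})

def plan_read(headers, body, context_budget):
    """Use the X-Markdown-Tokens estimate (falling back to a rough
    ~4 characters/token heuristic) to decide whether the converted
    page fits in the agent's remaining context window."""
    est = headers.get("X-Markdown-Tokens")
    est = int(est) if est is not None else len(body) // 4
    return {"tokens": est, "fits": est <= context_budget}
```

An agent would send markdown_request(url), then run plan_read on the response before committing the page to its context.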

This matters more than it might sound. Cloudflare serves roughly 20% of the web. When they decide agents are first-class citizens, not bots to be blocked but clients to be served in their preferred format, which is exactly what they just did, they are making an infrastructure-level commitment to a world where software reads websites as routinely as humans do.

Cloudflare isn't stopping at markdown conversion. They launched three companion features in the same release. First, support for llms.txt and llms-full.txt: standardized machine-readable site maps that tell agents what's on a site and how to navigate it, much as robots.txt oriented search engine crawlers decades ago.
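For a concrete picture, a site map following the llms.txt proposal is just a small markdown file served at the site root. The site name, URLs, and descriptions below are hypothetical:

```markdown
# Example Store

> A one-line summary of what the site offers and who it serves.

## Docs

- [Product catalog](https://example.com/catalog.md): every product on one page
- [Returns policy](https://example.com/returns.md): conditions and timelines
```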

Second, Cloudflare launched AI Index. It's an opt-in search index where sites can make their content discoverable to agents directly through Cloudflare's MCP server and search API. That means they can bypass Google entirely.

Third and most telling: Cloudflare is including built-in x402 monetization support, so site owners can charge agents for content access using the same protocol that powers Coinbase's wallets. Cloudflare isn't just making the web readable for agents. They're building an economic layer for a web where agents pay to access content.
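The underlying mechanic is HTTP's long-dormant 402 Payment Required status. A server-side sketch of that flow, using a hypothetical header name and payload shape rather than the exact protocol spec:

```python
def handle_agent_request(request_headers, price_usd, verify_payment):
    """Charge agents per request: answer 402 with the price until the
    client retries with proof of payment attached to the request."""
    proof = request_headers.get("X-Payment")
    if proof is not None and verify_payment(proof, price_usd):
        return 200, "# The article, served as markdown"
    # No valid payment yet: tell the agent what it costs and where to pay.
    return 402, {"price": price_usd, "asset": "USDC", "pay_to": "0xSITE_OWNER"}
```

A capable agent treats the 402 response as a machine-readable invoice: pay from its wallet, retry the request, read the content.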

The Search Layer

Google Search is optimized for humans — ten blue links, ads, featured snippets, knowledge panels. Recently they added AI summaries. None of that is useful to an agent that needs to programmatically find specific information and then come back with structured data.

Exa.ai built a search engine from scratch specifically for agents: their own index, their own neural retrieval models, their own embedding infrastructure. Their API returns raw URLs and content, not search engine result pages. Their research endpoint chains multiple searches together, agentically parallelizing across output fields to minimize latency. It scores 95% on SimpleQA, a benchmark for factual accuracy; Perplexity scores lower.

If you're wondering whether that sets a new bar for accurate agentic search, you'd be right.

But the benchmark results are much less interesting than what this implies about future internet market structures. Google built a search engine for humans and spent decades perfecting it. Now there's a parallel need — search for machines — and Google's architecture is the wrong shape for that. The companies that build agent-native search from first principles have an actual structural advantage, not just a marketing one.

An independent benchmark from AIMultiple tested the major agent-search providers head to head. One provider led on the composite agent score, with Firecrawl, Exa, and Parallel Pro statistically tied behind it. But the latency spread tells you where real differentiation is starting to live. Brave returned results in 669 milliseconds, about two-thirds of a second; Parallel Pro took 13.6 seconds. In an agent workflow where each search is one step in a long chain, that latency difference compounds into minutes fast.
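The compounding is easy to check with the two figures quoted above:

```python
# Per-search latencies quoted above, in seconds.
fast, slow = 0.669, 13.6

# A modest 20-step research chain, one search per step.
steps = 20
fast_total = fast * steps   # ~13.4 seconds end to end
slow_total = slow * steps   # ~272 seconds, roughly 4.5 minutes

print(f"fast provider: {fast_total:.1f} s")
print(f"slow provider: {slow_total / 60:.1f} min")
```

Twenty steps is conservative; deep research agents routinely run far longer chains, and the gap widens linearly with every step.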

The providers that own their own infrastructure and their own agentic index rather than wrapping Google's API have a structural speed advantage that grows much more valuable as agent workflows get more complex.

The Execution Layer

OpenAI's blog post on skills, shell, and compaction reads like a roadmap for turning agents from advisors into workers. Skills are reusable, versioned instruction bundles; think of them as standard operating procedures for an AI on a particular task. An agent can load one on demand, absorb the procedure, and get to work.

The shell tool gives agents a real terminal environment where they can install dependencies, run scripts, and write output files. Compaction manages the context window automatically so that long-running agent workflows don't crash when they hit token limits.

These details matter because they reveal OpenAI's bet about what agent architecture actually looks like in production. Skills aren't prompts. They're versioned. They're mountable instruction packages. They look more like Docker images than chat templates. An organization can build a Salesforce skill, test it, lock down the version, and deploy it across every agent in the company with a guarantee that every agent follows the same procedure. When the procedure changes, you just update that skill version and every single agent will follow. You don't have to mess with prompts or anything else.
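A minimal sketch of that deployment model, with a hypothetical registry and skill rather than OpenAI's actual format:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Skill:
    """A versioned, immutable instruction bundle an agent can mount."""
    name: str
    version: str
    instructions: str  # the standard operating procedure, verbatim

# Hypothetical registry; in production this would be a tested, access-controlled store.
REGISTRY = {
    ("salesforce", "1.2.0"): Skill(
        "salesforce", "1.2.0",
        "1. Look up the account. 2. Validate the opportunity stage. "
        "3. Log the interaction before replying.",
    ),
}

def mount(name, version):
    """Pinning (name, version) guarantees every agent in the fleet gets
    byte-identical instructions; rollout and rollback are version changes."""
    skill = REGISTRY.get((name, version))
    if skill is None:
        raise KeyError(f"unknown skill {name}@{version}")
    return skill.instructions
```

That is the Docker-image analogy in miniature: agents reference an immutable version, and operations teams promote or roll back versions, not prompts.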

That's the difference between artisanal prompt engineering and actual software engineering applied to AI operations.

The shell tool is equally telling. It gives agents a real Linux environment, not a sandbox playground but a terminal where they can write files to disk, run commands like curl and grep, and install whatever packages they need. The pattern OpenAI describes, installing dependencies, fetching external data, producing a real deliverable, is functionally identical to how a human freelancer works today. A freelancer reads the brief, sets up the tools, does the research, and delivers the artifact. So does the agent. The difference is the agent can now do it inside a container in seconds, and skills ensure it follows the same procedure every single time.

Glean, an enterprise search company, was an early skills customer. They saw accuracy on Salesforce-related tasks jump from 73% to 85% with a single well-structured skill. It got faster at the same time, because the agent was no longer reasoning from scratch about what to do: they measured about an 18% decrease in time to first token, which matters when every query counts.

The gains come from moving stable procedures out of the system prompt and into versioned, modular instruction bundles, which is simply software engineering applied to AI workflows. We're not reinventing the wheel here; version control, testing, and rollback are classic enterprise deployment practice, and that part isn't new. Everything that is revolutionary comes from the second-order effects.

The part that's new is that we're doing all of this for autonomous AI agents.

Compaction runs server-side and automatically summarizes and compresses the context to keep the agent operational across workflows that would otherwise be impossible. It's the kind of feature that makes agents viable for tasks that take hours instead of minutes. And that kind of sustained multi-step work at scale changes how easily you can roll agents out across an enterprise.
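A toy version of the idea, assuming a crude 4-characters-per-token estimate and a summarize callback standing in for a model call:

```python
def compact(messages, limit, summarize):
    """Minimal sketch of context compaction: when the transcript exceeds
    the token budget, fold the oldest messages into one summary message
    and keep the recent tail verbatim."""
    def tokens(msgs):
        return sum(len(m) // 4 for m in msgs)  # rough ~4 chars/token estimate
    if tokens(messages) <= limit:
        return messages
    keep = []
    tail_budget = limit // 2  # reserve half the budget for recent turns
    for m in reversed(messages):
        if tokens(keep) + len(m) // 4 > tail_budget:
            break
        keep.insert(0, m)
    head = messages[: len(messages) - len(keep)]
    return [summarize(head)] + keep
```

Real implementations are smarter about what to preserve (tool results, open tasks), but the shape is the same: old turns collapse into a summary, recent turns survive verbatim.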

The Emergent Web

What happens when you combine all these different primitives? An agent that has a wallet, search capabilities, content access, payment rails, and an execution environment is more than an assistant. It is an economic actor.

Consider what a developer calling himself Chat App demonstrated on X this week. He connected OpenClaw to Canny 2.0, which is a video generation model inside an app called Chatcut. Then he sent the agent an Amazon product link. The agent crawled the Amazon page, extracted product info and photos, identified which assets were suitable for video generation, fed them into SeedDance, which is an incredible video model, and produced a user-generated content style product video — the kind of content that brands pay creators a thousand bucks to produce.

No human touched any step between "paste this link" and "here's your video." I watched it. It looks pretty good.

That is the emergent web. Not an agent doing a task, but agents chaining capabilities together across services to produce outputs that previously required multiple humans and multiple tools. The Amazon page wasn't designed for agents. Canny 2.0 actually wasn't designed to receive input from web crawlers. Chatcut wasn't designed as an orchestration layer, but because each piece exposed APIs, they snapped together into a new workflow in just seconds.

Bottom Line

The strongest part of this argument is the convergence thesis — multiple major infrastructure companies independently reached the same conclusions about agent payments, content access, and execution environments within the same couple of months. That suggests a structural shift rather than hype-driven timing.

The biggest vulnerability: it's unclear whether legal frameworks will accommodate agents as economic actors capable of earning, spending, and accumulating capital independent of their creators. That's an entirely new category of software that regulators haven't grappled with yet — and it may take years to sort out.

What readers should watch for next: how quickly enterprises adopt these agent primitives, and whether the legal framework catches up before agents become too autonomous to easily control.
