Alex Xu doesn't just explain how Google handles permissions; he reveals why the very concept of "instant access" is a carefully engineered illusion that breaks at the scale of billions of users. While most technical deep dives focus on database speed, Xu argues that the true bottleneck isn't storage—it's the terrifying complexity of ensuring that a revoked right disappears before the next byte of content is served. This piece matters now because as AI agents and third-party integrations multiply, the "plumbing" of trust is becoming the single point of failure for the entire internet economy.
The Core Problem: When Speed Meets Safety
Xu frames the challenge not as a storage issue, but as a temporal one. He writes, "The challenge multiplies at Google's scale. For reference, Zanzibar stores over two trillion permission records and serves them from dozens of data centers worldwide." This staggering volume forces a rethinking of how we define "access." In a small app, you check a list. At Google, checking a list is impossible because the list changes faster than it can be read.
The author identifies a critical flaw in traditional systems he calls the "new enemy" problem. "Consider the scenario where we remove someone from a document's access list, then add new content to that document," Xu explains. "If the system uses stale permission data, the person who was just removed might still see the new content." This is the crux of the argument: in a distributed world, "now" is different for every server. If the system doesn't track the exact order of events, security collapses into a race condition.
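The race Xu describes can be made concrete with a small sketch. This is purely illustrative, assuming a replica that replays a permission log only up to its snapshot timestamp; none of the names here come from Zanzibar's actual API.

```python
# Permission log as (timestamp, (action, user, doc)). Bob is removed at t=10.
permission_log = [
    (0, ("grant", "bob", "doc1")),
    (10, ("revoke", "bob", "doc1")),
]

def can_view(user, doc, snapshot_ts):
    """Replay the log only up to snapshot_ts, the way a stale replica would."""
    allowed = False
    for ts, (action, u, d) in permission_log:
        if ts > snapshot_ts or (u, d) != (user, doc):
            continue
        allowed = (action == "grant")
    return allowed

# New content is added at t=12, after Bob's removal at t=10. A replica whose
# snapshot is frozen at t=5 still lets Bob see it -- the "new enemy":
print(can_view("bob", "doc1", snapshot_ts=5))   # True  (stale, unsafe)
print(can_view("bob", "doc1", snapshot_ts=12))  # False (fresh, safe)
```

The bug is not in any single server; each replica is faithfully serving the state it knows. The failure is in the ordering between them, which is exactly why the fix has to be temporal rather than structural.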
"Any delay in these checks directly impacts user experience."
Xu's framing here is sharp. He forces the reader to realize that security isn't just about being correct; it's about being correct fast enough that the user never notices the complexity. A counterargument worth considering is whether this level of rigor is necessary for smaller organizations; Xu's answer is that the principles of consistency apply regardless of scale, even if the implementation differs.
The Data Model: Tuples Over Lists
To solve this, the article details a shift from rigid access control lists to a flexible data model. Xu describes how Zanzibar represents all permissions as "relation tuples, which are simple statements about relationships between objects and users." He illustrates this with a simple format: object, relation, user. This abstraction allows the system to handle complex hierarchies without duplicating data.
"Instead of listing every member of a group individually on a document, we can create one tuple that says 'members of the Engineering group can view this document,'" Xu writes. "When the Engineering group membership changes, the document permissions automatically reflect those changes." This is a profound simplification. It moves the logic from the data storage to a configuration language, allowing rules to be composed dynamically.
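The tuple model Xu describes can be sketched in a few lines. The `group#member` userset notation follows the convention from the Zanzibar paper; the resolver itself is a simplified assumption, not Google's implementation.

```python
# Relation tuples in the (object, relation, user) shape from the article.
# One tuple covers the whole Engineering group instead of listing members.
tuples = {
    ("doc:design", "viewer", "group:engineering#member"),
    ("group:engineering", "member", "user:alice"),
    ("group:engineering", "member", "user:bob"),
}

def check(obj, relation, user):
    """Does any tuple grant `user` the `relation` on `obj`, directly or via a userset?"""
    for o, r, u in tuples:
        if (o, r) != (obj, relation):
            continue
        if u == user:
            return True
        if u.endswith("#member"):  # userset: recurse into the referenced group
            group = u.removesuffix("#member")
            if check(group, "member", user):
                return True
    return False

print(check("doc:design", "viewer", "user:alice"))  # True, via the group tuple
print(check("doc:design", "viewer", "user:carol"))  # False
```

Note what happens when Carol joins Engineering: one new membership tuple, and every document pointing at the group's userset updates implicitly. That is the simplification Xu is pointing at.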
The author highlights how this enables permission inheritance without data duplication. "Rather than duplicating the viewer list on every document, we write a rule saying that to check who can view a document, look up its parent folder, and include that folder's viewers." This approach transforms authorization from a static lookup into a dynamic calculation, which is essential for services like YouTube where a video's access might depend on its channel, its playlist, and its parent organization simultaneously.
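The parent-folder rule Xu quotes can be sketched the same way. The data and the `parent` relation name here are assumptions for illustration; the point is that inheritance lives in the check logic, not in duplicated tuples.

```python
# One viewer tuple on the folder, one parent pointer on the document --
# no viewer list is ever copied onto the document itself.
tuples = {
    ("folder:q3-plans", "viewer", "user:dana"),
    ("doc:budget", "parent", "folder:q3-plans"),
}

def check_viewer(obj, user):
    # Direct grant on the object itself?
    if (obj, "viewer", user) in tuples:
        return True
    # Rewrite rule from the article: also accept viewers of the parent.
    for o, r, parent in tuples:
        if (o, r) == (obj, "parent") and check_viewer(parent, user):
            return True
    return False

print(check_viewer("doc:budget", "user:dana"))  # True, inherited from the folder
print(check_viewer("doc:budget", "user:eve"))   # False
```

For the YouTube case Xu mentions, the same recursion would simply follow several parent edges (channel, playlist, organization) and union the results.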
Handling Consistency with Ordering
The most sophisticated part of Xu's analysis is the solution to the "new enemy" problem. He introduces the concept of "zookies," tokens that encode timestamps to ensure data freshness. "When an application saves new content, it requests an authorization check from Zanzibar. If authorized, Zanzibar returns a zookie encoding the current timestamp, which the application stores with the content," he explains.
This mechanism relies on Google Spanner's ability to provide external consistency across the globe. "Since the timestamp came from after any permission changes, Zanzibar will see those changes when performing the check," Xu notes. The brilliance lies in the flexibility: the zookie specifies a floor, not a fixed point. "Zanzibar can use any timestamp equal to or fresher than required, enabling performance optimizations," he writes.
"The zookie protocol has an important property. It specifies the minimum required freshness, not an exact timestamp."
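The handshake can be sketched as follows. Logical revision numbers stand in for Spanner's TrueTime timestamps, and the function names are invented for illustration; the one property being modeled is the real one, that any replica at or past the zookie's timestamp may answer.

```python
def check_at_replica(user, zookie, replica_applied_rev, replica_state):
    """Serve the check only if this replica is at least as fresh as the zookie demands."""
    if replica_applied_rev < zookie:
        raise RuntimeError("replica too stale: forward to a fresher replica")
    return user in replica_state

# Timeline: Bob is revoked at rev 10; the new content saved at rev 11
# carries zookie 11, stored by the application alongside the content.
zookie = 11
stale_replica = {"applied": 9, "state": {"alice", "bob"}}   # hasn't seen the revoke
fresh_replica = {"applied": 12, "state": {"alice"}}

# The stale replica refuses rather than leaking the new content to Bob:
try:
    check_at_replica("bob", zookie, stale_replica["applied"], stale_replica["state"])
except RuntimeError as e:
    print(e)

print(check_at_replica("bob", zookie, fresh_replica["applied"], fresh_replica["state"]))  # False
```

Because the zookie is a minimum rather than an exact target, the fresh replica at revision 12 is just as valid an answerer as one at exactly 11, which is what opens the door to the caching optimizations Xu praises.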
Critics might argue that relying on a specific database technology like Spanner limits the portability of this architecture. However, Xu's point is that the principle of ordering events is universal, even if the specific tool varies. The real innovation is the decoupling of the permission check from the content storage, allowing them to coordinate without locking each other up.
The Architecture: Distribution and Caching
Finally, Xu breaks down the physical reality of running this system. "Zanzibar runs on over 10,000 servers organized into dozens of clusters worldwide," he states. The system is designed to handle "flash crowds" where a popular piece of content triggers millions of simultaneous checks. To prevent this from crashing the system, Zanzibar uses a lock table to deduplicate requests. "When multiple requests for the same check arrive simultaneously, only one actually executes the check. The others wait for the result, then all receive the same answer," Xu explains.
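The deduplication Xu describes is the classic "singleflight" pattern: the first request for a key becomes the leader, everyone else waits on its result. This sketch is my own minimal version, not Zanzibar's lock table.

```python
import threading
import time

class SingleFlight:
    """Collapse concurrent calls with the same key into one execution."""
    def __init__(self):
        self._mu = threading.Lock()
        self._inflight = {}  # key -> (done_event, result_box)

    def do(self, key, fn):
        with self._mu:
            entry = self._inflight.get(key)
            if entry is None:
                entry = (threading.Event(), {})
                self._inflight[key] = entry
                is_leader = True
            else:
                is_leader = False
        done, box = entry
        if is_leader:
            try:
                box["value"] = fn()  # only the leader runs the real check
            finally:
                with self._mu:
                    del self._inflight[key]
                done.set()
        else:
            done.wait()  # followers just wait for the leader's answer
        return box["value"]

calls = []

def expensive_check():
    calls.append(1)   # count real backend executions
    time.sleep(0.2)   # simulate the storage round trip
    return True

sf = SingleFlight()
results = []
workers = [threading.Thread(target=lambda: results.append(sf.do("doc1#view@bob", expensive_check)))
           for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(len(calls), results)  # typically: 1 [True, True, True, True, True, True, True, True]
```

Eight identical checks, one execution. In a flash crowd the same idea means a viral video triggers one permission lookup per cluster rather than millions.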
He also details how the system handles deeply nested groups using a component called Leopard, which precomputes transitive group membership. "Instead of following chains like 'Alice is in Backend, Backend is in Engineering,' Leopard stores direct mappings from users to all groups they belong to," he writes. This turns a slow recursive search into a millisecond set intersection.
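Leopard-style flattening is easy to sketch: expand each user's transitive groups once at index-build time, so the check itself is a set intersection. The data and names below are illustrative assumptions.

```python
# Direct membership edges: Alice is in Backend, Backend is in Engineering.
direct = {
    "alice": {"backend"},
    "backend": {"engineering"},
}

def flatten(member):
    """Transitively expand membership once, when building the index."""
    seen, frontier = set(), set(direct.get(member, ()))
    while frontier:
        g = frontier.pop()
        if g not in seen:
            seen.add(g)
            frontier |= direct.get(g, set())
    return seen

# Precomputed index: user -> every group they belong to, however deep.
leopard_index = {"alice": flatten("alice")}

doc_viewer_groups = {"engineering", "legal"}  # groups granted viewer on a doc

# The check is now a set intersection, not a recursive graph walk:
print(bool(leopard_index["alice"] & doc_viewer_groups))  # True
```

The cost moves from read time to write time: whenever group edges change, the affected flattened sets must be rebuilt, which is the trade the article says Leopard makes to keep reads in the millisecond range.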
The performance metrics are staggering. "Around 99% of permission checks use moderately stale data, served entirely from local replicas," Xu notes, with a median latency of just 3 milliseconds. "The remaining 1% requiring fresher data have a 95th percentile latency of around 60 milliseconds due to cross-region communication." This trade-off—optimizing for the common case of slightly stale data while guaranteeing correctness for the rare fresh case—is the defining characteristic of the system.
"Most importantly, Zanzibar illustrates optimizing for observed behavior rather than theoretical worst cases."
Bottom Line
Xu's analysis succeeds because it moves beyond the "how" of the code to the "why" of the design, proving that at massive scale, correctness and speed are not opposing forces but interdependent requirements. The piece's greatest strength is its demonstration that a flexible data model and a clever consistency protocol can solve the "new enemy" problem without sacrificing performance. The only vulnerability is the heavy reliance on Google's specific infrastructure, but the architectural lessons on tuple modeling and request deduplication are universally applicable for any engineer building distributed systems. As AI agents begin to manage permissions autonomously, understanding this balance between freshness and speed will be critical.