
The remarkable computers built not to fail

In an era where cloud outages make headlines and financial transactions hang in the balance of milliseconds, Asianometry resurrects a forgotten chapter of computing history to reveal a radical truth: reliability isn't an add-on, it's a design philosophy. The piece argues that the modern digital economy's backbone wasn't built by the giants of the 1970s, but by a scrappy startup that decided to build computers that literally could not fail.

The Batch Processing Trap

Asianometry sets the stage by exposing a critical vulnerability in the 1970s computing landscape. While IBM dominated with massive mainframes, the industry relied on "batch processing," where data was collected during the day and processed overnight. This created a dangerous lag. The author illustrates it with a vivid anecdote from Jim Treybig, a former Hewlett-Packard manager who saw the real-world cost of this delay at a Holiday Inn. "Treybig, in an oral history for Stanford, recalled seeing the major implications at the Holiday Inn... The hotel needed the computers to deal with a dine-and-dash problem, where customers would eat the hotel breakfast and then immediately check out. Because the breakfast order had not yet been posted to the customer's account (that would have happened overnight via the batch processing job), the hotel did not know to charge the customer for the meal."

This wasn't just an accounting error; it was a systemic fraud vector. Asianometry notes that thieves exploited similar gaps in early Automated Teller Machines (ATMs), which were often "offline" and disconnected from central ledgers. The core argument here is that as business moved from back-office support to front-line operations, the tolerance for downtime evaporated. The author writes, "Once you place your business in a computer system's metaphorical hands, then it, as someone says in Lord of the Rings, cannot fail." This framing effectively shifts the reader's perspective from viewing computers as tools to viewing them as the very infrastructure of trust.

Critics might argue that the focus on Treybig's specific anecdotes overlooks the broader technological constraints of the era, but the narrative successfully demonstrates that the problem wasn't just hardware limits; it was a fundamental mismatch between legacy architecture and new economic realities.

The Architecture of Redundancy

When Hewlett-Packard rejected the idea of a fault-tolerant machine, Treybig turned to venture capitalist Tom Perkins. Asianometry details how this partnership birthed Tandem Computers, a company built on a philosophy of radical redundancy. The author explains that while competitors tried to patch single-processor systems, Tandem was designed from the ground up as a multi-computer system. "An airplane could have one engine and go 600 miles an hour, or it could have two engines and go 600 miles an hour, but if one failed, it could go 300 miles an hour," Treybig is quoted as saying.

This analogy is the piece's conceptual anchor. It reframes computing power not as a singular peak performance metric, but as a resilient, distributed network. Asianometry highlights the hardware innovation: each module had its own processor, memory, and power supply. "They even all have their own power supply and battery backup so that power can be cut to one module without affecting the others. This makes it easy to replace a broken module. Just take it out and replace it with a new one."

The idea is to have modules that work together to handle the workload. Since you have to buy a computer sized for the peak workload anyway, if one module fails, most of the time the system doesn't even slow down.
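The video gives no code, but the sizing logic behind that claim can be sketched in a few lines: provision enough modules to cover peak load with one to spare, and a single failure costs headroom rather than availability. The module counts and throughput figures below are illustrative assumptions, not numbers from the video.

    # Toy model of Tandem-style graceful degradation (all figures are made up).
    PEAK_LOAD_TPS = 400        # peak transactions per second the system must absorb
    MODULE_TPS = 150           # throughput of a single processor module

    def available_capacity(healthy_modules: int) -> int:
        # Total throughput is simply the sum of the surviving modules.
        return healthy_modules * MODULE_TPS

    modules_installed = 4      # sized so the peak load still fits with one module down

    for failed in range(3):
        capacity = available_capacity(modules_installed - failed)
        status = "meets peak load" if capacity >= PEAK_LOAD_TPS else "degraded, but still up"
        print(f"{failed} failed module(s): {capacity} tps -> {status}")

With four modules in this toy setup, one failure still leaves enough throughput for the peak; only a second failure forces the system to slow down rather than stop.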

The commentary here is sharp: Tandem didn't just buy two computers; they built a system where failure was expected and managed. The author notes that the real secret sauce was the software, specifically the "Guardian" operating system, which utilized "process pairs." Every program ran on two modules simultaneously—a primary and a backup. If the primary failed, the backup took over instantly. Asianometry describes this as a "regroup algorithm" where modules vote on the status of their neighbors, a process the author amusingly likens to an emergency meeting in the game Among Us. This comparison, while pop-culture heavy, effectively demystifies a complex consensus protocol for a general audience.
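Guardian's internals are not shown in the video, so the following is only a minimal sketch of the process-pair idea as described: the primary checkpoints its state to the backup after each unit of work, and the backup resumes from the last checkpoint once the survivors agree the primary is gone. The class and method names are invented for illustration; they are not Guardian's real interfaces.

    class ProcessPair:
        """Sketch of a primary/backup 'process pair' (illustrative names only)."""

        def __init__(self):
            self.primary_alive = True
            self.primary_state = {"last_committed": None}
            self.backup_state = {"last_committed": None}   # mirror maintained via checkpoints

        def checkpoint(self):
            # The primary ships its state to the backup over the inter-processor bus.
            self.backup_state = dict(self.primary_state)

        def run(self, txn_id):
            state = self.primary_state if self.primary_alive else self.backup_state
            state["last_committed"] = txn_id
            role = "primary" if self.primary_alive else "backup"
            print(f"{role} committed {txn_id}")
            if self.primary_alive:
                self.checkpoint()      # the backup is never more than one step behind

        def primary_fails(self):
            # Regroup: the surviving modules agree the primary is gone, and the
            # backup takes over from the last checkpointed state.
            self.primary_alive = False
            print("primary lost; backup resumes after", self.backup_state["last_committed"])

    pair = ProcessPair()
    pair.run("txn-001")
    pair.run("txn-002")
    pair.primary_fails()
    pair.run("txn-003")            # served by the backup with no downtime

In the real system, the hard part is agreeing that the primary is actually dead rather than merely slow; that is the job of the regroup voting the author compares to an Among Us meeting.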

The ATM Revolution

The piece pivots to the market validation of this technology, identifying the ATM boom as the catalyst that saved the banking industry from its own inefficiencies. Asianometry argues that the 1970s inflationary environment forced banks to seek automation to cut labor costs. "In the second half of the 1970s, ATMs turned from a simple carnival show curiosity to a genuine cost-saving tool." The reliability of Tandem systems allowed banks to trust machines with real money, leading to an explosion in adoption.

The author provides compelling data to support this trajectory: "In 1978, America had less than 10,000 ATMs. By 1990, there were over 80,000 ATMs, facilitating 450 million debit transactions each month and driving Tandem forward." This growth wasn't just about convenience; it was about the creation of electronic fund transfer networks that spanned the country. Asianometry points out that once courts ruled ATMs weren't bank branches, networks like Star and Cirrus could expand across state lines, making the 24/7 availability of Tandem systems not just a luxury, but a regulatory and operational necessity.

A counterargument worth considering is whether the author overstates the uniqueness of Tandem's software approach. Other companies were experimenting with clustering, but Asianometry effectively counters this by emphasizing the "turnkey" nature of Tandem's solution, which removed the burden of custom software development from the banks. The author writes, "What people needed was a system built and designed from the ground up for fault tolerance, one that also protected users' data. So, how about if someone just sold them something turnkey?"

Bottom Line

Asianometry's strongest contribution is reframing the history of computing not as a race for speed, but as a struggle for continuity. The piece convincingly argues that the modern financial infrastructure exists because one company decided that a crash was an unacceptable design flaw. However, the narrative's focus on the 1970s and 80s leaves the reader wondering how this philosophy translates to today's cloud-native, microservices world, where failure is often accepted as inevitable. The lesson remains vital: in critical systems, resilience must be architected, not patched in.

Sources

The remarkable computers built not to fail

by Asianometry · Watch video

In the late 1970s, Tandem Computers exploded onto the scene with a remarkable product: computers designed and built not to fail, the Tandem NonStop. With their legendary reliability, Tandem's computers ran in banks, market exchanges, and critical industries. In this video: the rise and fall of Tandem Computers.

And yes, they are still around today. This video is brought to you by the Asianometry Patreon. In the early 1970s, Jim Treybig was working as a marketing manager at the American technology giant Hewlett-Packard. Born in Texas, Treybig joined Hewlett-Packard after some time at Texas Instruments to help sell their new commercial minicomputer, the HP 3000.

His manager was a guy named Tom Perkins. Hm, that last name sounds familiar. I wonder what venture capital firm he will found later. Anyway, at HP, Treybig (the last name appears to be of German origin, in case you're wondering) competed with the commercial computing colossus IBM.

It is hard to convey just how difficult that was back then. With top-to-bottom vertical integration, seemingly unlimited resources, powerful software lock-in, titanic economies of scale, a massive sales force, and a sterling reputation, IBM felt unassailable. Add the fact that the early 1970s were an economically challenging time: high energy prices, inflation, and interest rates at 21%. Who could afford to develop a new computer?

By 1974, mainframe pretenders like GE, NCR, RCA, and Siemens had all evacuated the dance floor. IBM was eating the computer world. Survival meant finding a niche that IBM either could not compete in or did not care to, like Digital Equipment Corporation, or DEC, which built a strong position in smaller minicomputers like the VAX. But what could that niche be?

The late 1960s and early 1970s changed the role of the computer in the back office. Computers had existed, of course, but up until then their role in the back office was a supporting one. Humans did the actual processing work during the day, and the computer handled background processes in batch jobs that ran overnight or during off hours. Typical tasks would be things like bookkeeping of the day's transactions, consolidating customer accounts, and whatnot.

However, using the computer this way led to several nagging problems. Treybig, in an oral history for Stanford, recalled seeing the major implications at the Holiday Inn. Chingy and Snoop would approve. He and his team at HP (Treybig, not Chingy) ...