Two steppes forward, one step back: Parsing our Indo-European past

Razib Khan delivers a rare intellectual payoff: a definitive answer to a 200-year-old mystery about human history, solved not by more debate, but by the silent, irrefutable testimony of ancient DNA. While linguists and archaeologists spent centuries arguing over maps and pottery, the genome has finally provided the scaffold to settle the question of how a tiny group of nomads reshaped half the world. This is not just a history lesson; it is a correction of the record that demands we rethink the very nature of cultural dominance.

The Genetic Revolution

Khan opens with a personal anecdote about his childhood fascination with the vast geographical spread of Indo-European languages, from English to Bengali. He recalls the moment of clarity: "Once you have seen, you cannot unsee." This sets the stage for a narrative where the mystery of human connection is finally resolved by science. The author argues that for decades, the field was stuck in a stalemate between linguistic models and archaeological theories, but the intervention of paleogenetics has "supercharged the rate of new discoveries and fast-tracked revisions of our understanding toward a final draft."

The core of Khan's argument rests on the sheer scale of the demographic shift revealed by genetic data. He notes that researchers have "finally been able to arrange that carefully collected accumulation of facts from linguistics and archaeology on a rigorous phylogenetic and demographic scaffold." This shift from speculation to hard data is the piece's greatest strength. It moves the conversation from "maybe" to "definitely." The evidence shows that the Yamnaya people, a small group of pastoralists from the Pontic steppe, did not just spread their ideas; they spread their genes.

"The game changer of paleogenetics here lay not in complex and formal model-building: but in the devastating simplicity of DNA reads."

This line captures the essence of the scientific breakthrough. Khan explains that while earlier theories suggested an "elite transmission" where a small group of horsemen imposed their language without significant biological mixing, the DNA tells a different story. The genetic record shows "90% population replacement in Britain around 2500 BC" and a total supplanting of Neolithic societies in Scandinavia. This is a stark, almost violent, picture of expansion that overturns the gentler theories of cultural diffusion.

Critics might argue that focusing so heavily on genetics risks reducing complex cultural histories to mere biological determinism, potentially ignoring the agency of the populations being replaced. However, Khan addresses this by emphasizing that the genetic data provides the framework upon which cultural and linguistic history is built, rather than replacing it entirely.

The Scale of the Expansion

The improbability of this expansion is what makes Khan's analysis so compelling. He points out that the entire modern Indo-European speaking world, some 3.4 billion native speakers by his count, traces its roots back to a founding population that was "perhaps just a few allied tribes." Khan estimates that "as few as 10,000 nomads scattered between the Dnieper and Don rivers... were about to run the table from the Altai to the Atlantic."

This framing challenges our intuition about power and population. Khan writes, "Contemporary observers, like their neighbors the Cucuteni-Trypillia people, who built some of the largest towns of the late Neolithic, likely did not see the scattered nomadic Yamnaya as a particularly portentous or formidable horde." Yet, within a millennium, these nomads reshaped the continent. The author uses this to dismantle the idea that cultural dominance requires a massive, pre-existing population advantage. Instead, the account suggests that specific technological or social innovations (perhaps the horse and the wheel, or a new social structure) allowed a small group to overcome a massive demographic disadvantage.

"All this from a very small founding population, perhaps just a few allied tribes... This means 5,000 years ago... as few as 10,000 nomads... were about to run the table from the Altai to the Atlantic."

Khan also addresses the distribution of Indo-European speakers, noting that the expansion was not uniform. He observes that, Russian aside, every one of the ten most widely spoken Indo-European languages sits on the fringes of the map: the Indian subcontinent and Western Europe. In contrast, the Baltic languages, whose domains once stretched some 750 miles, are now confined to a small fraction of their former range. This observation serves as a reminder that historical success is often a matter of ecological opportunity and timing, not inherent superiority.

The Verdict on the Past

The piece concludes with a sense of finality that is rare in historical scholarship. Khan asserts that "it is defensible to argue that we have made more progress in the last decade than in the previous two centuries toward understanding Indo-European prehistory." The convergence of linguistics, archaeology, and genetics has created a coherent narrative that explains the "how" and "where" of the Indo-European expansion with unprecedented clarity.

The author emphasizes that the sequence of migrations is now "finally clear, written in genes, and told in the rise and fall of successive peoples." From the arrival of the Corded Ware culture in Poland to the abrupt shifts in Britain and Scandinavia, the genetic record provides a timeline that aligns with, and often corrects, the archaeological record. The evidence is so overwhelming that it "fairly demands a final verdict."

"Sometimes the evidence is so overwhelming, it fairly demands a final verdict."

This confidence is well-earned. Khan's synthesis of the latest genetic studies, particularly those from the Reich lab, provides a robust foundation for understanding human prehistory. While future discoveries may refine the details, the broad strokes of the Yamnaya expansion are now undeniable.

Bottom Line

Razib Khan's piece is a masterclass in synthesizing complex scientific data into a compelling historical narrative, successfully using ancient DNA to settle a centuries-old debate. Its greatest strength lies in its ability to transform abstract linguistic theories into concrete demographic realities, showing that the spread of Indo-European languages was closely tied to the movement of people. The main vulnerability is the potential for these genetic findings to be misread as determinism and misused to support outdated notions of racial hierarchy, a risk the author navigates carefully but which readers should keep in mind. The verdict is clear: the past is no longer a matter of speculation, but a story written in our very genes.

Sources

Two steppes forward, one step back: Parsing our Indo-European past

In 1985, I flipped open a dictionary in my elementary school library, and became completely distracted by a map in the front matter illustrating the distribution of modern Indo-European languages. I was nine years old and this was the first time I saw the term “Indo-European.” Both the term and the map perplexed me. Included were the two languages I knew: English and Bengali, the northwesternmost and easternmost of the Indo-European languages, respectively. What could possibly connect them across that vast geographical span? I certainly had never noted any similarities…until I paused to take a closer look. That weekend, library card in hand, I trudged off to the public library, thumbed through the card catalog until I found the entry for “Indo-European,” inspected it and followed it to the linguistics section. I was already a habitué of the adults’ section, but so far, had solely explored the science stacks. That day, I pulled down a tome whose details I scarcely recall, unfamiliar matters of philology mixed with prehistoric speculation. What I do remember to this day is that inside that doorstopper was a wealth of maps, language-family trees and long lists of word-comparisons laid out in tables (what I know now to be Swadesh lists). Seeing the similarities in the core words across Indo-European languages explicitly outlined, the scales fell from my eyes. Below are some typical cognates in English, Bengali and Proto-Indo-European (PIE):

Mother, mā and *méh₂tēr.

Father, pitā and *ph₂tḗr.

Name, nām and *h₁nómn̥.

New, notun and *néwos.

Nose, nāk and *néh₂s.

Door, dorja and *dʰwer-.

Mind, mon and *men-.

Mouse, mushik and *muh₂s.

Serpent, sap and *serp-.

Deity, debôtā and *deywós.

Once you have seen, you cannot unsee.

More than 40% of humans alive today speak an Indo-European language as their mother tongue, some 3.4 billion people (and well north of 50% if you count second-language learners). The top ten are:

Spanish ~484 million.

English ~390 million.

Hindi ~345 million.

Portuguese ~250 million.

Bengali ~242 million.

Russian ~145 million.

Punjabi ~120 million.

Marathi ~83 million.

Urdu ~78 million.

German ~76 million.

It is notable that, for raw numbers, being on the margins seems to have redounded to the benefit of expansionist Indo-Europeans. Except for Russian, all top ten Indo-European languages count speakers positioned around the map’s fringes: the Indian subcontinent and Western Europe. In contrast, the Baltic languages, whose domains once stretched some 750 miles from northern ...