Wednesday, June 15, 2016

Recent Work on the Genetic History of Europeans

The science of genetics has been coming along by astonishing leaps and bounds in the last 10 years, and you can read some of the fascinating research findings in Gibbons (2014), Allentoft et al. (2015), Günther et al. (2015), Mathieson et al. (2015) and Hofmanová et al. (2016). Many of these studies are based on gene sequencing of ancient DNA in the remains of people who died thousands of years ago.

In terms of population movements and descent, modern Europeans appear to be a mix of three ancient populations, as follows:
(1) Palaeolithic and Mesolithic hunter-gatherers who lived in Europe from c. 45,000 years ago;

(2) Neolithic Anatolian and Aegean farmers who migrated into Europe from c. 6,500 BC–4,000 BC, and

(3) Indo-European-speaking Yamnaya-culture people who swept into Europe from the Russian steppe from 3,000 to 2,000 BC.
We can examine the history of these groups in greater detail as follows:
(1) Palaeolithic and Mesolithic hunter-gatherers from c. 45,000 years ago
These were the earliest members of Homo sapiens in Europe: hunter-gatherers who lived there from about 45,000 years ago, during the last Ice Age (which lasted from about 108,000 to 10,000 BC). They came from the Middle East along a Mediterranean route. Europe must have been sparsely populated at this time: the earliest European hunter-gatherers must have been a relatively small population. These Palaeolithic and Mesolithic hunter-gatherers have contributed to modern European genetics, though to a different extent in different regions.

It appears that some of them interbred with the Neanderthals (who had in turn evolved from Homo erectus populations) (see here). But, even if true, the Neanderthal genetic contribution to modern Europeans is low, maybe as low as 1.5–2.1% (Prüfer et al. 2014). (For a useful family tree, see here).

There also seems to be some evidence that the mysterious Homo sapiens denisova lived in Europe in the Stone Age.

At any rate, the Palaeolithic and Mesolithic hunter-gatherers appear to have had dark skin, which lightened by Darwinian evolution over the centuries, a process perhaps accelerated by the adoption of farming, which involved a reduced intake of vitamin D. Blue eyes may have evolved amongst these early European hunter-gatherers as well (see here and here).

(2) Neolithic Anatolian farmers from c. 6,500 BC–4,000 BC
Between c. 6,500 BC and 4,000 BC, Neolithic Anatolian farmers from northern Greece and north-western Turkey started migrating into central Europe through the Balkan route and then by the Mediterranean route to the Iberian Peninsula. They brought sedentary agricultural communities and new domestic animals and plants. Modern southern Europeans still seem to have inherited much more of their genes from these people than northern Europeans have. The original Anatolian farmer phenotype was probably similar to that of the modern people of Sardinia (Hofmanová et al. 2016: 3), and, generally speaking, the swarthy phenotype of southern Europeans is the legacy of their greater descent from the Neolithic Anatolian farmers compared with northern Europeans. Genetic analysis of ancient farmers seems to show that, after their arrival in Europe, the Neolithic Anatolian farmers at first mixed only infrequently and at low levels with the hunter-gatherers, but did so increasingly from the later Neolithic period (Hofmanová et al. 2016: 4).

(3) Indo-European-speaking Yamnaya-culture people from 3,000 to 2,000 BC
From 3,000 to 2,000 BC, there was a massive migration of people of the Yamnaya culture from the South Russian steppe north of the Black Sea into central Europe, and then into northern and western Europe. These people were almost certainly proto-Indo-European speakers (Balter and Gibbons 2015) and cattle herders, and probably had a phenotype with brown eyes, pale skin, and taller height. It is also interesting – and not surprising – that the Caucasian Yamnaya-culture people have bequeathed to modern Europeans the trait of lactose tolerance (Allentoft et al. 2015: 171). The migration of the Yamnaya-culture people west and east spread the Indo-European languages (Allentoft et al. 2015: 171).
All modern indigenous Europeans (i.e., those not descended from the later invaders from the Eurasian steppe like the Magyars or other later arrivals) have a mix of genes from these three types of ancient people (see here). The distinctive European traits of blue eyes (from hunter-gatherers), lactose tolerance (from the Yamnaya people) and fairer skin spread by interbreeding and natural selection (see here and here).

The genetic contribution of the Neolithic Anatolian farmers is important, but admittedly less so as you move northwards in Europe. However, even the Scandinavians have significant descent from the Neolithic Anatolian farmers, and even a marginal population like the Irish does as well (see here).

The further north you go in Europe, the greater the genetic contribution of the Palaeolithic and Mesolithic hunter-gatherers appears to be.

And virtually everyone has some descent from the Indo-European-speaking Yamnaya-culture people. Linguistically, this third group is fundamentally important because virtually everyone in Europe now speaks an Indo-European language (apart from the Basques, Hungarians, Finns, Estonians, and other smaller peoples).

The Indo-European Yamnaya-culture people of the steppe had themselves mixed with a population of hunter-gatherers isolated in the Caucasus region, so that the early Yamnaya pastoralists were a mix of Eastern European hunter-gatherers and another group of hunter-gatherers from the Caucasus. These people then migrated back into Europe in a mass movement from c. 3,000 to 2,000 BC (Balter and Gibbons 2015: 815). For example, they flooded into eastern and central Europe and created the Corded Ware culture (c. 3100–1900 BC) (see the map here). Their descendants appear to have arrived in Greece from 2400–2000 BC, bringing with them the Proto-Greek language that would evolve into Mycenaean Greek and then the later Greek dialects of Classical Greece.

Another fascinating part of forgotten history is how the Indo-European languages have displaced what almost certainly must have been non-Indo-European languages in Europe spoken by the Palaeolithic and Mesolithic hunter-gatherers and Neolithic Anatolian farmers and their descendants.

For example, as late as the early Roman Republic there appears to have been a large surviving group of non-Indo-European languages in Europe as follows:
(1) the proposed Vasconic group of languages, including the extinct Aquitanian language (from which the modern survivor Basque is derived), the Iberian languages, Tartessian, and possibly the Lusitanian language (see the map here);

(2) the Tyrsenian languages, including Etruscan, Raetic, and the Lemnian language;

(3) the Paleo-Sardinian language, the Sicanian language, the hypothetical non-Indo-European Germanic substrate language, the pre-Greek substrate language, and the Eteocretan language, and perhaps the Pictish language.
At any rate, it is now firmly accepted that the ancient non-Indo-European Aquitanian language was the direct ancestor of modern Basque, and Aquitanian was spoken in large areas of south-western France, northern Spain and in the Pyrenees (Trask 1995: 87; see the map here). In turn, it would appear plausible that the ancient non-Indo-European Iberian language in Spain was related to ancient Aquitanian, particularly on the basis of recent evidence relating to the numerals of both languages (and, more speculatively, to Tartessian as well).

These mysterious languages seem to have been the descendants of the ancient language of the Neolithic Anatolian farmers.

Of course, some scholars like Colin Renfrew have proposed the Anatolian hypothesis and argued that the Neolithic Anatolian farmers already spoke Proto-Indo-European and hence brought Indo-European languages to Europe (see Renfrew 2003).

If true, then the Indo-European Yamnaya-culture people must have brought a later offshoot, perhaps a proto-Balto-Slavic language (Balter and Gibbons 2015: 815), and the non-Indo-European language substrate in Europe must have been descended from the languages of the Palaeolithic and Mesolithic hunter-gatherers.

That would make the Basque language a truly ancient language descended from the ancient Mesolithic hunter-gatherer languages of Europe.

The question of the origins and classification of Basque is very interesting indeed. Anyone who has a decent knowledge of European languages can see that Basque is an alien and very weird language, unconnected to any other language in Europe, as, for instance, in a random selection of Basque words:
Arrigorriagakoa (a surname)
Goikoetxea (“high lying house,” a surname)
Etxandi (another surname)
ilargi (“moon”)
arrantzale (“fisherman”)
eguzki (“sun”).
These words look bizarre to the modern European eye because Basque is clearly derived from some ancient non-Indo-European language. (In fact, having myself done some ancient Near Eastern languages, these words remind me of Sumerian or Akkadian).

But recent genetic study of both ancient and modern Basques suggests that they are mostly descended from the ancient Neolithic Anatolian farmers and so their mysterious language may well be derived from the ancient Neolithic Anatolian farmer language of the Middle East (Günther et al. 2015: 11920). If so, this suggests that the Anatolian hypothesis is wrong: the Yamnaya-culture people from the steppe were the proto-Indo-European speakers from whose language all other Indo-European languages in Europe have derived.

Allentoft, Morten E. et al. 2015. “Population Genomics of Bronze Age Eurasia,” Nature 522 (11 June): 167–172.

Balter, Michael and Ann Gibbons. 2015. “Indo-European Languages tied to Herders,” Science 347.6224: 814–815.

Gibbons, Ann. 2014. “Three-Part Ancestry for Europeans,” Science 345.6201 (5 September): 1106–1107.

Günther, Torsten et al. 2015. “Ancient Genomes link Early Farmers from Atapuerca in Spain to Modern-Day Basques,” Proceedings of the National Academy of Sciences 112.38: 11917–11922.

Haak, Wolfgang et al. 2015. “Massive Migration from the Steppe was a Source for Indo-European Languages in Europe,” Nature 522: 207–211.

Hofmanová, Zuzana et al. 2016. “Early Farmers from across Europe directly descended from Neolithic Aegeans,” Proceedings of the National Academy of Sciences of the United States of America, June 6.

Jones, Eppie R. et al. 2015. “Upper Palaeolithic Genomes reveal Deep Roots of Modern Eurasians,” Nature Communications 6.

Mathieson, Iain et al. 2015. “Genome-Wide Patterns of Selection in 230 Ancient Eurasians,” Nature 528.7583: 499–503.

Prüfer, K. et al. 2014. “The Complete Genome Sequence of a Neanderthal from the Altai Mountains,” Nature 505.7481: 43–49.

Renfrew, Colin. 2003. “Time Depth, Convergence Theory, and Innovation in Proto-Indo-European: ‘Old Europe’ as a PIE Linguistic Area,” in Alfred Bammesberger and Theo Vennemann (eds.), Languages in Prehistoric Europe. Universitätsverlag, Heidelberg. 17–48.

Trask, Robert Lawrence. 1995. “Origins and Relatives of the Basque Language: Review of the Evidence,” in José Ignacio Hualde, Joseba A. Lakarra, R. L. Trask (eds), Towards a History of the Basque Language. J. Benjamins, Amsterdam and Philadelphia. 65–99.


  1. Very interesting stuff here. Although, I thought it was undetermined if the Tyrsenian languages were completely unrelated to Indo-European or may have more distant connections (such as to Georgian or other Caucasian or Anatolian languages).

    That also makes me wonder whether or not the Trojan myths surrounding the foundation of Rome are true. As you know, Hittite is the oldest attested Indo-European language and one of the better documented at that. The Hittites were "in the land of Hatti", and while a lot of names and places were in the old Hattian tongue, Hittite displaced Hattian, and the two are considered to be very different. Either way, in the western portions of the Hittite Empire, there is also another decently well-documented language, Luwian, also Indo-European and closely related to Mycenaean Greek and Hittite, and Troy (which was an ally of the Hittite Empire) is believed to have spoken a dialect of Luwian.

    So perhaps the Indo-European roots of Rome and the surrounding environment actually did come from a Trojan exodus after the destruction of Troy by the Greeks.

    Although who knows, as the Greeks had already heavily populated the southern boot of Italy and Sicily by that point; and the Umbrian language family in central, north-central and south-central Italy was said to have been Indo-European as well - it could've come from Greek origins as well.

    Then again, the Lithuanian language has more similarities to Latin than it does to Greek (not to mention Sanskrit, which suggests the Yamnaya migration patterns west and east), so perhaps the migration did not affect Etruscan settlements (or bypassed them altogether, migrating via Dalmatia or Albania).

    Anyways, interesting stuff there. Also interesting that Celtic languages existed in northwestern Spain, Gaul, and the British Isles, with Aquitanian, Iberian, and Vasconic languages separating the Celts of Spain and Gaul.

    1. (1) As far as I know the Tyrsenian languages (e.g., Raetic and Etruscan) are clearly not Indo-European. H. Rix (Rätisch und Etruskisch, 1998) shows Raetic is definitely related in some way to Etruscan, and in turn Lemnian (spoken on the Aegean island of Lemnos before 500 BC) is a cognate language and all probably stem from a proto-Tyrrhenian language of the Bronze age, probably from Anatolia.

      Speculative research (often dismissed by modern linguists) suggests that the Vasconic languages (Aquitanian, Iberian, etc.) are a branch of some larger ancient Anatolian/Near Eastern language family called the T Group Dene Caucasian:

      This T Group Dene Caucasian is supposed to include the Hattic, Hurro-Urartian, and Northeast Caucasian languages of the Near East.

      Is this the ancient language of the Neolithic Anatolian farmers?

      Does Hurro-Urartian (also part of this family) have some relationship to Sumerian? (which we know is another mysterious language isolate):

      (2) as for Trojan migration to Rome, maybe something like this happened, but Latin is clearly Italic Indo-European, not Luwian-derived.

      (3) it amazes me there isn't more work on the Vasconic languages and the non-Indo-European substrate language, since these must be derived from very, very ancient Near Eastern languages.

    2. Thanks for the comment, by the way. I am surprised you have such knowledge of the subject!

    3. Interesting stuff here.

      1) Etruscan is an agglutinative language, like Sumerian, Georgian, or what we believe Hattic to have been. This suggests that there might be some similarity and connection between these different languages, perhaps suggesting that a Sumerian-related people were behind the Anatolian-Aegean migration. It would be interesting if some more work were done on possible connections between Hattic, Georgian/Kartvelian, Sumerian and Etruscan. Of course, Etruscan, like any other language at the time, probably absorbed a lot from newer waves of Indo-European migration, whether that came from the Gauls, the ancestors of the Italic Indo-Europeans (and possibly the Luwians), or the Greeks, and also of course from the Semitic Phoenicians.

      2) True, it certainly seems that way, and without a doubt the survivors of a Trojan exodus could not have been very numerous (certainly fewer than 10,000). And of course it's likely that they would have largely blended into the existing language at the time (think: Bulgars --> Slavic or Franks --> Vulgar Latin) rather than overtaking it wholesale and establishing linguistic hegemony (think: Hungarian). Still, it raises some questions about the Italic languages we're aware of: Why did Etruscan survive, while the rest of Central Italy spoke Italic Indo-European? What was this Italic Indo-European based on: Greek colonists or other migration patterns? And what were the main differences between the Latino-Faliscan languages and the rest of the Italic IE languages (and why - which is where that Luwian hypothesis could enter)?

      3) Definitely, and that should also include further work on the Etruscan languages as well (and probably Phoenician too). If only there were more public and private funding for these kinds of endeavours - perhaps these are the kinds of things humans will be paid to do to maintain full employment after robotization displaces more manufacturing and services!

      By the way, I once thought of trying to learn Sumerian/Akkadian. Is it a rewarding intellectual endeavour, or just lost time?