The Evolution of Human Languages

Final Report Summary - MOTHERTONGUE (The Evolution of Human Languages)

In the MotherTongue project, we set out to understand how human languages evolve on two very different timescales. One is the short timescale of how people come to select the words they use in their everyday speech, and how they reach agreement or consensus on those words. The other is the longer timescale of how languages diverge over centuries and even millennia – how do they retain traces of their ancestry for so long, which words and which sounds are likely to change faster than others, and can we reconstruct ancient languages from this knowledge?

Words are, for the most part, an arbitrary pairing between a sound and a meaning: the sound ‘couch’ is potentially just as good as the sound ‘sofa’, and yet British English speakers now tend to use ‘sofa’ to denote that piece of furniture while American English speakers prefer ‘couch’. Why? The question is all the more interesting because no one acts to coordinate the choices we make – we just make them spontaneously.

Using statistical modelling techniques adapted from the study of genes, we have discovered that speakers do not merely copy what others say. Rather, humans seem to adopt common forms, and we do so disproportionately to how often we hear them. This means that once a particular form, such as ‘sofa’, begins to become popular, people will tend to use it far more often. By this process, the word acquires a momentum that will eventually drive it to become the word that everyone uses. That momentum comes from our minds being actively converted to use this new word.
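As a rough illustration of what disproportionate adoption implies, the sketch below follows the share of speakers using one of two competing forms over repeated rounds of copying. It is a toy model and not the statistical model used in the project: the update rule, the `bias` parameter, and the function names are all assumptions made for this example. With `bias` equal to 1, a form is adopted exactly in proportion to how often it is heard and its share does not move; with `bias` above 1, the more common form gains ground every round.

```python
from typing import List


def next_share(share: float, bias: float) -> float:
    """Share of speakers using a form after one round of copying.

    Hypothetical rule: each form is adopted with weight (its share) ** bias.
    bias == 1.0 means adoption proportional to how often the form is heard;
    bias > 1.0 means common forms are adopted disproportionately often.
    """
    weight_form = share ** bias
    weight_rival = (1.0 - share) ** bias
    return weight_form / (weight_form + weight_rival)


def simulate(start_share: float, bias: float, rounds: int) -> List[float]:
    """Return the share of the form at the start and after each round."""
    shares = [start_share]
    for _ in range(rounds):
        shares.append(next_share(shares[-1], bias))
    return shares


if __name__ == "__main__":
    # A form that starts only slightly ahead (55% of speakers).
    for bias in (1.0, 1.5):
        trajectory = simulate(0.55, bias, 10)
        print(f"bias={bias}:", [round(s, 3) for s in trajectory])
```

In this toy setting a small initial lead is enough for the biased case to carry a form close to universal use, while the unbiased case leaves the two forms where they started.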

Once our minds have been converted this way, it becomes difficult for a new word to replace the old word, and this can explain why some words can last for centuries or even millennia. For example, the word ‘two’ in English is related to the sounds dos (Spanish), due (Italian), twee (Dutch), do (Hindi) as well as the German zwei.

Putting these results together, we can see that no central coordinator is needed for the agreement that we call language to happen – language just emerges spontaneously. Our results not only have implications for understanding language learning; they are also relevant to the emerging field of robotics, where computer scientists are trying to get robots to learn and use language effectively.

On the longer timescale of language divergence, we have been able to build a statistical model of how the typically 40-60 sounds or phones (basic sounds like ‘a’, ‘b’, ‘c’ and so on) that we use to construct our words change or evolve over long periods of time. For example, the English ‘water’ shares some of its sounds, but not all of them, with the German ‘wasser’.

Our statistical models can be applied to collections of languages, and when we do this we find that consonant sounds tend to change to other consonant sounds, and vowels tend to change to other vowels, but that consonants and vowels rarely change to each other. We also find that sounds that are made closer to each other in our mouths are more likely to change to each other. Thus, a ‘p’ sound is more likely to change to a ‘b’ than a ‘p’ is to change to a ‘g’.
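The two regularities described above can be pictured with a toy weighting scheme. The sketch below is purely illustrative and is not the fitted model from the project: the sound inventory, the position numbers, the cross-class penalty, and the function name are all made-up assumptions. It gives a pair of sounds a high weight when they belong to the same class and are made at nearby positions in the mouth, and a very low weight when one is a consonant and the other a vowel.

```python
# Hypothetical features: (class, rough position in the mouth, front to back).
# The positions are crude illustrative values, not phonetic measurements.
SOUND_FEATURES = {
    "p": ("consonant", 0),  # lips
    "b": ("consonant", 0),  # lips
    "t": ("consonant", 1),  # ridge behind the teeth
    "d": ("consonant", 1),
    "k": ("consonant", 2),  # back of the mouth
    "g": ("consonant", 2),
    "i": ("vowel", 0),      # front vowel
    "a": ("vowel", 1),
    "u": ("vowel", 2),      # back vowel
}

# Assumed penalty for a consonant changing into a vowel or vice versa.
CROSS_CLASS_WEIGHT = 0.01


def change_weight(source: str, target: str) -> float:
    """Relative (unnormalised) weight of `source` changing into `target`."""
    source_class, source_pos = SOUND_FEATURES[source]
    target_class, target_pos = SOUND_FEATURES[target]
    if source_class != target_class:
        return CROSS_CLASS_WEIGHT
    # Same class: the weight falls off with distance in the mouth.
    return 1.0 / (1.0 + abs(source_pos - target_pos))


if __name__ == "__main__":
    for target in ("b", "g", "a"):
        print(f"p -> {target}: {change_weight('p', target):.3f}")
```

Under these assumed numbers, ‘p’ to ‘b’ receives a larger weight than ‘p’ to ‘g’, and both receive far more than ‘p’ to a vowel, which mirrors the ordering reported in the text.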

We find that these general rules hold across languages from around the world, so that even though the words in these languages differ greatly, the way the languages evolve over long periods of time does not. Armed with this knowledge, we have been able to reconstruct the deep history of language families, and in our project we did this for the Turkic languages and the Bantu languages of Africa. We were also able to identify links among the many language families of Eurasia (from Western Europe to Siberia), inferring the existence of a common language, spoken around 15,000 years ago, from which all these languages descend.

Linguists and historians can use these results to reconstruct ancient languages spoken thousands of years ago, or to put a date on ancient texts by studying their language. We were, for example, able to date Homer's Iliad to 762 BC by studying how its language differed from Modern Greek.