Syntax shaped by cognition: transforming theories of syntactic systems through laboratory experiments

Periodic Reporting for period 2 - SYNCOG (Syntax shaped by cognition: transforming theories of syntactic systems through laboratory experiments)

Reporting period: 2019-08-01 to 2021-01-31

"Human language is incredibly diverse: languages differ at all levels of linguistic structure from phonetics to syntax. But behind these differences there are intriguing similarities, patterns that reappear across many languages, and other that rarely crop up. A foundational goal of linguistics is to distil a set of principles explaining the shared features of our languages by appealing to properties of the human cognitive and linguistic systems. While many such principles have been formulated, throughout the history of the field little direct behavioural evidence has been offered for them. Indeed, the connection between common features of language systems and cognition is controversial in the broader community of scientists studying language from different perspectives. This project develops innovative experimental paradigms to investigate hypothesized constraints on language directly across a numbers of domains in linguistics. These include recurring patterns in person and number systems (e.g. instantiated in pronoun paradigms like English ""I"", ""you"", ""they""); common strategies for morpheme ordering (e.g. number morphemes come closer to the noun than case morphemes); structural tendencies for ordering words (e.g. consistent ordering of syntactic heads and dependents); and more. The experimental paradigms we use explore how artificially constructed linguistic mini-systems are learned and how new such systems are innovated or created in the lab. We look at how cognitive development affects these processes, and explore whether they are part of a specialized system for language in the human brain, or instead reflect more general cognitive mechanisms. Understanding why there are common patterns among the world's languages in these domains will allow us to develop more robust, empirically-based theories of human language."
Since the start of the project, we have initiated experimental investigations of linguistic regularities in 8 different linguistic phenomena. In all cases, our team is the first to apply artificial language learning techniques, and the first to report behavioral evidence bearing on how the cognitive system shapes language in these domains. Our main results so far are outlined below.

(1) Theories of person/number systems (e.g. instantiated in pronoun systems like English 'I', 'you', 'they') typically posit features designed to represent all and only the types of systems found across languages. For example, a pronoun like 'I' has been claimed to be represented as +speaker, –addressee; 'you' as –speaker, +addressee; 'they' as –speaker, –addressee. However, there has been no clear behavioral evidence that people in fact represent person/number in terms of features. We have now provided the first such evidence, showing that adult learners trained on miniature artificial pronoun systems behave in a way that accords with the predictions of these feature-based theories. We also find that they are more likely to infer pronominal paradigms in which homophony (shared forms for distinct meanings) targets featurally related meanings. These preferences are generally in line with the (sparse) cross-linguistic frequency of different pronoun paradigms, but also pose some challenges to existing theories.
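The feature decomposition described above can be illustrated with a minimal sketch. It encodes only the three feature bundles given in the text; the names PRONOUN_FEATURES and shared_features are hypothetical, and counting shared feature values is just one simple way to approximate "featurally related", not the measure used in the project's experiments.

```python
# Feature bundles exactly as stated in the text: 'I' is +speaker, -addressee, etc.
# The dictionary name and layout are illustrative assumptions.
PRONOUN_FEATURES = {
    "I": {"speaker": True, "addressee": False},
    "you": {"speaker": False, "addressee": True},
    "they": {"speaker": False, "addressee": False},
}


def shared_features(a, b):
    """Return the names of the features on which pronouns a and b have the same value."""
    fa, fb = PRONOUN_FEATURES[a], PRONOUN_FEATURES[b]
    return {name for name in fa if fa[name] == fb[name]}


if __name__ == "__main__":
    for pair in [("I", "you"), ("I", "they"), ("you", "they")]:
        print(pair, sorted(shared_features(*pair)))
```

Under this toy encoding, 'I' and 'they' share the value of addressee, 'you' and 'they' share the value of speaker, and 'I' and 'you' share no feature value, so a single form covering 'I' and 'you' would not target featurally related meanings.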

(2) Using artificial language learning experiments, we have shown that adult and child learners use distinct strategies for learning noun classification systems (e.g. grammatical gender): adults rely more on semantic cues to class (e.g. animacy), while children rely more on form-based (phonological) cues (e.g. noun endings). This supports observations from natural language learning, in which children appear to over-rely on phonological cues to class, and suggests a path by which noun classification systems evolve from semantics-based to form-based over time.

(3) In the vast majority of conventionalized languages, nouns and nominal modifiers are ordered such that adjectives are closest to the noun (directly before or after), demonstratives are farthest from the noun, and number words come in between. For example, English has 'these three spotted pencils', and Thai has the equivalent of 'pencils spotted three these'. We have found that when adults improvise novel systems of communication in the lab using their hands (the 'silent gesture' paradigm), they order information in a way that mirrors this, suggesting that the origins of this tendency among languages may come from how these elements are represented conceptually. We further show that this kind of conceptual representation can be learned from observing objects and their properties in the environment; using corpora as a proxy for the world, we find that an information-theoretic measure of strength of association reveals strong associations between adjectives and nouns, weaker associations between numerals and nouns, and very little association between demonstratives and nouns. This provides evidence to the learner of an underlying structure, which can then be used to generate the frequently observed word order patterns found in languages like English and Thai.
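The report does not name the information-theoretic measure it used. As an illustration only, the sketch below assumes pointwise mutual information (PMI), a common measure of association strength, computed over (modifier, noun) pairs. The function name pmi_table and the toy pairs are invented for this example and are not the project's data.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Compute PMI in bits for every observed (modifier, noun) pair.

    PMI(m, n) = log2( p(m, n) / (p(m) * p(n)) ), with probabilities
    estimated as relative frequencies over the list of pairs.
    """
    pair_counts = Counter(pairs)
    mod_counts = Counter(m for m, _ in pairs)
    noun_counts = Counter(n for _, n in pairs)
    total = len(pairs)
    return {
        (m, n): math.log2(
            (c / total) / ((mod_counts[m] / total) * (noun_counts[n] / total))
        )
        for (m, n), c in pair_counts.items()
    }


# Invented toy data: each adjective occurs with one noun only, while the
# demonstrative 'these' occurs equally often with both nouns.
toy_pairs = [
    ("spotted", "pencil"), ("spotted", "pencil"),
    ("ripe", "apple"), ("ripe", "apple"),
    ("these", "pencil"), ("these", "apple"),
    ("these", "pencil"), ("these", "apple"),
]

scores = pmi_table(toy_pairs)

if __name__ == "__main__":
    for pair, value in sorted(scores.items()):
        print(pair, round(value, 3))
```

In this toy sample the adjective-noun pairs receive a PMI of 1 bit, while the demonstrative-noun pairs receive 0 bits because 'these' is equally likely with either noun. This reproduces in miniature the kind of contrast described above, though real corpus estimates would need far more data and some smoothing.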

(4) One of the best-known linguistic regularities relates to how languages order syntactic heads and dependents across different types of phrases. For example, a verb is the syntactic head in a verb phrase, and it takes an object noun as its dependent (e.g. 'eat an apple'). Similarly, a preposition is the syntactic head in a prepositional phrase, and takes a noun as its dependent (e.g. 'on the table'). The languages of the world tend to order these elements consistently: if the verb comes first in its phrase, then so does the preposition, and vice versa. This is called cross-category 'harmony'. Linguists disagree about whether harmony in language is the result of a cognitive bias for harmonic (consistent) order, or due to processes of language change independent of the cognitive system. We have found evidence that learners acquiring a miniature artificial language prefer harmonic patterns, suggesting a role for cognition (via learning) in explaining this well-known tendency.
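To make the definition of harmony concrete, the following minimal sketch enumerates the four logically possible combinations of verb and adposition placement and labels each as harmonic or not. The function names are hypothetical, and the sketch only restates the definition given above; it says nothing about how often each type occurs or how the experiments were run.

```python
from itertools import product


def is_harmonic(verb_first, adposition_first):
    """A language is harmonic here if heads are placed consistently,
    i.e. both the verb and the adposition precede (or both follow) their dependents."""
    return verb_first == adposition_first


def classify_all():
    """Label every combination of (verb_first, adposition_first)."""
    return {
        (v, p): "harmonic" if is_harmonic(v, p) else "non-harmonic"
        for v, p in product([True, False], repeat=2)
    }


if __name__ == "__main__":
    for (v, p), label in classify_all().items():
        print(f"verb first={v}, adposition first={p}: {label}")
```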

(5) In languages with rich inflectional morphology (e.g. systems of prefixes or suffixes indicating information like person, number, gender, case, etc.), certain ordering regularities have been observed. For example, number tends to be ordered closer to the noun stem than case; person tends to be ordered linearly first, before number. However, sparse cross-linguistic data make it difficult to determine whether these patterns are reliable, and to date there has been no evidence that they reflect pressures coming from the human cognitive system. We have used laboratory experiments with miniature artificial languages to show that adult learners do indeed prefer systems of order that accord with these typological observations, suggesting that patterns of morpheme order are influenced by the biases of language learners.
The main aim of this project is to bring novel experimental evidence to bear on longstanding debates concerning whether and how languages are shaped by the human cognitive or linguistic system. All of the behavioral findings outlined above are thus beyond the state of the art of linguistics, which has traditionally used cross-linguistic samples as evidence for constraints on our linguistic capacity. While the kinds of experiment we conduct are more widespread in work on phonology (sound systems), our project is the first to develop and use these methods widely in syntax and morphology. In most cases, our results provide the first clear behavioral evidence connecting specific linguistic regularities to mechanisms of human learning. We expect to continue generating these kinds of novel results. In the coming years, we will also explore whether there are developmental differences in learning that have implications for understanding how cognition is linked to language evolution and change, since it is through these processes that potentially subtle biases in our cognitive system come to shape language over time. In addition, we will explore whether some (or all) of the biases we uncover might be domain-general; although mainstream linguistic theories often posit constraints on the generative capacity of the human language faculty, it is very plausible that the forces shaping language are not unique to any language faculty, but represent general cognitive pressures that would also be observed in the processing or learning of non-linguistic patterns. Both of these upcoming strands of the project use the cutting-edge methodologies we have been developing thus far, and will provide novel sources of evidence to push the field forward.