CORDIS - EU research results

Mechanisms of Early Language Acquisition

Periodic Reporting for period 1 - MELA (Mechanisms of Early Language Acquisition)

Reporting period: 2023-01-09 to 2025-01-08

One of the most exciting moments in every child’s development is when they start to talk. It is astounding to observe the speed with which young children add words to their growing vocabularies. Things become even more interesting when they start to communicate their ideas by combining words, as in “Baby drink”. This signals that the child has learned not only the individual words, but also the links that connect words based on their meaning. Successful communication in humans critically depends on these two aspects of language development: how many words we know (vocabulary growth) and whether we know how to combine them to communicate meaningful ideas (semantic organization). The current research was designed to examine the role of the early language environment in shaping these two aspects of language development. More specifically, it was designed to uncover whether early language development in individual children can be predicted from the statistical structure of their language input, i.e. the co-occurrence regularities with which words are used in daily child-parent interactions. By shedding light on the cues and mechanisms guiding early language acquisition, the research promises to have important theoretical as well as practical implications, as it may support the development of theory-driven interventions for children who are on a slower developmental path.
The MELA project has two main objectives. The first is to examine whether early vocabulary growth can be predicted from the co-occurrence structure of children’s language input. The second is to test whether early semantic links in children’s mental lexicon can be predicted from the co-occurrence structure of their language input.
The MELA project tracks the early language development of bilingual Basque-Spanish children from the prelinguistic period until they are expected to produce their first word combinations (9-18 months). During this time, language input and language output (vocabulary growth and semantic organization) are continually assessed for each individual child. This makes it possible to test the power of available language acquisition methods to map the structure of individual parents’ speech to the individual language development of their children, beyond the predictions that can be made about general trends in early development. One important aspect of the current work is that it embraces the complexity of language development in children growing up bilingual. It further contributes to the literature by studying the acquisition of two languages, Basque and Spanish, that are underrepresented in it. These challenges resulted in several important achievements that go hand in hand with MELA’s theoretical aims.
For the MELA project, we first recruited 60 infants growing up in a bilingual (Basque-Spanish) environment. To support data collection, we developed a bilingual vocabulary assessment tool in collaboration with Basque and Spanish linguists. For each child, we assessed which words they are able to understand and say in both languages at the ages of 11, 14 and 17 months. In parallel, we collected day-long recordings of the language children hear in their home environments at the ages of 10, 13 and 16 months (32 hours per time point). Given that neither a corpus of child-directed speech nor tools for its analysis (e.g. transcription tools for Basque-Spanish mixed speech) existed for this population, we developed pipelines for speech transcription, data analysis and data sharing in collaboration with natural language processing engineers and experts in language analysis. Finally, all children are tested in the lab at the ages of 12, 15 and 18 months, when we collect information about the child’s language background, language learning activities and, most importantly, lexical and semantic processing abilities, that is, the child’s ability to recognize (1) the visual referent of a spoken word, and (2) a semantic relationship between visual referents and spoken words. At the time of submitting this report, 80% of data collection is completed (i.e. 25 families have fully completed the study, 6 dropped out before completion, and 29 are still being tested, with the last family scheduled to complete the study by July 2025).
The current project makes two important contributions to the broad language research community, beyond its contributions to specific research questions. First, by transcribing the recordings of early bilingual language input and sharing the transcripts, the project will contribute a unique data set to the most influential resource on language input, the CHILDES database of child-directed speech (MacWhinney, 2000). Second, data on bilingual Basque-Spanish vocabulary development will be contributed to the WordBank repository (Frank et al., 2016). Both contributions will make data on early bilingual language development, which are still extremely rare, available to the broad scientific community, following the highest standards of open science and participant privacy protection.
Although our data collection is still ongoing, there are several important preliminary findings that speak to our research aims. First, we find that the measures of language input that best predict early bilingual vocabulary development are not measures of the mere amount of speech children hear at home, but measures of the intensity of child-parent interactions (i.e. the number of conversational turns). Further, we find that one-year-old bilinguals are not only able to map a word (label) to an object, but are also sensitive to semantic relations between words and objects. These findings suggest that, although still modest in size, the vocabularies of one-year-old bilinguals are semantically organized, with words in their mental lexicon connected based on overlap in meaning. Critically, our experimental design provides evidence that word co-occurrence in the input (as measured in an available monolingual Spanish corpus) supports the formation of early word links. The next step will be to test whether we can predict the emergence of different types of semantic relations in the lexico-semantic development of individual children from the word co-occurrence regularities present in their specific language experience. Finally, our analyses of the predictors of monolingual vocabulary development, based on available data in Spanish and five more languages (corpora of child-directed speech, MacWhinney, 2000; age-of-acquisition data, Frank et al., 2016), demonstrate that in early development the diversity of contexts in which words co-occur predicts the order in which words are learned, with words being acquired earlier if they occur in consistent contexts (Unger, Chang, Savic, Bergen, & Sloutsky, 2024). We will further test whether the same patterns hold for bilingual development and whether the same principles explain individual vocabulary development.
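To make the notions of word co-occurrence and contextual diversity used above more concrete, the following is a minimal illustrative sketch in Python. It is not the project's analysis pipeline: the function names, the toy utterances, and the choice to treat a whole utterance as the co-occurrence window and to measure diversity as the number of distinct co-occurring words are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(utterances):
    """Count how often each unordered word pair appears in the same utterance.

    Assumption: the co-occurrence window is the whole utterance.
    """
    counts = Counter()
    for utterance in utterances:
        # Use each word once per utterance; sort so pairs have a stable order.
        words = sorted(set(utterance.lower().split()))
        counts.update(combinations(words, 2))
    return counts


def contextual_diversity(utterances):
    """Return, for each word, the number of distinct words it co-occurs with.

    Assumption: this is one simple proxy for diversity of contexts.
    """
    neighbours = defaultdict(set)
    for utterance in utterances:
        words = set(utterance.lower().split())
        for word in words:
            neighbours[word] |= words - {word}
    return {word: len(others) for word, others in neighbours.items()}


# Hypothetical toy input standing in for transcribed child-directed speech.
toy_utterances = [
    "baby drink milk",
    "baby drink water",
    "baby sleep",
    "dog drink water",
]

pair_counts = cooccurrence_counts(toy_utterances)
diversity = contextual_diversity(toy_utterances)
```

In this toy input, "sleep" only ever appears alongside "baby", so it occurs in a more consistent context than "drink", which appears with several different words. Real transcripts would additionally require tokenization and language tagging suited to Basque-Spanish mixed speech.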