Periodic Reporting for period 1 - MELA (Mechanisms of Early Language Acquisition)
Reporting period: 2023-01-09 to 2025-01-08
The MELA project has two main objectives. The first is to examine whether early vocabulary growth can be predicted from the co-occurrence structure of children’s language input. The second is to test whether early semantic links in children’s mental lexicon can be predicted from the co-occurrence structure of their language input.
For the MELA project, we first recruited 60 infants growing up in a bilingual (Basque-Spanish) environment. To support data collection, we developed a bilingual vocabulary assessment tool in collaboration with Basque and Spanish linguists. For each child, we assessed which words they understand and say in both languages at the ages of 11, 14 and 17 months. In parallel, we collected day-long recordings of the language children hear in their home environments at the ages of 10, 13 and 16 months (32 hours per time point). Because neither a corpus of child-directed speech nor tools for its analysis (e.g. transcription tools for Basque-Spanish mixed speech) existed for this population, we developed pipelines for speech transcription, data analysis and data sharing in collaboration with natural language processing engineers and experts in language analysis. Finally, all children are tested in the lab at the ages of 12, 15 and 18 months, when we collect information about each child’s language background, language learning activities and, most importantly, lexical and semantic processing abilities, that is, the child’s ability to recognize (1) the visual referent of a spoken word and (2) a semantic relationship between visual referents and spoken words. At the time of submitting this report, 80% of data collection is complete (i.e. 25 families have fully completed the study, 6 dropped out before completion, and 29 are still being tested, with the last family scheduled to complete the study by July 2025).
Although data collection is still ongoing, several important preliminary findings already speak to our research aims. First, we find that the measures of language input that best predict early bilingual vocabulary development are not measures of the sheer amount of speech children hear at home, but measures of the intensity of child-parent interaction (i.e. the number of conversational turns). Further, we find that one-year-old bilinguals are not only able to map a word (label) to an object, but are also sensitive to semantic relations between words and objects. These findings suggest that, although still modest in size, the vocabularies of one-year-old bilinguals are semantically organized, with words in the mental lexicon connected on the basis of overlap in meaning. Critically, our experimental design yields evidence that word co-occurrence in the input (as measured in an available monolingual Spanish corpus) supports the formation of early word links. The next step will be to test whether the emergence of different types of semantic relations in the lexico-semantic development of individual children can be predicted from the word co-occurrence regularities present in their specific language experience. Finally, our analyses of the predictors of monolingual vocabulary development, based on available data in Spanish and five more languages (corpora of child-directed speech, MacWhinney, 2000; age-of-acquisition data, Frank et al., 2016), demonstrate that in early development the diversity of contexts in which words co-occur predicts the order in which words are learned, with words being acquired earlier if they occur in consistent contexts (Unger, Chang, Savic, Bergen, & Sloutsky, 2024). We will further test whether the same patterns hold for bilingual development and whether the same principles explain individual vocabulary development.