Beyond its contributions to specific research questions, the current project makes two important contributions to the broader language research community. First, by transcribing recordings of early bilingual language input and sharing the transcripts, the project will contribute a unique data set to the most influential resource on language input, the CHILDES database of child-directed speech (MacWhinney, 2000). Second, data on bilingual Basque-Spanish vocabulary development will be contributed to the WordBank repository (Frank et al., 2016). Both contributions will make data on early bilingual language development, which remain extremely rare, available to the broader scientific community, following the highest standards of open science and participant privacy protection.
Although our data collection is still ongoing, several important preliminary findings already speak to our research aims. First, we find that the measures of language input that best predict early bilingual vocabulary development are not measures of the mere amount of speech children hear at home, but measures of the intensity of child-parent interaction (i.e., the number of conversational turns). Further, we find that one-year-old bilinguals are not only able to map a word (label) to an object, but are also sensitive to semantic relations between words and objects. These findings suggest that, although still modest in size, the vocabularies of one-year-old bilinguals are semantically organized, with words in the mental lexicon connected on the basis of overlap in meaning. Critically, our experimental design provides evidence that word co-occurrence in the input (as measured in an available monolingual Spanish corpus) supports the formation of early word links. The next step will be to test whether the word co-occurrence regularities present in individual children's specific language experience predict the emergence of different types of semantic relations in their lexico-semantic development. Finally, our analyses of the predictors of monolingual vocabulary development, based on available data in Spanish and five more languages (corpora of child-directed speech, MacWhinney, 2000; age-of-acquisition data, Frank et al., 2016), demonstrate that in early development the diversity of contexts in which words co-occur predicts the order in which words are learned, with words acquired earlier if they occur in consistent contexts (Unger, Chang, Savic, Bergen, & Sloutsky, 2024). We will further test whether the same patterns hold for bilingual development and whether the same principles explain individual vocabulary development.