A computational approach to early language bootstrapping

Project information

BOOTPHON

Grant agreement ID: 295810

Closed project

Start date: 1 November 2012

End date: 31 October 2017

Funded under

Specific programme: "Ideas" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)

Total cost

€ 2 194 557.48

EU contribution

€ 2 194 557.00

Coordinated by

ECOLE DES HAUTES ETUDES EN SCIENCES SOCIALES
France

Final Report Summary - BOOTPHON (A computational approach to early language bootstrapping)

One of the most fascinating facts about human infants concerns the speed at which they acquire their native language(s). During the first year alone, that is, before they start talking, infants reach impressive landmarks in two key language components. First, they tune into the phonetic categories (consonants and vowels) of their language. For instance, they lose the ability to distinguish some fine phonetic contrasts that belong to the same category, enhance their ability to distinguish some between-category contrasts, and refine their ability to ignore acoustic variations due to speaker characteristics. Second, infants learn to segment the speech stream, from large utterance-like prosodic units to smaller word-like ones, and start to learn some of the most frequent words in their language.
The ERC Bootphon project has found new evidence that these two achievements are not independent but interrelated. Indeed, it has shown, using machine learning techniques run on speech corpora, that consonants and vowels cannot be directly extracted from the speech signal. Instead, it is important to know at least some words in order to distinguish the important contrasts from the unimportant ones. Yet the ERC team has also found that it is very difficult to learn words without knowing which consonants and vowels the language has. The team proposes to solve this chicken-and-egg problem through the joint and gradual learning of approximations of consonants and vowels (proto-phonemes) and of words (proto-words). The algorithms proposed and analyzed by the team center on the discovery of linguistic units from raw speech, which opens up applications in the documentation of endangered languages and the construction of automatic speech recognition tools for languages without stable orthographies.
The ERC project has also demonstrated that infants' input is much noisier than previously thought. Indeed, it was thought that parents who adopt 'baby talk', a particular, exaggerated way of talking to infants, were simplifying the task for their infants. The ERC Bootphon project, in collaboration with several teams who record infant-directed input in the home, showed that baby talk is actually more confusing to a naive learner, because it is much more variable than adult-directed speech. The project also constructed a set of instruments for quantitatively measuring the language input available to infant learners. These instruments are now deployed in a large international collaboration studying how language directed to infants affects language learning across cultures.
