Project description
Understanding word families through computational modelling
Words that share a common origin, within a single language or across different languages, are grouped into word families. Through the dynamics of language use, these families interact and evolve, a fact that remains largely ignored in the language sciences. The EU-funded ProduSemy project aims to create computational models to standardise data on word families across different languages. These models will be applied to data from historical, typological and cognitive linguistics, and will shed light on the many ways in which word families are composed and structured in these disciplines. In this way, the project will contribute to the integration of methods and data across linguistics, cognitive science and psychology.
Objective
All human languages have simple and complex words. Simple words refer to meanings regardless of their form, while complex words are formed from other words, and their formation can be semantically motivated. Since words can share lexical material, we can group them into families. Word families vary greatly in size, from small ones comprising only a few members to large ones spanning several hundred words, but it is still unclear why some words are more productive than others in forming new words. Lexical compositionality has received some attention in historical linguistics, linguistic typology, and cognitive linguistics, but studies so far have mostly concentrated on the morphological complexity of individual words and languages, and have typically ignored the fact that words form families which interact during language change and language use. As a result, many questions regarding word family formation remain unresolved. We do not know 1) how word families evolve along language phylogenies, 2) which semantic processes underlying word family formation are universal, and 3) to what extent human cognition influences the productivity of lexical roots in forming families. The project will tackle these three questions by unifying evolutionary, typological, and cognitive insights into lexical compositionality. Building on a computer-assisted framework that reconciles classical and computational approaches in historical linguistics and linguistic typology, the project will design new models to standardize cross-linguistic data on word families, apply them to integrate data from historical linguistics, linguistic typology, and cognitive linguistics, and develop new methods for the computer-assisted inference of word families, their underlying motivation patterns, and their evolutionary histories in large datasets. In this way, the project will deepen the integration of cross-linguistic studies into the cognitive and psychological sciences.
Field of science
Programme(s)
- HORIZON.1.1 - European Research Council (ERC) Main Programme
Funding scheme
ERC - Support for frontier research (ERC)
Host institution
94032 Passau
Germany