Provable Scalability for high-dimensional Bayesian Learning

Project description

Improving the scalability of Bayesian learning methods

Bayesian learning algorithms, including Markov chain Monte Carlo methods, are widely used in modeling frameworks such as hierarchical and high-dimensional models. As the scale and complexity of available data increase, these statistical methods must be computationally scalable to remain viable. The ERC-funded PrSc-HDBayLe project will address this challenge by deriving a broad collection of results for commonly used Bayesian computation algorithms, with a particular focus on Markov chain Monte Carlo methods. These will be applied to various modeling frameworks commonly used for statistical tasks. The results will support the design of new, more scalable computational approaches and the optimization of existing ones, putting theory into practice in Bayesian computation.

Objective

As the scale and complexity of available data increase, developing a rigorous understanding of the computational properties of statistical procedures has become a key scientific priority of our century. In line with this priority, the project develops a mathematical theory of computational scalability for Bayesian learning methods, with a focus on widely used high-dimensional and hierarchical models.

Unlike most recent literature, we will integrate computational and statistical aspects in the analysis of Bayesian learning algorithms, providing novel insight into the interaction between commonly used model structures and fitting algorithms. Key methodological breakthroughs will include a novel connection between computational algorithms for hierarchical models and random walks on the associated graphical models; the use of statistical asymptotics to derive computational scalability statements; and novel understanding of the computational implications of model misspecification and data heterogeneity.

We will derive a broad collection of results for popular Bayesian computation algorithms, especially Markov chain Monte Carlo algorithms, in a variety of modeling frameworks, such as random-effect, shrinkage, hierarchical and nonparametric models. These are routinely used for statistical tasks such as multilevel regression, factor analysis and variable selection, in disciplines ranging from political science to genomics. Our theoretical results will have direct implications for the design of novel and more scalable computational schemes, as well as for the optimization of existing ones. Particular focus will be given to developing algorithms whose overall cost is provably linear in both the number of datapoints and the number of unknown parameters. These contributions will substantially reduce the gap between theory and practice in Bayesian computation and make it possible to benefit fully from the potential of the Bayesian paradigm.

Host institution

UNIVERSITA COMMERCIALE LUIGI BOCCONI

Net EU contribution

€ 1 488 673,00

Address

VIA SARFATTI 25
20136 Milano
Italy

Region

Nord-Ovest Lombardia Milano

Activity type

Higher or Secondary Education Establishments

Total cost

€ 1 488 673,00

Beneficiaries (1)

UNIVERSITA COMMERCIALE LUIGI BOCCONI

Italy

Net EU contribution

€ 1 488 673,00
