CORDIS - EU research results

Provable Scalability for high-dimensional Bayesian Learning

Project description

Improving the scalability of Bayesian learning methods

Computational Bayesian learning algorithms, including Markov chain Monte Carlo methods, are widely used across modeling frameworks such as hierarchical and high-dimensional models. As the scale and complexity of available data grow, these statistical methods must scale computationally to keep up. The PrSc-HDBayLe project, funded by the European Research Council, will address this challenge by deriving a broad collection of results for commonly used Bayesian computation algorithms, with particular attention to Markov chain Monte Carlo methods. The results will apply to a variety of modeling frameworks routinely used for statistical tasks. They will make it possible to design new, more scalable computational methods and to optimize existing ones, thereby putting the theory of Bayesian computation into practice.

Objective

As the scale and complexity of available data increase, developing a rigorous understanding of the computational properties of statistical procedures has become a key scientific priority of our century. In line with this priority, this project develops a mathematical theory of computational scalability for Bayesian learning methods, with a focus on widely used high-dimensional and hierarchical models.

Unlike most of the recent literature, we will integrate computational and statistical aspects in the analysis of Bayesian learning algorithms, providing novel insight into the interaction between commonly used model structures and fitting algorithms. Key methodological breakthroughs will include a novel connection between computational algorithms for hierarchical models and random walks on the associated graphical models; the use of statistical asymptotics to derive computational scalability statements; and a novel understanding of the computational implications of model misspecification and data heterogeneity.

We will derive a broad collection of results for popular Bayesian computation algorithms, especially Markov chain Monte Carlo methods, in a variety of modeling frameworks, such as random-effect, shrinkage, hierarchical and nonparametric models. These are routinely used for statistical tasks such as multilevel regression, factor analysis and variable selection, in disciplines ranging from political science to genomics. Our theoretical results will have direct implications for the design of novel and more scalable computational schemes, as well as for the optimization of existing ones. We will focus on developing algorithms whose overall cost is provably linear in both the number of datapoints and the number of unknown parameters. These contributions will dramatically reduce the gap between theory and practice in Bayesian computation and make it possible to benefit fully from the huge potential of the Bayesian paradigm.

Host institution

UNIVERSITA COMMERCIALE LUIGI BOCCONI
Net EU contribution
€ 1 488 673,00
Address
VIA SARFATTI 25
20136 Milano
Italy

Region
Nord-Ovest Lombardia Milano
Activity type
Higher or Secondary Education Establishments
Total cost
€ 1 488 673,00

Beneficiaries (1)