Periodic Reporting for period 1 - PrSc-HDBayLe (Provable Scalability for high-dimensional Bayesian Learning)
Reporting period: 2023-05-01 to 2025-10-31
Throughout the project, the research team has been tackling several open problems in Bayesian computation. In the first scientific theme, we develop a rigorous mathematical theory of the computational behavior of Bayesian hierarchical models, analyzing algorithmic performance in sparse, structured settings and advancing convergence diagnostics via dimensionality-reduction techniques for MCMC algorithms. In the second theme, we focus on the computational cost of high-dimensional Bayesian inference, linking Bayesian asymptotics with Markov chain theory to design provably scalable algorithms. Finally, we study the robustness of commonly used algorithms to data heterogeneity, model misspecification, and tail behavior, leading to the development of more stable and generalizable computational techniques.
The results have direct implications for the design of novel and more scalable computational schemes, as well as for the optimization of existing ones. Particular focus is given to developing algorithms whose overall cost is provably linear in both the number of datapoints and the number of unknown parameters. The project contributes to significantly reducing the gap between theory and practice in Bayesian computation, allowing practitioners to fully benefit from the huge potential of Bayesian and probabilistic modeling in popular data science pipelines.
- New interdisciplinary links between Bayesian asymptotics and MCMC complexity theory, under random data-generating assumptions. These links simplify existing complexity results and enable novel ones, for example in the context of large hierarchical models and latent variable models.
- Connections between Bayesian computation and random graph theory, used to analyze coordinate-wise algorithms for hierarchical models and answer fundamental questions regarding computational aspects of Bayesian hierarchical models, such as: how do coordinate-wise sampling and optimization algorithms perform on increasingly large and sparse models? How does the observation pattern affect convergence speed? Which algorithms perform well on average-case random designs? These results offer practical insights into the design and optimization of Gibbs samplers, optimization methods based on backfitting, and Coordinate Ascent Variational Inference (CAVI) methods, especially for crossed and nested models (a minimal Gibbs sampler sketch for a crossed model is given after this list).
- The first fully explicit, non-asymptotic analysis of the convergence rate of the popular and classical Gibbs sampler under log-concavity, in the form of an entropy contraction result (stated schematically after this list), with interesting and unexpected implications for the complexity theory of log-concave sampling methods.
- Novel results on zeroth-order parallel sampling methods, including results on the fundamental limitations of multiproposal MCMC methods, showing that (under appropriate assumptions) they can only achieve a logarithmic speed-up in the number of parallel workers (see the multiproposal sketch below); and the development of novel methodologies based on parallel-in-time integrators that provably achieve polynomial speed-ups.
- We developed various novel mathematical techniques, or improved existing ones, in order to analyze and improve Bayesian computational algorithms. For example, we introduced a new comparison technique for Markov chains, overcoming limitations of classical methods such as Peskun ordering. Our approach, based on “conditional capacitance,” enables rigorous comparison of hybrid samplers (e.g. Metropolis-within-Gibbs, sketched below) with their ideal counterparts, providing precise assessments of performance trade-offs.
- We developed scalable Bayesian computation tools, such as: conjugate gradient samplers for generalized linear mixed models (GLMMs), illustrated in the last sketch below; novel partially factorized variational inference algorithms for improved uncertainty quantification; novel algorithms for zeroth-order parallel sampling; and mixture importance sampling estimators for Bayesian cross-validation criteria. The corresponding code is freely available online (e.g. through GitHub repositories, R software packages, etc.).
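To make the coordinate-wise algorithms above concrete, here is a minimal sketch (not the project's actual implementation; model, names, and fixed precisions are illustrative) of a Gibbs sampler for a two-factor crossed random effects model y[n] = a[row[n]] + b[col[n]] + noise. The observation pattern, i.e. which (row, col) pairs are observed, is exactly the bipartite graph whose sparsity and connectivity govern the convergence speed in our analysis.

```python
import numpy as np

def gibbs_crossed(y, row, col, n_iter=1000, tau=1.0, tau_a=1.0, tau_b=1.0, seed=0):
    """Gibbs sampler for y[n] = a[row[n]] + b[col[n]] + N(0, 1/tau) noise,
    with priors a_i ~ N(0, 1/tau_a), b_j ~ N(0, 1/tau_b).

    Each coordinate is drawn from its exact Gaussian full conditional.
    Precisions (tau, tau_a, tau_b) are held fixed for simplicity; in
    practice they would receive their own updates.
    """
    rng = np.random.default_rng(seed)
    I, J = row.max() + 1, col.max() + 1
    a, b = np.zeros(I), np.zeros(J)
    samples = []
    for _ in range(n_iter):
        # Update each a_i given b: Gaussian full conditional.
        resid = y - b[col]
        for i in range(I):
            mask = row == i
            prec = tau_a + tau * mask.sum()
            mean = tau * resid[mask].sum() / prec
            a[i] = rng.normal(mean, prec ** -0.5)
        # Update each b_j given a, symmetrically.
        resid = y - a[row]
        for j in range(J):
            mask = col == j
            prec = tau_b + tau * mask.sum()
            mean = tau * resid[mask].sum() / prec
            b[j] = rng.normal(mean, prec ** -0.5)
        samples.append((a.copy(), b.copy()))
    return samples
```

Each sweep costs time linear in the number of observations; the question studied in the project is how the number of sweeps needed grows as the model becomes larger and sparser.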
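The entropy contraction result for the Gibbs sampler can be stated schematically as follows (the display only conveys the form of the statement; the exact assumptions and the expression of the rate are those of the published analysis). For a log-concave target π and the Markov kernel P of one Gibbs sweep,

KL(μP ‖ π) ≤ (1 − ρ) · KL(μ ‖ π) for every initial distribution μ,

with a contraction rate ρ ∈ (0, 1] depending on the conditioning of π. Iterating gives KL(μP^t ‖ π) ≤ (1 − ρ)^t · KL(μ ‖ π), so on the order of (1/ρ) · log(1/ε) sweeps suffice to reach accuracy ε in relative entropy.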
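As an illustration of the class of multiproposal algorithms whose parallel speed-up we study, the following sketch implements one step of a multiproposal MCMC kernel with an independence proposal q: K candidates are drawn (their density evaluations could run on K parallel workers), and the next state is selected among the current state and the candidates with probabilities proportional to the importance weights π(·)/q(·). This is a standard construction shown only to fix ideas; the function names are ours.

```python
import numpy as np

def multiproposal_step(x, log_pi, log_q, sample_q, K, rng):
    """One multiproposal MCMC step with an independence proposal q.

    The next state is chosen among {x, K fresh proposals} with
    probabilities proportional to pi(.)/q(.); this selection rule
    leaves pi invariant.
    """
    cands = [x] + [sample_q(rng) for _ in range(K)]
    logw = np.array([log_pi(z) - log_q(z) for z in cands])
    probs = np.exp(logw - logw.max())
    probs /= probs.sum()
    idx = rng.choice(len(cands), p=probs)
    return cands[idx]
```

The limitation results mentioned above say, informally, that under appropriate assumptions the mixing gain of such schemes grows only logarithmically in K, even though the K density evaluations parallelize perfectly.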
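The hybrid samplers covered by the conditional-capacitance comparison include Metropolis-within-Gibbs schemes such as the following minimal sketch, where each exact conditional draw of an ideal Gibbs sampler is replaced by a single random-walk Metropolis step on that coordinate (names and step-size handling are illustrative):

```python
import numpy as np

def metropolis_within_gibbs(log_pi, x0, n_iter, step=0.5, seed=0):
    """Cycle over coordinates; update each with one random-walk
    Metropolis step targeting its full conditional.

    The 'ideal' counterpart would draw each coordinate exactly from
    its conditional; the conditional-capacitance comparison bounds
    the price of this approximation.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    out = np.empty((n_iter, x.size))
    for t in range(n_iter):
        for i in range(x.size):
            prop = x.copy()
            prop[i] += step * rng.normal()
            # Ratio of full conditionals equals the ratio of joints.
            if np.log(rng.uniform()) < log_pi(prop) - log_pi(x):
                x = prop
        out[t] = x
    return out
```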
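Finally, a minimal sketch of the perturb-then-solve idea behind conjugate gradient samplers, for a Gaussian full conditional with precision Q = τ·XᵀX + λ·I and mean Q⁻¹b, b = τ·Xᵀy, of the kind arising in conditionally Gaussian blocks of GLMMs; the released packages implement more general and carefully engineered versions.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def cg_gaussian_sample(X, y, tau, lam, rng):
    """Draw theta ~ N(Q^{-1} b, Q^{-1}) with Q = tau*X.T@X + lam*I
    and b = tau*X.T@y, using only matrix-vector products with Q.
    """
    n, p = X.shape
    # Perturb b so that r ~ N(b, Q): Cov = tau*X.T@X + lam*I = Q.
    r = (tau * X.T @ y
         + np.sqrt(tau) * X.T @ rng.standard_normal(n)
         + np.sqrt(lam) * rng.standard_normal(p))
    Q = LinearOperator((p, p), matvec=lambda v: tau * X.T @ (X @ v) + lam * v)
    # Solving Q theta = r gives theta ~ N(Q^{-1} b, Q^{-1}).
    theta, info = cg(Q, r)
    assert info == 0, "CG did not converge"
    return theta
```

Since only matrix-vector products with X are required, the cost per sample scales linearly with the number of nonzeros of the design matrix, which is what enables linear-in-data overall cost.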