CORDIS - EU research results

Fast Monte Carlo integration with repulsive processes

Periodic Reporting for period 3 - BLACKJACK (Fast Monte Carlo integration with repulsive processes)

Reporting period: 2023-02-01 to 2024-07-31

Astrophysicists design complex models of the evolution of galaxies, biologists develop intricate models of cells, and ecologists model the dynamics of ecosystems at a world scale. The work of the statistician is to take such a model, along with relevant data, and turn the pair into decisions and knowledge. Given a model for a cardiac cell and observations of the electrical activity around stem cells in the lab, how do we know whether an anti-arrhythmic drug is effective? Given an epidemiological model for Covid and the data provided by hospitals, should a political decision-maker close schools?

Making such decisions first requires using data to estimate the free parameters of a physical model. One standard way of formalizing this problem in risk-sensitive areas like medicine is called "Bayesian inference". The standard family of algorithms for performing Bayesian inference is, in turn, Monte Carlo methods. Unfortunately, applying them requires millions of evaluations of the model under scrutiny, one after the other. For complex biological systems, a single evaluation can last a few minutes. Even at one minute per evaluation, a million evaluations take a million minutes, which is close to two years. For most problems, it is not realistic to wait two years for an intermediate task in the overall decision pipeline. In Blackjack, our goal is to cut these two years down to a few weeks, or two weeks down to a few hours. If successful, our algorithms would thus make it possible to use more complex models when making important scientific decisions, such as characterizing how dangerous a drug is, or quantifying a key physical parameter of a model of the universe.

More concretely, Monte Carlo methods are based on evaluating a model at many random values of its parameters. While randomness is useful when the number of parameters is large, not all randomness is alike. We claim that it is possible to impose regularity on random patterns of points, and that this regularity can be used to obtain faster Monte Carlo algorithms. Because regularity is to be interpreted as "spreading the random values of the parameters as uniformly as possible across physically possible values", we speak of "negative dependence", or "repulsive point processes": two random parameter sets should stay far from each other. In Blackjack, we study the repulsive point processes that accelerate Monte Carlo methods. We study their theoretical properties ("How well does this point process solve my statistical problem?") as well as algorithms to put these point processes to work ("How fast can I solve my problem on a single computer? On a supercomputer?").
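To make the idea of "randomly but uniformly spread" points concrete, here is a minimal sketch. It assumes a toy one-dimensional integrand on [0, 1] and uses jittered stratified sampling (one random point per equal-width cell) as a simple stand-in for a negatively dependent design. It is not the project's repulsive point processes, and the function names are hypothetical; it only illustrates that forbidding points from clustering can reduce the error of a Monte Carlo estimate.

```python
import random
import statistics


def iid_estimate(f, n, rng):
    """Plain Monte Carlo: average f over n independent uniform points in [0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n


def stratified_estimate(f, n, rng):
    """Jittered stratified sampling: one uniform point in each cell [i/n, (i+1)/n).

    The points are still random, but no two can fall in the same cell,
    which is a simple form of negative dependence (an illustrative
    assumption, not the point processes studied in the project).
    """
    return sum(f((i + rng.random()) / n) for i in range(n)) / n


def spread(estimator, f, n, repeats, seed):
    """Standard deviation of the estimator over independent repetitions."""
    rng = random.Random(seed)
    values = [estimator(f, n, rng) for _ in range(repeats)]
    return statistics.pstdev(values)


def square(x):
    """Toy integrand; its exact integral over [0, 1] is 1/3."""
    return x * x


if __name__ == "__main__":
    n, repeats = 100, 200
    print("i.i.d. spread:     ", spread(iid_estimate, square, n, repeats, seed=0))
    print("stratified spread: ", spread(stratified_estimate, square, n, repeats, seed=0))
```

With the same number of model evaluations, the stratified estimator typically fluctuates much less around 1/3 than the independent one for a smooth integrand like this. Obtaining a comparable gain when the number of parameters is large is the harder problem that the text describes.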
We have mathematically proved that a class of negatively dependent point processes can reach the best possible performance in a large class of inference problems. This is a strong motivation for our project of fast Monte Carlo methods, and in particular for the search for efficient algorithms. We have identified fast algorithms for a small class of point processes, and we are currently working on more widely applicable algorithms. Meanwhile, we have shown that, beyond our original motivation, the principle of randomly but uniformly spreading points in an abstract space can solve many more problems. For instance, it is implicit in many signal processing pipelines, such as the algorithms behind your phone communications or the production of the music you listen to. We have studied one way of denoising noise-corrupted signals (think of an old live audio recording of a concert) that relies precisely on the negative dependence between abstract points that live on a sphere and perfectly encode the signal.
We have already extended the state of the art with our theoretical results on Monte Carlo integration. By the end of the project, our main goal is to have efficient algorithms that realize the potential of these theoretical results. As with any ambitious scientific project, there is also considerable knowledge acquired as "side results". These are results that were initially less expected, or not expected at all, but that generated enough ideas and momentum to become research lines of their own. This is the case, for instance, of our results on signal processing with abstract points on a sphere. Similarly, we have made many connections between our negatively dependent points and physical systems in quantum optics. This has raised many questions on both sides, computer science and physics, and we are confident that we will have interesting interdisciplinary results by the end of the project.