From the outside, AI or machine learning is the process of fitting a model to some data, for example fitting a generative model for language so that the text it produces resembles the training data found in documents across the internet. But what actually happens inside the learning machine, the process of learning itself, is a numerical computation: the solution of a mathematical problem that has no analytic solution. Contemporary machine learning relies on three such computations in particular: optimization, to find "best fits"; simulation, to "predict what might happen next"; and linear algebra, the solution of extremely large systems of linear equations with millions or billions of unknown variables. Although machine learning is a young discipline, the algorithms used for these purposes are surprisingly old. They were invented in other disciplines to address structurally similar, but not completely identical, tasks. This allowed AI and ML to move quickly without having to reinvent the wheel. But it also creates subtle problems, a mathematical version of the legacy-code problem known in IT: because machine learning is not exactly like older tasks in applied mathematics, the old algorithms do not always work well; they require costly work-arounds in some settings; and they sometimes become so unstable that human users must be present to monitor them. This creates significant inefficiency in modern ML, in terms of both human and technological resources.
The goal of Project PANAMA, in a nutshell, was to refurbish classic numerical methods to make them work natively, efficiently and effectively in contemporary machine learning. This was achieved by leveraging a deep conceptual insight, namely that mathematical computation itself is a form of learning, of inferring a latent (mathematical) quantity from observable (computable) numbers. Where machine learning uses empirical data, collected in the world and stored on disk, a numerical method uses computational data, collected by a chip and stored in RAM. Both processes can be phrased completely equivalently in the language of probabilistic inference. The resulting probabilistic numerical methods can then be embedded seamlessly within the wider machine learning pipeline.
This is useful because both data and compute are finite resources. A machine learning model that runs on a small, weakly informative data set does not require high-precision internal computations. So if the numerical algorithm "knows" about this, it can save computational resources. But data and compute are also both sources of information. Treating them equally allows them to be mixed flexibly: for example, in scientific machine learning, computational information in the form of simulations can be used to supplement missing empirical data, and empirical measurements can be used to identify unknown parameters in dynamical simulations.
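To make the idea of computation as inference concrete, here is a minimal sketch in Python. It is an illustration, not code from the project. It treats numerical integration as an inference problem: the integrand is given a Wiener-process (Brownian motion) prior, and a handful of function evaluations play the role of data. Under this prior, the posterior mean of the integral is exactly the classic trapezoidal rule, and the posterior variance is the sum of h^3/12 over the intervals between evaluation points, scaled by the squared prior scale. The names `wiener_quadrature` and `linspace` and the scale parameter `theta` are hypothetical choices made for this sketch; in particular, `theta = 1.0` is an assumption, not a calibrated value.

```python
import math


def linspace(a, b, n):
    """Return n equally spaced points from a to b (inclusive)."""
    return [a + (b - a) * i / (n - 1) for i in range(n)]


def wiener_quadrature(f, nodes, theta=1.0):
    """Gaussian posterior over the integral of f on [nodes[0], nodes[-1]].

    Assumed model (an illustration only): f is a Wiener process with
    scale theta, started at the observed value f(nodes[0]). Conditioning
    on noise-free evaluations at the sorted nodes gives independent
    Brownian bridges between neighbouring nodes, so that
      - the posterior mean of the integral is the trapezoidal rule, and
      - each interval of width h contributes theta^2 * h^3 / 12
        to the posterior variance.
    """
    values = [f(x) for x in nodes]  # the "computational data"
    mean = 0.0
    variance = 0.0
    for x0, x1, y0, y1 in zip(nodes, nodes[1:], values, values[1:]):
        h = x1 - x0
        mean += 0.5 * h * (y0 + y1)          # trapezoid on this interval
        variance += theta ** 2 * h ** 3 / 12.0  # Brownian-bridge uncertainty
    return mean, variance


if __name__ == "__main__":
    # Integrate sin(x) on [0, pi]; the exact value is 2.
    for n in (5, 9, 17):
        mean, variance = wiener_quadrature(math.sin, linspace(0.0, math.pi, n))
        print(f"{n:2d} evaluations: estimate {mean:.4f}, "
              f"posterior std {math.sqrt(variance):.4f}")
```

The returned variance is what lets such a method "know" how precise it currently is: a surrounding learning algorithm could stop requesting evaluations once this uncertainty falls below the uncertainty already caused by limited empirical data.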