Periodic Reporting for period 4 - G-Statistics (Foundations of Geometric Statistics and Their Application in the Life Sciences)
Reporting period: 2023-03-01 to 2024-08-31
The goal of geometric statistics is to develop a rigorous statistical theory on manifolds and, more generally, on spaces with a geometric structure. This project aimed at strengthening its mathematical foundations and at exemplifying their impact on selected applications in the life sciences. In G-Statistics, we explored foundational methods to unify statistical estimation theories across Riemannian manifolds and the other geometric structures that naturally arise in applications, such as Lie groups, affine connection spaces, and quotient and stratified spaces. Beyond the mathematical theory, we aimed at providing generic but effective implementations of most geometric statistics methods, which can rely on specific implementations of most of the geometric structures considered. We illustrated our methods in computational anatomy applications with the study of anatomical shapes and the forecast of their evolution.
Beyond the mean, we investigated non-parametric submanifold learning techniques that properly generalize principal flows to more than one dimension. The main obstruction is that the tangent space estimated with local PCA does not generate a submanifold but rather a non-integrable field of subspaces (a geometric distribution), which we call the Principal Bundle [32,37]. Despite the absence of a submanifold, we can still compute distances between the points of the underlying point cloud that respect this geometry, using the proper notion of sub-Riemannian geodesics. This method works in any manifold and in any dimension and co-dimension, and it gives very good results on very noisy point clouds sampled from a 2D surface in 3D. It is a very promising technique for geometric processing in computer graphics and for data analysis in high-dimensional spaces. We also developed a new theory of affine maps in manifolds, which paves the way for the generalization of algorithms like Locally Linear Embedding (LLE) to Riemannian manifolds [31,49]. Finally, we revisited standard dimension reduction techniques such as probabilistic PCA using flag spaces: we showed that the resulting Principal Subspace Analysis provides a principled family of models which is much simpler and more interpretable than the usual PCA modes, while remaining as efficient as other state-of-the-art methods [51].
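To make the local PCA step concrete, the following minimal sketch estimates a tangent plane at one point of a noisy point cloud sampled from the unit sphere in 3D. It only illustrates the classical local PCA tangent estimate in Euclidean space, not the Principal Bundle construction or the sub-Riemannian geodesics of [32,37]. The function name `local_pca_tangent`, the neighbourhood size, and the noise level are assumptions chosen for illustration.

```python
import numpy as np

def local_pca_tangent(points, center, k_neighbors, dim):
    """Estimate a dim-dimensional tangent subspace at `center` by local PCA.

    Hypothetical helper for illustration only.
    """
    # Select the k nearest neighbours of the query point (Euclidean distance).
    dists = np.linalg.norm(points - center, axis=1)
    idx = np.argsort(dists)[:k_neighbors]
    local = points[idx]
    # Centre the neighbourhood; the leading right singular vectors span
    # the directions of largest local variance.
    centered = local - local.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:dim]  # rows form an orthonormal basis of the estimated subspace

rng = np.random.default_rng(0)
# Noisy samples of the unit sphere in 3D (assumed toy data).
raw = rng.normal(size=(2000, 3))
sphere = raw / np.linalg.norm(raw, axis=1, keepdims=True)
noisy = sphere + 0.01 * rng.normal(size=sphere.shape)

north = np.array([0.0, 0.0, 1.0])
basis = local_pca_tangent(noisy, north, k_neighbors=50, dim=2)
# The estimated normal should be close to the z-axis at the north pole.
normal = np.cross(basis[0], basis[1])
```

Repeating this estimate at every point yields a field of subspaces; as noted above, such a field is in general not integrable, which is why distances have to be computed along the field rather than on a fitted surface.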
For symmetric positive definite (SPD) matrices, which are used in a wide range of applications, we clarified the relationship between existing metrics by classifying them into main families based on their invariance properties [1,3,27,29]. We then investigated the quotient space of full-rank correlation matrices. The most natural affine-quotient metric has both negative and (unbounded) positive curvature [18], which may notably complicate the optimization-based computation of the logarithm. We therefore introduced computationally more convenient Hadamard or even log-Euclidean metrics, along with their geometric operations [28,45]. These new metrics may have very interesting applications in several areas, notably in neuroimaging, where brain networks extracted from fMRI data are parametrized by correlation matrices.
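As a reminder of why log-Euclidean structures are computationally convenient, here is a minimal sketch of the classical log-Euclidean metric on SPD matrices, where the distance and the mean reduce to Euclidean operations on matrix logarithms. This is not the new family of metrics on correlation matrices introduced in [28,45]; the function names are assumptions chosen for illustration.

```python
import numpy as np

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(mat)
    return (v * np.log(w)) @ v.T  # V diag(log w) V^T

def sym_exp(mat):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(mat)
    return (v * np.exp(w)) @ v.T  # V diag(exp w) V^T

def log_euclidean_distance(a, b):
    """Frobenius distance between the matrix logarithms of a and b."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")

def log_euclidean_mean(mats):
    """Closed-form mean: exponential of the average of the logarithms."""
    return sym_exp(np.mean([spd_log(m) for m in mats], axis=0))

# Toy usage on two diagonal SPD matrices (assumed example data).
a = np.diag([1.0, 4.0])
b = np.diag([4.0, 1.0])
mean_ab = log_euclidean_mean([a, b])
dist_ab = log_euclidean_distance(a, b)
```

For these two matrices the mean is twice the identity up to rounding, since the logarithms average to log(2) on the diagonal. Both operations are closed-form, in contrast with a logarithm that has to be obtained by numerical optimization.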
We illustrated our methods in real-world computational anatomy applications with the statistical modeling of cardiac motion across subjects. The geodesic regression of the motion of the heart in a group of diffeomorphisms was parallel transported along the inter-subject deformation in order to perform group statistics on all trajectories in the same reference anatomy. For the right ventricle under pressure or volume overload, decoupling the volume change from the deformation directly within the metric on diffeomorphisms revealed statistical insights into the dynamics of each disease [16,23,25]. A similar methodology, using Cartan-Schouten connections on diffeomorphisms instead of a right-invariant metric, was applied to the assessment of treatment effects on longitudinal brain changes in the Multidomain Alzheimer Preventive Trial cohort [10].