Statistics for Complex Data: Understanding Randomness, Geometry and Complexity with a view Towards Biophysics

Final Report Summary - COMPLEXDATA (Statistics for Complex Data: Understanding Randomness, Geometry and Complexity with a view Towards Biophysics)

What is data? Most of us think of data as neatly ordered arrays of numbers, but to a mathematical statistician, any object that can be described mathematically can be considered a datum: images, shapes, sounds and fluctuations, from the firing patterns of a neuron to the vibrations of proteins in solution. As technology has surged in recent years, so has our ability to measure, record, store, manipulate and analyse ever more complex data structures of this kind. Data now come in a bewildering variety of forms that require novel statistical tools for their analysis. Surprisingly, many standard methods may fail for such complex data, or may require substantial modification before they can be applied.

The aim of this project was to develop novel statistical theory and methods to tackle the challenges that arise in the analysis of complex data. The data motivating the mathematical work were derived from cutting-edge research problems in biophysics and molecular biology. Such data pose intricate challenges related to their geometry (the structure of the mathematical formalism needed to describe them) and to their complexity (their very high dimensionality).

The project produced a mathematical framework, together with the accompanying methods and theory, for analysing several representative types of such data: random tomography (arising in structural biology), functional flows (arising in molecular biophysics), manifold data (arising in brain imaging), and point pattern data (arising in neuroscience).