Many biological systems can be viewed as networks of interconnected units, and understanding synchronization is key to understanding some of these networks. A long-standing problem in neuroscience is to recover the network structure of a coupled system, such as a neuronal network represented by extracellularly recorded spike trains or by EEG traces recorded at different locations on the scalp. Statistical inference for such systems has attracted interest only recently, as technological advances now allow simultaneous recording of many interacting units. As a result, a solid statistical framework for testing hypotheses about the network structure has not yet been established.
To address this problem, the project further developed the methodology of cointegration analysis. The theory of cointegration was developed within econometrics and offers a refined statistical toolbox for analyzing non-stationary multidimensional time series. The key idea is to estimate the long-run equilibrium relationships between several variables, which are captured by cointegrating vectors. Cointegration analysis provides an estimate of the number of cointegration relations and makes it possible to identify the strengths and directions of the couplings.
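For orientation, the standard linear setting of cointegration analysis is usually written as a vector error correction model. The notation below follows the common textbook convention and is an assumption here, since the project's own symbols are not given in this summary:

```latex
\Delta X_t = \Pi X_{t-1} + \sum_{i=1}^{k-1} \Gamma_i \, \Delta X_{t-i} + \varepsilon_t,
\qquad \Pi = \alpha \beta^{\top},
\qquad \alpha, \beta \in \mathbb{R}^{p \times r},\ \operatorname{rank}(\Pi) = r .
```

Here $X_t$ is the $p$-dimensional observed process and $\varepsilon_t$ is a noise term. The rank $r$ is the number of cointegration relations, the columns of $\beta$ are the cointegrating vectors (so that $\beta^{\top} X_t$ is stationary), and the entries of $\alpha$ describe how strongly each coordinate adjusts toward each equilibrium relation. In this reading, the nonzero pattern of $\Pi$ is what is interpreted as the strength and direction of the couplings.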
This project aimed to extend the standard cointegration method to meet the needs of analyzing biological oscillators, and hence to provide a more principled way to infer the functional structure of biological networks. We addressed three main gaps in the methodology:
1. Standard cointegration analysis had previously been applied successfully in up to about 10 dimensions. Biological networks, however, are often of much higher dimension. In this project, we found ways to apply the methodology to high-dimensional data.
2. Couplings in many biological networks are neither linear nor constant, which the standard cointegration model could not accommodate. We proposed a more complex model that mimics nonlinear effects.
3. Not all coordinates of the investigated systems can be observed directly, so the unobserved ones have to be imputed. This causes identifiability issues, meaning that some aspects of the systems cannot be inferred. We resolved this problem for EEG data.