Across the natural sciences, experimental techniques that generate large and inherently noisy datasets are being developed. As these techniques gain popularity, inference methods that extract useful information from the resulting data stand to have a large impact. However, different experiments generate datasets afflicted by different types of noise, and these require different analysis methods. In particular, while the development of general theoretical tools is important, researchers must also take a multidisciplinary, intersectoral approach, working with both theoreticians and experimentalists to understand how the available theoretical tools can be applied to different datasets so that relevant and useful information can be extracted. This approach can also motivate the collection of new data, which can establish whether the theoretical approach is making accurate inferences and predictions from the existing data.
I collaborate with both theorists and experimentalists to extract useful information from large sets of protein sequence data. This approach has led to the development of state-of-the-art techniques that infer the 3D structure and function of proteins from large multiple sequence alignments. My research will focus on understanding the scope and accuracy of the information that we are able to infer from the data, and how these depend on the parameters of the data. Collaborations with experimentalists will help us understand how this information can be exploited to engineer changes in protein phenotypes. The experience and knowledge gained in this specific domain will inform my longer-term goal: to apply this approach of close collaboration with both theorists and experimentalists more widely, extracting information from high-dimensional, noisy datasets to address important and outstanding questions across different areas of natural science.