Statistical physics of immune-viral co-evolution

Periodic Reporting for period 3 - STRUGGLE (Statistical physics of immune-viral co-evolution)

Reporting period: 2020-11-01 to 2022-04-30

The immune system within each individual host destroys viruses, which manage to escape immunity on the global scale. Recent experiments show population-level responses of both immune repertoires and viruses, and a history dependence of their functional phenotypes. This constrained long-term co-evolution of immune receptor and viral populations is a stochastic many-body problem occurring at many scales, in which the response emerges based on the past states of both the repertoire and viral populations. STRUGGLE aims to quantify the effects of viral-immune receptor interactions from functional datasets to obtain a statistical model of co-evolution between immune repertoires and viruses.

STRUGGLE covers the many scales of immune-virus interactions: from the molecular level, analyzing high-throughput mutational screens of libraries of antibodies binding a given antigen, through the population-level response of immune repertoires, analyzing next-generation sequencing of vaccine-stimulated whole repertoires, to the population level, modeling the long-term co-evolution of both repertoires and viruses.

STRUGGLE combines a statistical data analysis approach with cross-scale many-body physics to:
- build a molecular model for antigen-receptor binding;
- learn statistical models for repertoire-level response to viral antigen stimulation;
- validate dynamical models of interactions between antigen and immune receptors;
- theoretically evaluate the predictive power of the immune system and viruses;
- and predict virus strains and immune responses based on past infections.

The outcomes of STRUGGLE include the quantitative characterization of the human T-cell response to the yellow fever vaccine and of the trout B-cell response to life-threatening rhabdoviruses, which aids vaccine design for fish, with wide use in agriculture. We quantify the notion of public and private responses. We identify responding clonotypes and propose sequence logos that are free of generation bias. We characterize evolutionary trajectories for viruses in phenotypic space. The statistical properties of the co-evolutionary process are needed for the informed development of immunotherapies.
The STRUGGLE project aims to describe the co-evolution of the cells of the adaptive immune system with external factors such as viral pathogens. In this period, we developed three methods that describe the response of T-cells to different kinds of perturbations: strong acute infections such as yellow fever, using time-dependent immune repertoire data; auto-immune related conditions (diabetes, CMV), from population-level data; and acute or individual conditions, from single time-point blood samples using sequence similarity. This last result is especially interesting, since it shows that once the immune repertoire data are properly corrected for generation and selection biases, sequence similarity alone is enough to identify responding cells. We also developed a method for identifying selection pressure on B-cells arising from HIV-immune system co-evolution directly from data, showing that adaptation of the immune system to the virus is slowed down. We characterized the diversity and overlap of the responses of B-cell repertoires to viruses in trout and of T-cell repertoires in humans, based on available data. Using previously obtained antigen-antibody binding landscape data, we identified signs of beneficial epistasis and attempted to incorporate extinction into models of evolving populations. We developed efficient algorithms that are necessary for identifying responses and shared clonotypes between individuals.

Inspired by flu evolution, we built theoretical models of viruses evolving in a population of host immune systems. We identified parameter regions that lead to modes of evolution with one or many co-existing strains, similar to the different situations observed in data. Lastly, we used information theory to describe internal representations for viral evolution.
We proposed different methodologies for identifying responding immune cells in different settings. One set of methods, applied to time-dependent data, controls for noise in repertoire sequencing data in a way that is specific to this kind of dataset. We also proposed an efficient algorithm for identifying responses in single time-point samples, without the need for cohort-level analysis. A third approach exploits baseline expectations to make population-level statements. In all of these methods, describing the baseline expectation, as well as the expected selection pressures in the repertoire, proves important. For this reason, we have put considerable effort into efficient software packages and algorithms (ALICE, OLGA, SONIA, SOS) and into characterising population-level differences. We also developed efficient algorithms that identify shared clonotypes between individuals in human T-cells and B-cells.
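
As a rough illustration of what counting shared clonotypes between individuals involves, the sketch below treats each repertoire as a set of (V gene, CDR3 amino-acid sequence, J gene) tuples, counts in how many individuals each clonotype appears, and reports pairwise overlaps. It is a minimal toy and not the project's algorithm: the donor names, the clonotypes, and the helper names `sharing_counts` and `pairwise_overlap` are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy repertoires (invented data, for illustration only).
# Each clonotype is a tuple: (V gene, CDR3 amino acids, J gene).
repertoires = {
    "donor_A": {
        ("TRBV9", "CASSVGGNTEAFF", "TRBJ1-1"),
        ("TRBV7-9", "CASSLAPGATNEKLFF", "TRBJ1-4"),
    },
    "donor_B": {
        ("TRBV9", "CASSVGGNTEAFF", "TRBJ1-1"),
        ("TRBV20-1", "CSARDLNTGELFF", "TRBJ2-2"),
    },
    "donor_C": {
        ("TRBV9", "CASSVGGNTEAFF", "TRBJ1-1"),
        ("TRBV7-9", "CASSLAPGATNEKLFF", "TRBJ1-4"),
        ("TRBV6-5", "CASRDRGYEQYF", "TRBJ2-7"),
    },
}


def sharing_counts(reps):
    """Return, for each clonotype, the number of individuals carrying it."""
    counts = Counter()
    for clonotypes in reps.values():
        # Each repertoire is a set, so a donor contributes at most once.
        counts.update(clonotypes)
    return counts


def pairwise_overlap(reps):
    """Return the number of clonotypes shared by each pair of individuals."""
    return {
        (a, b): len(reps[a] & reps[b])
        for a, b in combinations(sorted(reps), 2)
    }


counts = sharing_counts(repertoires)
overlaps = pairwise_overlap(repertoires)
# Clonotypes seen in every donor are candidates for a "public" response.
public = {c for c, n in counts.items() if n == len(repertoires)}
```

Raw sharing counts of this kind are only a starting point. As the text stresses, they have to be compared against a baseline expectation before a shared clonotype can be read as a sign of response.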

In another direction, we developed a Restricted Boltzmann Machine approach to epitope presentation. Our method combines information from specific experiments and databases and, unlike traditional methods, can be applied to personalised datasets with very rare HLA alleles. Our work on antigen-antibody models develops a carefully controlled null model for non-interacting mutations, allowing us to reliably identify epistasis. We find that epistatically interacting sites contribute substantially to binding. In addition to negative epistasis, we report a large amount of beneficial epistasis, enlarging the space of high-affinity antibodies as well as their mutational accessibility.
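
To make the notion of epistasis against a non-interacting null concrete, here is a minimal sketch. It assumes, as a stand-in for the null model mentioned above (whose exact form is not given here), that non-interacting mutations contribute additively to the log dissociation constant. Under that assumption, pairwise epistasis is the observed double-mutant shift minus the sum of the two single-mutant shifts. The function names and the numerical values are hypothetical.

```python
import math


def log_kd_shift(kd_mutant, kd_wildtype):
    """Change in log dissociation constant relative to wild type.

    Positive values mean weaker binding, negative values tighter binding.
    """
    return math.log(kd_mutant / kd_wildtype)


def pairwise_epistasis(kd_wt, kd_a, kd_b, kd_ab):
    """Deviation of the double mutant from the additive (non-interacting) null.

    Assumption: without interaction, single-mutant shifts in log Kd add up.
    A negative result means the double mutant binds better than the null
    predicts (beneficial epistasis); a positive result means it binds worse.
    """
    expected = log_kd_shift(kd_a, kd_wt) + log_kd_shift(kd_b, kd_wt)
    observed = log_kd_shift(kd_ab, kd_wt)
    return observed - expected


# Invented numbers (in molar units): each single mutation weakens binding
# tenfold, but the double mutant is no worse than either single mutant.
eps = pairwise_epistasis(kd_wt=1e-9, kd_a=1e-8, kd_b=1e-8, kd_ab=1e-8)
```

In a real analysis the deviation would also have to be compared with the measurement noise before calling a pair of sites epistatic, which is the role a carefully controlled null model plays.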

Inferring collective dynamical models from data is hard, especially when the dynamics is not first order. In more theoretical work, we explored this problem and developed new inference methodology for collective dynamics. We also developed theoretical models for viral evolution in a population of host immune systems and used information-theoretic approaches to quantify prediction for viral evolution.

We will continue to work towards the goals of the project.
The idea behind the ALICE algorithm: Identifying overrepresented TCR clusters
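
The idea named in this caption can be sketched loosely as follows: for every receptor sequence, count how many other sequences in the sample differ from it by a single amino acid, and flag sequences whose neighbour count is improbably high under a baseline expectation. This is a toy under strong assumptions and not the ALICE implementation. The baseline is passed in as a hypothetical function `expected_neighbors` (here a constant placeholder, whereas the text describes baselines that account for generation and selection biases). The Poisson test, the threshold `alpha`, and the sequences are illustrative choices.

```python
import math
from collections import defaultdict


def hamming1_neighbor_counts(seqs):
    """Count, for each unique sequence, the other sequences at Hamming distance 1."""
    unique = set(seqs)
    # Index sequences by every pattern with one position masked out.
    buckets = defaultdict(set)
    for s in unique:
        for i in range(len(s)):
            buckets[s[:i] + "*" + s[i + 1:]].add(s)
    counts = {}
    for s in unique:
        neighbors = set()
        for i in range(len(s)):
            neighbors |= buckets[s[:i] + "*" + s[i + 1:]]
        neighbors.discard(s)  # a sequence is not its own neighbour
        counts[s] = len(neighbors)
    return counts


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def enriched_sequences(seqs, expected_neighbors, alpha=1e-3):
    """Return {sequence: (neighbour count, p-value)} for over-represented clusters."""
    hits = {}
    for s, k in hamming1_neighbor_counts(seqs).items():
        p = poisson_tail(k, expected_neighbors(s))
        if p < alpha:
            hits[s] = (k, p)
    return hits


# Invented sample: one central sequence with four one-mismatch variants,
# plus three unrelated sequences.
sample = [
    "CASSLAPGATNEKLFF",
    "CASSVAPGATNEKLFF", "CASSLSPGATNEKLFF",
    "CASSLAGGATNEKLFF", "CASSLAPGTTNEKLFF",
    "CASSVGGNTEAFF", "CASRDRGYEQYF", "CSARDLNTGELFF",
]
# Placeholder baseline: a constant expected neighbour count for every sequence.
hits = enriched_sequences(sample, expected_neighbors=lambda s: 0.05)
```

With this toy input only the central sequence is flagged, since each variant has a single neighbour, which is not surprising under the placeholder baseline. In practice a multiple-testing correction would be needed, and the quality of the result depends entirely on how well the baseline expectation is modelled.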