CORDIS - EU research results

Automated evaluation and correction of generation bias in immune receptor repertoires

Periodic Reporting for period 1 - PGEN (Automated evaluation and correction of generation bias in immune receptor repertoires)

Reporting period: 2019-03-01 to 2020-11-30

Personalized medicine promises to change the health and prophylaxis sectors, and it is already showing a large societal and economic impact. High-throughput immune sequencing technologies give us insight into a unique personal medical record: our immune system’s memory of all the infections it has encountered and the successful solutions our body found. Unfortunately, reading this record requires advanced statistical tools, which often makes it inaccessible to biologists and medical practitioners. PGEN aims to fill this gap between modern technology and its applications by developing commercialisation solutions for immune repertoire analysis software. PGEN builds on the methodologies developed during the ERC Starting Grant RECOGNIZE to provide a user-friendly, versatile software prototype that gives life science researchers and medical practitioners without bioinformatics training an overview of the statistical properties of the immune repertoires they are studying. PGEN allows medical practitioners to characterize the baseline immune repertoire of a healthy person. Knowing this baseline is essential for identifying immune system dysfunction and for applying immune repertoire sequencing technologies in biotechnology and medicine. The PGEN software platform provides doctors and biologists with a statistical tool that will help them make informed decisions about treatment, prophylaxis and future research.

ERC RECOGNIZE led to algorithmic advances in modelling VDJ recombination and the subsequent selection of T- and B-cell receptors. These advances provide useful tools for analyzing and comparing immune repertoires across time, individuals, and tissues. The goal of PGEN was to build a proof of concept based on the ERC RECOGNIZE algorithm and to develop general-purpose software that physicians and biologists without bioinformatics training can use to analyze immune repertoire data and obtain statistics that help them make informed decisions.

The outcomes of PGEN are two main software packages. The first, SOS, is a web-based interface where users with no coding skills can compute the generation and post-selection probabilities of their sequences, as well as generate batches of synthetic sequences. The second, pygor, is a more advanced Python package and suite of command-line tools (installable with a single command through the Python Package Index) for evaluating generation probabilities across large datasets and calculating new models for new datasets and species, together with a set of plotting and analysis commands.

Pygor3 provides a Python interface to run IGoR V(D)J recombination analyses and to encapsulate their inputs and outputs in a single sqlite3 database file. This file contains the input sequences, alignments, model parameters, conditional probabilities of the model's Bayesian network, best scenarios and generation probabilities. Pygor3 also has command-line utilities to convert IGoR-generated files to and from the AIRR standard format. Pygor comes with a set of tutorials and examples that guide the user from relatively simple tasks, such as plotting existing models and evaluating these models on their own sequences, to learning an entirely new model for a new species, with a tool that automatically downloads the species' genomic data from the IMGT database.

Pygor also ships with pre-existing models for the alpha and beta chains of human and mouse T-cell receptors and for the light and heavy chains of human and mouse B-cell receptors. SOS can be used on a mobile phone as an app, a feature requested by laboratory users who need to look up the generation probability of a single sequence quickly.
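
The single-file database design described above for Pygor3 can be illustrated with a minimal sketch using Python's standard sqlite3 module. This is not Pygor3's actual schema or API: the table names, column names, helper functions and the toy values below are all assumptions made purely for illustration. The sketch only shows why keeping sequences and their generation probabilities in one file makes simple queries, such as listing the least probable sequences, straightforward.

```python
import sqlite3

# Hypothetical schema for illustration only. The real Pygor3 tables and
# column names are not given in this report and will differ.
SCHEMA = """
CREATE TABLE sequences (
    seq_index   INTEGER PRIMARY KEY,
    nt_sequence TEXT NOT NULL
);
CREATE TABLE generation_probabilities (
    seq_index INTEGER PRIMARY KEY REFERENCES sequences(seq_index),
    pgen      REAL NOT NULL
);
"""


def build_demo_db(path=":memory:"):
    """Create an empty demo database (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_sequence(conn, seq_index, nt_sequence, pgen):
    """Store one sequence and its generation probability in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sequences VALUES (?, ?)", (seq_index, nt_sequence)
        )
        conn.execute(
            "INSERT INTO generation_probabilities VALUES (?, ?)",
            (seq_index, pgen),
        )


def rarest_sequences(conn, limit=10):
    """Return (seq_index, nt_sequence, pgen) rows, lowest probability first."""
    cur = conn.execute(
        """
        SELECT s.seq_index, s.nt_sequence, p.pgen
        FROM sequences AS s
        JOIN generation_probabilities AS p ON p.seq_index = s.seq_index
        ORDER BY p.pgen ASC
        LIMIT ?
        """,
        (limit,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    # Placeholder sequences and probabilities, not real measurements.
    add_sequence(db, 0, "TGTGCCAGCAGT", 1e-9)
    add_sequence(db, 1, "TGTGCCAGCTCA", 1e-12)
    for row in rarest_sequences(db, limit=2):
        print(row)
```

Passing a file path instead of `":memory:"` to `build_demo_db` would write everything to a single file on disk, which is the property the report highlights.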