CORDIS - EU research results
Content archived on 2024-06-18

Systems biology of Mycobacterium tuberculosis

Final Report Summary - SYSTEMTB (Systems biology of Mycobacterium tuberculosis)

Executive Summary:
The SysteMtb project brought together experts from several scientific disciplines to explore and understand one of the most devastating human bacterial pathogens Mycobacterium tuberculosis (Mtb), which causes tuberculosis in humans. For the project, the consortium produced bacterial cultures, generated infection experiments, isolated RNA from Mtb cultures, did infection experiments in human cell lines and from human lung resections of TB patients, prepared samples for proteomics and metabolomics and performed microarray experiments as well as next-generation sequencing (NGS) experiments. In this context we first investigated the pathogen by applying cutting edge technologies of microbiology, functional genomics, proteomics and metabolomics. Next, we aimed to understand how the tubercle bacillus survives inside a human host cell, specifically in human macrophages, such as THP-1 cells. Finally, we applied state-of-the-art systems biology technologies, comprising microarray analyses and NGS by dual RNA-Seq that was refined and improved along the way, to characterize Mtb and the host response in THP-1 cells and during real infection in human lung resections. The wealth of data generated was subsequently integrated into theoretical models by computational biologists to describe the complex interactions of Mtb with its human host and different culture conditions mimicking host infection by computational tools. Tuberculosis bacteria respond to the type of environment that they experience during human infection and can be characterised in this way. One important finding from this project is that the conventional approach of transcriptional profiling provides only a partial, and sometimes misleading, perspective. The consortium developed the core technology to quantitatively assess mycobacterial metabolism as a network of interacting elements. 
In particular, methods were established that make it possible to assess concentration changes of hundreds of intracellular metabolites and to track the movement of metabolic flux within cells. Furthermore, computational methods were developed for data integration and interpretation, enabling researchers to generate testable hypotheses on the intracellular operation of mycobacteria. These technological advances have placed the SysteMTb consortium in a unique position to address open fundamental questions related to the unique survival strategy of mycobacteria in the human host. The SysteMTb project developed the experimental resources and computational strategies and tools that are required to comprehensively and accurately quantify essentially all expressed proteins in Mtb and applied them to obtain unprecedented quantitative proteomic data sets of Mtb in important disease-associated states. Moreover, we generated a realistic picture of the adaptation of the cell envelope to various stresses. Information about Mycobacterium tuberculosis subcellular protein localization has been generated by SysteMTb for protein function prediction and identification of suitable drug/vaccine/diagnostic targets. Our new methodology helped to obtain protein localization data which strengthened the modelling of the Mtb cell cycle, allowing a high-throughput / high-content approach. To support the storage and sharing of the large amount of data generated in the consortium, a data storage platform was created including the raw experimental data of transcriptome, proteome, metabolome and lipidome experiments as well as tables with consistent descriptions of individual experiments. Through the same platform a genome browser was made available where expression data could be visualized with the genome as coordinate system. It also comprises an interactive visualization platform for protein-protein interactions.
One of the major modelling achievements was the elaboration of a dynamic model of Mtb's central carbon metabolism, which gives insight into, for example, cell survival rates after resuscitation and the re-routing of carbon fluxes upon re-aeration. Our work enables any interested scientist to explore the potential of new treatment strategies to target and exploit latent tuberculosis. Such new strategies can involve different mechanisms we uncovered, from preventing successful resuscitation to pushing dormant Mtb into resuscitation under conditions that the bacteria cannot survive. One possible goal could be to force resuscitation during drug treatment to effectively shorten treatment time.
Project Context and Objectives:
Tuberculosis (TB) is an old but re-emerging global health threat caused by mycobacteria belonging to the Mycobacterium tuberculosis (Mtb) complex. One third of the world's population is infected with Mtb. Fewer than 5% of individuals develop active disease within 1-2 years after infection. In the remaining 95%, it is thought that Mtb persists in the face of an active immune response in a metabolically highly reduced stage of dormancy (latent infection), where it rarely replicates, if at all. At later times, about one in ten of the latent infections will eventually progress to active disease, which, if left untreated, kills more than half of its victims. Such prolonged persistent interactions between the host and the pathogen present a major challenge in disease control. Even more frightening is the rapid emergence of Multi-Drug-Resistant (MDR) and Extensively-Drug-Resistant (XDR) strains along with the dangerous liaison between the Human Immunodeficiency Virus (HIV) and Mtb. In Sub-Saharan Africa, Mtb is the number one killer of HIV-infected individuals. Mtb infection is best described as an equilibrium involving a balance of activation and suppression of host responses, orchestrated by a complex and dynamic series of interactions between multiple host and bacterial components. Simple reductionist approaches are insufficient to understand this complex biology. That is why SysteMTb followed a systems biology approach that intertwined experimentally-driven model development and model-driven experimentation, in order to improve understanding of this threatening bacterium and to support the design of rational strategies for preventive and therapeutic interventions. This entailed the progressive elucidation and thorough analysis of the structural scaffolds and cellular wiring of Mtb through the measurement of its global responses (metabolomics, lipidomics, proteomics, transcriptomics, glycomics) upon perturbations (chemical challenges, knock-out/knock-down mutants, etc.) which are relevant for infection. These global analyses required the smooth integration of all existing and newly generated data in a mathematically tractable form so that detailed models could be built; in turn, the models enabled iterative generation of novel hypotheses, directed experimental design and ultimately provided insights that can possibly be translated into novel intervention strategies.
The grand goal of the SysteMTb project was to establish a Systems Biology framework to understand key features of Mtb and its interactions with the host, which in turn provided new insights and solid, model-based knowledge for the development of novel and cost-effective strategies to combat tuberculosis (TB). It brought together experts in Mtb and systems biology, and provided a centralized resource for TB researchers. The experimental programme of SysteMTb focused on two members of the Mtb complex: the pathogenic strain Mtb H37Rv (H37Rv) and the attenuated vaccine strain M. bovis BCG (BCG). H37Rv is widely used for laboratory studies. Work with BCG was carried out in laboratories that lack the containment facilities required for the pathogenic strain. The workflow was structured along three main objectives. The first objective was to generate and integrate quantitative data sets of Mtb (transcriptomics, proteomics, metabolomics, structural genomics, lipidomics, glycomics). This was done under different environmental perturbations. Data generated were provided first to the partners in the project and later to the community as a central resource for Mtb researchers. The second objective was the development of computer models at different appropriate levels of system complexity. SysteMTb specifically focused on models of metabolic fluxes and dynamic models that integrate transcriptomics and proteomics with metabolism. The third objective was to identify new possible targets for therapeutic intervention based on a Systems Biology analysis of the bacterium.
Data from experimental analysis were used to develop models which were refined by testing predictions with new experimental analysis.
Project Results:
Tuberculosis is a deadly disease that requires a lengthy, expensive and side-effect-laden treatment and is prevalent in up to one third of the world's population. The main reason for its wide spread and the lengthy treatment is latent tuberculosis, which results from the ability of the causative agent (Mtb) to enter a dormant state and resuscitate under favourable conditions. Latent tuberculosis can turn into active disease after a patient has been symptom free for decades. Nearly all research into dormancy has been dedicated to investigating the transition into dormancy. Here, we have carried out a large experiment to investigate the opposite, the dynamics of resuscitation. In order to allow for a systematic understanding of the resuscitation process and for concept building formalized in testable computational models, we have combined different high-throughput experiments over the time course of bacterial resuscitation. Because of the systems biology approach taken and the interaction between biologists and theoretical scientists, this has resulted in a truly integrative dataset in which the different experiments are consistently normalized and can be compared to each other. We have used this integrative dataset to determine mechanisms that enable resuscitation of Mtb and tested these mechanisms in mathematical models (see below). Interestingly, it seems that the majority of bacteria die at the onset of resuscitation. We could not fully determine the mechanism behind this, but our experiments indicate a role of oxygen radicals and their detoxification. Due to time constraints and the immense wealth of data collected, we could not fully exploit the dataset and uncover every effect that might be hidden in the data. We did, from the very beginning, take great care to produce an integrated dataset that can be reused by the scientific community.
This not only offers an opportunity to re-evaluate our data; we have also produced the first such integrative dataset that combines information from mRNA, protein, lipid, and metabolite levels and integrates it with cell population size. Such a dataset is a requirement for quantitative modelling, and its existence will greatly enhance the rate at which we learn to understand and combat Mtb. Our work hence enables any interested scientist to explore the potential of new treatment strategies to target and exploit latent tuberculosis. Such new strategies can involve different mechanisms we uncovered, from preventing successful resuscitation to pushing dormant Mtb into resuscitation under conditions that the bacteria cannot survive. One possible goal could be to force resuscitation during drug treatment to effectively shorten treatment time.

Structural Biology of Mtb proteins
Both multi-drug resistant and extensively drug-resistant M. tuberculosis strains have arisen, which are resistant to many of the front-line drugs currently in use. Therefore there is a clear need for the development of new drugs and the identification of new drug targets. Structural biology plays a clear role in this endeavour. Understanding the function of Mtb proteins, and indeed also that of existing protein drug targets, often requires detailed knowledge of their structure. The lipid-metabolizing proteins are considered good potential drug targets, and anti-mycobacterial compounds have successfully targeted proteins involved in lipid metabolism, notably the front-line drug isoniazid. SysteMTb elucidated the complex interaction between protein complexes at the molecular level and supplied scientifically relevant information on a fundamentally important mechanism in Mtb. An integrated structural, biochemical and genetics approach was applied in order to reveal the physiological relevance of these complexes. In addition to advancing our understanding of Mtb, it created a unique resource to gain insights into possible new targets for therapeutic intervention. In addition, the results obtained have, and will continue to have, a great impact on TB research in Europe, since the work tackles a unique mechanism using an innovative multidisciplinary approach with the prospect of reaching a possible intervention measure in our war against TB. The SysteMTb consortium provided a detailed picture of the molecular architecture of M. tuberculosis ZmpI, which is an important protein in the complex mechanism the bacterium developed to avoid its clearance by human macrophages. For this enzyme we not only described the three-dimensional structure but, guided by this information, were also able to design potent and specific enzyme inhibitors that will be essential to pharmacologically validate the enzyme as a potential novel drug target.
After resuscitation of the bacterium from the dormant state, the striking observation was made that its genome remained intact and almost identical to that of the bacterium at the time of the primary infection event. Since M. tuberculosis is an intracellular pathogen and can live for a very long time within macrophages, where DNA-damaging agents are constantly produced, we investigated the molecular machinery that M. tuberculosis developed to repair the damage occurring to its DNA. In particular, we determined the structure of two proteins involved in DNA repair, namely UvrA and OGT. Both proteins were studied in different conditions that also included the mutated forms of OGT observed in clinical isolates of multi-resistant M. tuberculosis. One of the major threats from M. tuberculosis is the constant increase of strains that are resistant to the available drugs and that need longer treatment with a higher risk of failure. The structural analyses provided a clear explanation for why the mutated forms of OGT found in the resistant strains are unable to repair DNA efficiently, allowing the bacterium to mutate its genome more quickly and to adapt to and survive new environmental conditions, namely those determined by the administration of a drug. Moreover, thanks to very efficient and productive collaborations within SysteMTb, we were able to demonstrate that OGT and UvrA form a stable protein-protein complex, which had never been reported before either in M. tuberculosis or in other organisms. This observation suggests that a cross talk between two major mechanisms of DNA repair, currently seen as acting independently from each other, could exist and represent a novelty in the biology of bacteria. Furthermore, we determined the crystal structure of three other proteins.
One, PstS3, is involved in supplying essential inorganic phosphate to the bacterium by importing it from the host; the other two, malate dehydrogenase and citrate synthase, play a key role in providing metabolic energy to the bacterium.

Mycobacterial protein complexes
In all organisms proteins rarely act alone but rather form various transient or stable complexes. Some of those protein complexes are very dynamic and change in response to environmental changes. Therefore, the number and composition of the protein complexes in the cell reflect the physiological state of the organism. In order to understand the physiology of a particular organism it is essential to decipher this complex interaction network. Information about protein-protein interactions may help to understand the biology of, for example, a pathogen and may further support the development of cytotoxic drugs that target particular physiologically important interactions. In the case of Mycobacterium tuberculosis, knowledge about proteome organisation was very limited. Within SysteMtb we have developed an efficient procedure to identify interaction partners of proteins based on affinity purification and mass spectrometry. In this study we used a very closely related organism, the attenuated vaccine strain Mycobacterium bovis BCG. BCG is a non-pathogenic bacterium with 99.95% genetic identity with M. tuberculosis. Work with BCG can be carried out in laboratories that lack the facilities required for the dangerous pathogenic strains.
Initially we optimised procedures using a limited number of targets, which was followed by a more global survey of the proteome. Using our newly developed method we have targeted about 10% of mycobacterial proteins. To this end, selected proteins were cloned and at the same time modified by addition of a special tag, an enhanced Green Fluorescent Protein (eGFP). This tag, along with specific recombinant anti-eGFP antibodies, allows efficient purification of proteins interacting with the targeted protein. The purified protein complexes were analysed using mass spectrometry to identify the composition of the complexes (Plocinski P. et al., 2014, PLoS One). The analyses were concentrated on proteins involved in crucial pathways in Mycobacterium tuberculosis, e.g. gene expression, basic metabolism, and cell division. We have identified several novel interactions and validated others which were predicted by evolutionary conservation. In some cases the identified complexes were analysed in more detail using structural and biochemical approaches.

RNA bound proteome
The stability of mRNA molecules encoding proteins plays a crucial role in the regulation of gene expression, but this aspect of mycobacterial physiology has not yet been deeply explored. Proteins interacting with RNA govern mRNA stability and directly influence the level of proteins produced in the cell. In order to identify regulators of RNA stability we globally analysed proteins interacting with RNA in mycobacteria. We cross-linked RNA with proteins in vivo through incorporation of a UV-activatable nucleotide analogue, which was followed by biochemical purification and identification of RNA-interacting proteins by mass spectrometry. The analysis revealed that the most prominent RNA interactors are proteins involved in RNA decay, particularly components of the large macromolecular assembly called the degradosome. This prompted us to perform a more detailed analysis of the targets and functions of the mycobacterial RNA degradosome. We have determined genome-wide targets of the degradosome subunits PNPase, RNase E, RhlE and RNase J using PAR-CLIP experiments, which were supplemented by analysis of phenotypes caused by inactivation of PNPase, a main degradosome component. The data collected in our experiments provide the first insights into the major RNA degradation pathway in mycobacteria and identify the key RNA targets for degradosome-specific decay. Our data revealed that the degradosome has a pronounced effect on transcriptome homeostasis. Interestingly, an M. smegmatis mutant depleted in PNPase shows apparent accumulation of blaC and Peptidase M16 (genes downstream of PNP within the same operon), proving that autoregulatory mechanisms govern PNPase intracellular levels. Importantly, our data suggest that PNPase may be responsible for switching the mode of energy metabolism in fast-growing mycobacteria.

Protein localization by light microscopy
Here, our focus was to create a pipeline of high-throughput, high-resolution methods for generating, imaging, and analyzing fluorescent protein fusions in vivo. We focused on the localization of a subset of library targets that could have a specific predicted function in the Mtb cell cycle. In this direction, we set up a new methodology to obtain protein localization data that can support the modelling of the cell cycle. To this end, we developed an experimental protocol for protein localization, based on a modified agar pad method. Through a series of optimization iterations, we were able to implement the method in 96-well plate form, allowing a high-throughput / high-content approach. Information about Mycobacterium tuberculosis subcellular protein localization is important for protein function prediction and identification of suitable drug/vaccine/diagnostic targets. In addition to localization efforts, these methods should benefit other types of functional genomic studies in Mtb. In the context of the project we have also developed a FACS-based methodology which will be used for high-throughput screening of protein-protein interactions.

Kinetic data of series of metabolic enzymes
The analysis of metabolites together with the detailed biochemical characterization of enzymes from the centre of metabolic rearrangement indicated that a reducing environment and interaction of Mtb enzymes with proteins from intracellular redox homeostasis regulate the carbon flux in Mtb. Detailed knowledge of this switch will facilitate the design and development of new types of tuberculosis drugs. The kinetic data of a series of metabolic enzymes measured by biochemical SysteMTb partners were implemented into the cell cycle modelling and the construction of a metabolic model of Mtb growing under different conditions.

SysteMTb cloning facility and production of highly purified recombinant mycobacterial proteins
The consortium's cloning facility provided all the necessary vectors to the other SysteMTb groups and created a central library collection, available to the scientific community. The facility designed a set of Gateway-compatible destination vectors useful for the functional analysis of Mtb ORFeomes. A high-throughput pipeline for generating an inducible vector library for the expression of Mtb fluorescent protein fusions was implemented and used for the quantitative genomic assessment of the distributions of both N- and C-terminal fluorescent protein fusions in Bacillus Calmette-Guérin (BCG). We obtained more than 3000 Mtb ORFs tagged with a fluorescent protein in a form that is readily usable for this kind of quantitative genomic assessment. Within SysteMTb we produced highly purified recombinant mycobacterial proteins for biochemical and structural investigations. We also used the purified proteins for the generation and isolation of specific poly- and monoclonal antibodies. The co-expression and co-purification of recombinant mycobacterial ACCase complexes was an ambitious challenge. After the construction of several different expression vectors it was possible to isolate strains that produce the corresponding subunits of at least three ACCase complexes. Procedures for co-purification of these complexes were developed, enabling the enrichment of complexes for further studies. Enzymes involved in the tricarboxylic acid cycle (Krebs cycle) and glycolysis, as well as in alternative pathways, were produced after cloning. Poly- and monoclonal antibodies against proteins involved in the complex mechanism of cell division and against DNA-binding targets that probably function as transcriptional regulators were generated.

Transcriptional characterisation
Tuberculosis bacteria respond to the type of environment that they experience during human infection and can be characterised in this way. While previous work has partially characterised such responses using one class of molecules (the “transcriptional” response), a key feature of our approach was to generate a holistic view by simultaneous characterisation at multiple levels. A major finding from this work is that the conventional approach of transcriptional profiling provides only a partial, and sometimes misleading, perspective. We were able to distinguish classes of RNA that differ in the way they interact with ribosomes to produce proteins, and important physiological adaptations of the bacteria that were independent of transcriptional changes. We were also able to improve our understanding of the ways in which chromosome structure can contribute to transcriptional regulation, and to assist in characterisation of the cellular machinery that degrades RNA molecules. The importance of this work is that it helps us to understand how the bacteria adapt during the process of infection, and thus to identify optimal strategies for discovery and development of more effective antibacterial drugs.

Metabolic network
We developed the core technology to quantitatively assess mycobacterial metabolism as a network of interacting elements. In particular, methods were established that make it possible i) to assess concentration changes of hundreds of intracellular metabolites (Figure 2) and ii) to track the movement of metabolic flux within cells. Furthermore, computational methods were developed for data integration and interpretation, enabling researchers to generate testable hypotheses on the intracellular operation of mycobacteria.
These technological advances have placed the SysteMTb consortium in a unique position to address open fundamental questions related to the unique survival strategy of mycobacteria in the human host. In various collaborative projects with mycobacteria specialists within (and beyond) the SysteMTb consortium, these core technologies were applied to many specific projects that range from understanding the metabolic and stress response of mycobacteria to commonly encountered host conditions all the way to identifying the nutritional basis of M. tuberculosis during infection of human hosts. One of the most frequent stress conditions for latent tuberculosis infections, which affect an estimated third of the human population worldwide, is oxygen limitation. Using our new analytical methods, we unravelled the metabolic re-arrangements of mycobacteria during the transition from exponential growth to growth arrest induced by oxygen limitation, a non-replicating state in which these bacteria can survive for a long time and remain highly tolerant to chemotherapeutics. The discovered re-routing of metabolic fluxes identified unexpected potential chemotherapeutic targets against non-growing tubercle bacilli, holding the promise of combating latent tuberculosis. Despite the discovery of M. tuberculosis as the causative agent of tuberculosis more than a hundred years ago, its diet during infection has remained unclear. In collaboration with tuberculosis experts in the consortium, we profiled the metabolome during infection of human macrophages with M. tuberculosis, representing the early stage of tuberculosis infection in the lung. Integration of the metabolite and transcriptional data with a genome-wide metabolic network model identified distinct nutrient exchange between the human host cell and the intracellular pathogen. The bacterial machineries enabling uptake and utilization of these nutrients are potential points of intervention to stop proliferation of infecting M. tuberculosis and so avoid disease propagation at early stages after bacterial transmission.
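
The idea of treating metabolism as a network of interacting elements, with fluxes that can be re-routed, can be illustrated with a very small sketch. The network below is a hypothetical three-metabolite toy, not the consortium's genome-wide model, and all names and numbers (`S`, `steady_state_fluxes`, the uptake rates and branch fractions) are assumptions made only for illustration. It shows the steady-state mass balance that flux models of this kind typically rely on: at steady state, the production and consumption of every internal metabolite must cancel.

```python
# Toy network (hypothetical, NOT the SysteMTb model):
#   v_uptake: -> A,  v_ab: A -> B,  v_ac: A -> C,  v_out_b: B ->,  v_out_c: C ->
METABOLITES = ["A", "B", "C"]
REACTIONS = ["v_uptake", "v_ab", "v_ac", "v_out_b", "v_out_c"]

# Stoichiometric matrix S: one row per metabolite, one column per reaction.
S = [
    [1, -1, -1, 0, 0],  # A: produced by uptake, consumed by both branches
    [0, 1, 0, -1, 0],   # B: produced by v_ab, consumed by v_out_b
    [0, 0, 1, 0, -1],   # C: produced by v_ac, consumed by v_out_c
]


def net_production(stoichiometry, fluxes):
    """Return S*v, the net production rate of each metabolite."""
    return [sum(s * v for s, v in zip(row, fluxes)) for row in stoichiometry]


def is_steady_state(stoichiometry, fluxes, tol=1e-9):
    """True if no metabolite accumulates or is depleted (S*v == 0)."""
    return all(abs(r) <= tol for r in net_production(stoichiometry, fluxes))


def steady_state_fluxes(uptake, fraction_to_b):
    """Build a flux vector that satisfies S*v = 0 for a given branch split."""
    v_ab = uptake * fraction_to_b
    v_ac = uptake - v_ab
    return [uptake, v_ab, v_ac, v_ab, v_ac]


# Two illustrative states with assumed numbers: most carbon flows through
# branch B in the first, and the flux is re-routed towards branch C in the second.
growing = steady_state_fluxes(uptake=10.0, fraction_to_b=0.8)
rerouted = steady_state_fluxes(uptake=2.0, fraction_to_b=0.25)

for name, fluxes in (("growing", growing), ("rerouted", rerouted)):
    print(name, dict(zip(REACTIONS, fluxes)), is_steady_state(S, fluxes))
```

In a real analysis the stoichiometric matrix would come from the reconstructed metabolic network and the fluxes would be constrained by measurements rather than chosen by hand.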

Quantitative proteomic data for Mtb in disease-associated states
Proteins are the class of biomolecules that carry out and control essentially all biological functions and are the targets of most drugs. Therefore, the consistent and accurate quantitative measurement of proteins on a large scale is crucial to gain a better understanding of the molecular phenotype of a cell and for designing new intervention measures against pathogens. Due to the lack of suitable methods, proteins have, however, so far not been amenable to high throughput profiling in Mtb. Over the past years, targeted mass spectrometric techniques, such as Selected Reaction Monitoring (SRM) and SWATH MS, have emerged which use specific mass spectrometric coordinates of each protein to achieve highly consistent, sensitive and accurate protein quantification. In the context of SysteMTb, we established a strategy to build a library of high-quality SRM and SWATH MS assays for the entire proteome of Mtb. We then applied this strategy to develop and validate assays for tens of thousands of peptides of Mtb, representing 97% of all the 4012 annotated proteins of Mtb. This assay library called “The Mtb Proteome Library” is a powerful and worldwide unique resource that allows for the first time unbiased, high-quality protein measurements in Mtb and thus facilitates new research towards the elucidation of complex biological processes in Mtb on a system-wide level. Many disciplines in systems biology, such as mathematical modelling, benefit from absolute protein concentrations. We therefore also implemented a method to estimate for the first time absolute cellular protein concentrations of Mtb on a genome-wide scale based on large-scale targeted proteomic data. We applied this method to study the global remodelling of the Mtb proteome for example under hypoxia, a clinically highly relevant stress condition. The resulting data set provides an unprecedented inventory of absolute protein concentrations of over 2000 proteins and their regulation in response to stress. 
The data set is complementary to existing proteomic and transcriptomic data and provides novel insights into proteome composition and its stress-induced dynamic reorganisation. We furthermore monitored Mtb proteins during infection of macrophages and were able to obtain a time-resolved, molecular view of the intracellular life of Mtb during acute infection of its primary habitat, the human macrophage. Additionally, host proteins were measured to gain a more complete picture of the host-pathogen interaction during intracellular infection. In summary, in the context of SysteMTb, we developed the experimental resources and computational strategies and tools that are required to comprehensively and accurately quantify essentially all expressed proteins in Mtb and applied them to obtain unprecedented quantitative proteomic data sets of Mtb in important disease-associated states.

Lipidomics and Glycomics of the cell envelope
Mycobacteria are known to possess a unique cell envelope characterized by its lipid richness, up to 40% of the dry weight, with exotic structures, explaining in part the poor permeability of bacterial cells. In addition, these lipids are involved in mycobacterial virulence and their biosynthesis is the target of known antituberculous drugs (e.g. isoniazid). Furthermore, holistic polysaccharides are produced by mycobacterial to protect themselves against host attack. How the different lipid and carbohydrate compounds respond to various stress and environment changes can be addressed, provided the used of adapted techniques. Because of the complex structures encountered in mycobacterial cell envelope, dedicated analytical tools have to be set-up and used to decipher and quantify them. This requires the development of highly performance methods such as High Performance Thin-Layer Chromatography (HPTLC), Ultra-Performance Liquid Chromatography-Mass Spectrometry (UPLCMS). Outermost capsular polysaccharide components were characterized by Gas Chromatography-Mass Spectrometry, a method that was first validated for quantification. New analytical methods were developed to analyse the extraordinary complex lipids, primarily mycolic acids (C80-90 fatty acids), and quantify them. Using HPTLC we separated the main three mycolic acids of Mtb, which were further resolved in terms of chain lengths by UPLCMS analysis. The other lipids were analysed and quantified by HPTLC. About 80 lipid and carbohydrate constituents were analysed and quantified. The developed methods were sensitive enough to allow us to analyse not only the bacillus lipidome and glycome in vitro context mimicking conditions encountered in hostile environment (e.g. acidic pH, nitric oxide, hypoxia), but also into the macrophage and in an infected human lung. This results in a realistic picture of the adaptation of the cell envelope to various stresses (Figures 3 and 4).
Identification of inhibitors for leucyl-tRNA synthetase
To identify inhibitors rapidly, we set up a new cellular screening system for an essential enzyme of Mycobacterium tuberculosis. The system could be used both for the development of new antibiotics and as a tool to explore the dynamics of protein synthesis in this organism. We chose to work with leucyl-tRNA synthetase, an essential component of the protein synthesis apparatus of the pathogen. The assay developed is a cellular assay that is ready for high-throughput screening and would allow the inhibitory activity observed in the cellular system to be confirmed without confounding variables. Both the In Omnia assay and a classical enzymatic assay have been used to characterize potential inhibitors. The assay is in place, and we have used the system to screen potential inhibitors of the enzyme. We have also made the assay available to the research community, where it will be useful in the analysis of perturbations that affect genetic translation.

Data storage and exchange
To support the storage and sharing of the large amount of data generated in the consortium, a data storage platform was created. The raw transcriptome, proteome, metabolome and lipidome data are stored here, together with tables giving consistent descriptions of the individual experiments. Through the same platform, a genome browser was made available in which expression data could be visualized with the genome as the coordinate system, together with an interactive visualization platform for protein-protein interactions.

iTuby: A tuberculosis pathways visualization tool
We used state-of-the-art web-technology and visualization tools to create a pathways visualization tool, completely tailored for Mycobacterium tuberculosis. The tool can be and has been used to visualize high-throughput data generated in the groups of SysteMTb in the context of metabolic, information and regulatory pathways.
Genome annotation
We used automated text-mining, orthology mapping and genomic context methods to provide functional annotations for a large number of protein-coding genes that had so far been annotated only as ‘hypothetical protein’.

Dynamic model of central carbon metabolism
One of the major modelling efforts was the elaboration of a dynamic model of central carbon metabolism. This model was parameterized in a strong collaboration between theoretical and experimental consortium partners. Model simulations only agree with experimental data under assumptions of certain regulatory interactions. Comparison between these predicted regulatory mechanisms and literature indicates a high probability that these interactions exist in Mtb and are relevant for the bacterium to adapt to changing environments. The work on this model and the integration of information from this approach with whole genome metabolic models has yielded a new computational approach to bridge the gap between small dynamic and large steady state models. Regarding the details of our findings, we would like to emphasize the importance of interdisciplinary collaborations within SysteMTb. Each of the involved groups contributed unique expertise. The biological expertise and experiments were supplemented by theoretical work that integrated the different measurements. This step ensured that the understanding of the observed effects is consistent over the entire time span and over all different experiments. For this approach, we developed basic models to reproduce bacterial growth curves and test hypotheses on cell growth, survival, and death as well as complex models that integrate the entire bacterial metabolism. We paid special attention to a model of intermediate complexity that was fed by both the oversimplified as well as the overly complex models. It combines a sensibly parameterized model of general cellular processes that enable growth with detailed models of the mechanisms observed in experiments. This model is intended to help us understand why the mechanisms we observe take place and how perturbations influence cellular survival. Work on this model is still ongoing. The main effects that we observed are the following:
- Immediately upon resuscitation, the majority of cells that are able to grow die.
- Cellular metabolism responds immediately. In particular, the rerouting of carbon flux that sustains hypoxic dormancy seems to be reversed almost immediately upon re-aeration. The response of other pathways reflects their role in metabolism (efficient utilization vs quick detoxification).
- Newly expressed proteases seem to degrade proteins that were important for dormancy. Their building blocks are potentially recycled to produce new proteins.
- Mechanisms necessary to protect from aerobic respiration are synthesized quickly.
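
The basic growth-curve models mentioned above can be illustrated with a minimal sketch. The Python code below is not the consortium's model: it assumes a logistic growth term combined with a first-order death term, and every name and parameter value (`simulate_growth`, `growth_rate`, `death_rate`, `capacity`) is hypothetical and chosen only for illustration.

```python
def simulate_growth(n0, growth_rate, death_rate, capacity, hours, dt=0.01):
    """Integrate dN/dt = r*N*(1 - N/K) - d*N with the explicit Euler method.

    n0          -- initial number of viable cells (hypothetical value)
    growth_rate -- r, per hour (assumed)
    death_rate  -- d, per hour (assumed)
    capacity    -- K, carrying capacity of the culture (assumed)
    hours       -- total simulated time in hours
    Returns the list of population sizes at every time step.
    """
    n = float(n0)
    trajectory = [n]
    steps = int(round(hours / dt))
    for _ in range(steps):
        dn = growth_rate * n * (1.0 - n / capacity) - death_rate * n
        n = max(n + dn * dt, 0.0)  # a population cannot become negative
        trajectory.append(n)
    return trajectory


if __name__ == "__main__":
    # Growth exceeds death: the culture settles near K * (1 - d / r).
    growing = simulate_growth(1e6, 0.05, 0.01, 1e9, 1000.0)
    # Death exceeds growth: the culture declines.
    dying = simulate_growth(1e6, 0.01, 0.05, 1e9, 100.0)
    print(f"final size, growing culture: {growing[-1]:.3e}")
    print(f"final size, dying culture:   {dying[-1]:.3e}")
```

With such a model, a hypothesis such as "most cells die immediately upon resuscitation" could be tested by making `death_rate` time-dependent and comparing the simulated curve with measured viable counts.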

Modeling metabolism and regulation of Mtb
During tuberculosis (TB) infections, Mycobacterium tuberculosis (Mtb) resists the harsh environment of the human host and actively manipulates host cells to ensure prolonged intracellular survival. Access to data from on-going infections in human hosts is hampered by the invasive nature of the procedures required to obtain the corresponding samples, and so far no in vitro model that precisely mimics the host-pathogen interactions is available. It is therefore of paramount importance to find alternative ways of characterizing the host-pathogen interactions during infection. We have built extensive computational models of the metabolism and regulation of Mtb, and we have integrated the models with experimental data generated within the consortium to represent the infectious state accurately. The genome of Mtb contains around one thousand genes encoding proteins whose function is directly related to metabolism. Genome-scale constraint-based (GSCB) metabolic models form the foundation for the many strategies that enable exploration of mycobacterial metabolism. Within SysteMtb we have followed an iterative strategy of modelling, hypothesis generation and experimentation to produce the most extensive reconstruction of the metabolism of Mtb. Our model contains information on 1192 reactions, 915 genes and 929 metabolites. It has been built on the basis of previously developed models and has been thoroughly curated using the available literature. The model has been quantitatively validated using data from growth experiments and 13C measurements. Pathways that are extremely important during infection, such as the cholesterol degradation pathway, have been extensively curated. In addition, mycobacteria-specific pathways such as mycolic acid and dimycocerosate ester biosynthesis have been reconstructed in close collaboration with experts on mycobacterial lipids. To survive and cause infections, the bacterium has to adapt to the challenging conditions in the host.
A huge amount of experimental data on the mechanisms by which Mtb adapts to different perturbations has been produced and independently analysed in the last decade. However, the aggregate of these data still contains a wealth of new information that cannot be extracted when each dataset is analysed individually. We have developed an approach to integrate heterogeneous networks and generated a genome-scale transcriptional regulatory network to study the mechanisms that Mtb uses to adapt to conditions in the host. Specifically, we have characterized the response to DNA-damaging conditions, low oxygen, high nitric oxide, low metal and co-factor availability, and limited availability of energy sources, which are some of the characteristics of the host environment. However, information on on-going infection has so far been limited. RNA sequencing (RNA-seq) makes it possible to capture the global transcriptomes of host and pathogen simultaneously. For mycobacteria, however, the relatively low number of bacteria per host cell results in an unfavourable bacteria/host RNA ratio. We developed a bacterial enrichment protocol to overcome this challenge. As a result, dual RNA-seq data containing the most comprehensive transcriptome of intracellular mycobacteria to date have been generated. Integrating this dataset with the GSCB metabolic model and the regulatory networks revealed the importance of cholesterol degradation, iron acquisition and recycling of mycolic acids for the intracellular endurance of mycobacteria. It also allowed us to study the interplay between KstR and KstR2 (regulators of the cholesterol degradation pathway) and IdeR (iron-dependent repressor) and their critical effect during infection. The development of this enrichment technique paves the way to using non-conventional samples, such as sputum samples, in combination with the genome-scale models to obtain a real-time description of the behaviour of the pathogen within the human host.
To further explore the observed interplay between iron homeostasis and pathogenicity we have extended the model of iron homeostasis by Ghosh et al. into a qualitative model including the main components connected to iron homeostasis in Mtb. A WikiPathways version of this model will be made publicly available to the Mtb community.
Extensive research has already been done on the different factors that influence the risk of TB and the outcome of the disease and its treatment: biological (e.g. sex, race, health and nutritional status, the concurrence of additional diseases such as diabetes mellitus or HIV, and the drug resistance characteristics of the infecting strain), societal (e.g. gender and poverty), and geographical, among others. The metabolic and regulatory models as well as the dynamic models were developed as semi-independent modules of a large multi-scale modelling framework that connects a variety of models at different scales, each describing a particular factor that determines the outcome of the disease. This paves the way to the grander vision of a model-based “Virtual TB Patient” with enormous potential for the identification of potential drug targets, for the identification of pathogen and host traits that affect the outcome of the disease and its treatment, and for the development of new therapeutic strategies.
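
The core constraint behind the genome-scale constraint-based (GSCB) models described in this section is that, at steady state, every internal metabolite is produced as fast as it is consumed, and every flux stays within its bounds. The Python sketch below illustrates only that constraint on a hypothetical four-reaction toy network; it is not the SysteMtb model, it does not perform the optimisation step of flux balance analysis, and all names (`STOICH`, `BOUNDS`, `is_feasible`) and numbers are assumptions made for illustration.

```python
# Hypothetical toy network (not the SysteMtb model):
#   R0: uptake   ->  A
#   R1: A        ->  B
#   R2: B        ->     (biomass)
#   R3: A        ->     (waste)
# Rows are metabolites (A, B); columns are reactions R0..R3.
STOICH = [
    [1, -1, 0, -1],  # metabolite A
    [0, 1, -1, 0],   # metabolite B
]
# Assumed lower and upper bound for each flux, in arbitrary units.
BOUNDS = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]
BIOMASS_INDEX = 2  # column of the biomass reaction


def mass_balance_residuals(stoich, fluxes):
    """Return S*v: net production rate of each metabolite."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in stoich]


def is_feasible(stoich, fluxes, bounds, tol=1e-9):
    """True if the fluxes satisfy S*v = 0 and all flux bounds."""
    residuals = mass_balance_residuals(stoich, fluxes)
    balanced = all(abs(r) <= tol for r in residuals)
    within = all(lo - tol <= v <= hi + tol for v, (lo, hi) in zip(fluxes, bounds))
    return balanced and within


if __name__ == "__main__":
    candidates = {
        "all carbon to biomass": [10, 10, 10, 0],
        "part of carbon wasted": [10, 6, 6, 4],
        "B accumulates": [10, 10, 8, 0],
    }
    for name, fluxes in candidates.items():
        ok = is_feasible(STOICH, fluxes, BOUNDS)
        print(f"{name}: feasible={ok}, biomass flux={fluxes[BIOMASS_INDEX]}")
```

In a full constraint-based analysis, a linear-programming solver would search the feasible set for the flux distribution that maximises an objective such as the biomass flux, rather than testing hand-picked candidates.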
Potential Impact:
Tuberculosis, the re-emerging global health threat caused by Mycobacterium tuberculosis, costs the EU 5.9 billion euros per year, and this cost is expected to rise with the rapid emergence of Multi-Drug-Resistant (MDR) and Extensively-Drug-Resistant (XDR) strains. Understanding the physiology and pathogenicity of the tubercle bacillus is fundamental for developing new and effective strategies to combat TB. It will also assist in deciphering the cross-talk between the pathogen and its host. The analysis of the very complex protein-protein interactions in M. tuberculosis, combined with computational and informatics strategies, helped to find the ‘weak points’ of the pathogen, the potential drug targets. The global results achieved by the SysteMtb project are the outcome of the integrated effort of multiple research groups in leading European laboratories. We were able to combine the different molecular readouts generated from separate perspectives into a holistic model, which allowed us to unveil key molecular interactions underlying important biochemical mechanisms within the bacterial cell. The team effort overcame geographical and socio-political boundaries by establishing a network of highly trained scientists across Europe. The methodology developed for analysing protein-protein interactions using FACS could have a deep impact in this field. The results of the ‘in vitro’ mycobacterial culture measurements, the ‘in vivo’ macrophage cell culture infections and the systems data from the ‘ex vivo’ lung resections from human TB patients, together with the novel theoretical models, may help to predict and diagnose Mtb infection and the progression of human TB in more detail. These findings may ultimately lead to new insights in TB drug and vaccine development.
We have identified several novel checkpoints of Mtb infection that may be exploited to generate new intervention strategies; such strategies are urgently needed in light of the 9 million new clinical TB cases occurring each year. Rather than generating a series of separate transcriptomic perspectives based on different molecular readouts, we integrated them into a more informative holistic model that allowed us to identify key interactions that crossed conventional borders. Similarly, in terms of human research potential, we established effective collaborative interactions that crossed national boundaries, creating a network of young researchers whose benefits will extend well beyond the timeframe and specific remit of the SysteMtb project. The lipidomics and glycomics data led to better knowledge of the metabolism of Mycobacterium tuberculosis, especially its capacity to adapt to the in vivo context. The ability to reliably detect and accurately quantify any protein of the pathogen’s proteome has not only benefitted SysteMtb itself but may also have a broader impact on general Mtb research in the coming years by facilitating system-level approaches to complex biological processes. Further, these tools may open new avenues in key application-related fields of mycobacterial research, such as vaccinology and drug discovery, to improve intervention and prevention measures for tuberculosis. SysteMtb has significantly improved the understanding and prediction of mechanisms that fundamentally underlie the disease-relevant metabolism, an important basis for the rational development of new therapies and drugs. Extensive interactions between top mycobacterium labs and systems biology departments within the project have generated a critical mass of EU research capacity on this important disease and established strong links to the most relevant systems biology and Mtb initiatives worldwide.
As a tangible outcome, several partners became leading members of national and international Mtb projects beyond the lifetime of SysteMtb. A specific example is a new large-scale Swiss project on Mtb drug resistance. SysteMtb thus provided the foundation for a new, more systems-oriented research culture on tuberculosis. Extensive training of a new generation of researchers, and encouragement of their mobility, has produced a new breed of researchers, several of whom are now starting their own careers with ERC grants or are being hired as postdocs by leading laboratories worldwide. The work we carried out will affect scientists working in the field of Mtb biology, in particular on host-pathogen interaction, DNA repair and central metabolism. However, since some of the molecular systems we investigated led us to discover unprecedented mechanisms that turned out to be conserved in other bacteria, and in some cases also in humans, our results have an impact in other fields as well. In particular, the mechanism adopted by M. tuberculosis OGT to recognize damaged DNA is likely to be conserved in its human ortholog AGT, a protein that is a hotspot for the discovery of novel drugs to treat cancer. Along the same lines, M. tuberculosis ZmpI turned out to be structurally very similar to human neprilysin, a major target for the treatment of Alzheimer's disease. Indeed, some of the inhibitors we designed to block M. tuberculosis ZmpI also showed significant activity against human neprilysin, making them interesting for further investigations in neurodegeneration. Moreover, all our data are of general interest to both biochemists and structural biologists. The overarching joint SysteMtb experiment provides the first dataset that integrates several omics approaches consistently to enable quantitative modelling. This dataset enables any researcher to test, refine and extend the hypotheses that we drew.
We expect that this will quickly increase the understanding of Mtb, which will eventually underlie new treatment strategies based either on new drug discoveries or on new temporal treatment schemes. The focus of the joint SysteMtb experiment on resuscitation sets it apart from most studies on hypoxic dormancy. Given a positive reception by the scientific community, this may lead to a greater diversification of efforts to understand mycobacterial survival strategies and build specific leading expertise in an emerging field of research. The successful collaboration between theoreticians and experimentalists in this consortium highlights the importance of systems biology initiatives and the power of combining specific areas of expertise. We expect the consortium collaborations to continue, for example in the development of methods for model integration or in the integration of different data types within a modelling framework. The tools for systems biology data visualization that we developed will provide deeper knowledge and understanding of the molecular processes that are affected under different environmental conditions.
Dissemination
The SysteMtb consortium was in the fortunate position to organise a joint workshop on “Systems Biology of Tuberculosis” together with the “NIH Tuberculosis Systems Biology Program” at the TB Keystone Conference on 17 March 2013 in Whistler (Canada). Matthias Wilmanns (Partner 8b - EMBL) gave an introduction to SysteMtb and chaired the session, which had ca. 200 participants and initially comprised four SysteMtb speakers (one from each sub-project) as well as three additional speakers. Jeppe Mouritsen's talk was promoted to a very prominent plenary session with ca. 400 participants on 16 March 2013. SysteMtb members presented nine posters at the conference. The project manager presented the overall work of the SysteMtb consortium with a dedicated overview poster. Of the twelve SysteMtb attendees at the event, eight were junior members and four were PIs. The event was a big success and significantly increased the international visibility of SysteMtb. Several international workshops and meetings took place among the different partners. In collaboration with Partner 2 - Kaufmann and Partner 13 - Klipp, the SysteMtb management team created a popular scientific animation video for the project website (www.SysteMtb.bio) to inform a broad public about the project's results in modelling the cell cycle.
Exploitation
The Mtb fluorescent protein fusion collection is a toolkit for a variety of systematic and large-scale localization studies in Mycobacterium tuberculosis. Approximately 3000 Mtb genes have been cloned with a C- and N-terminal YFP tag. This library represents a remarkable resource for the whole scientific community and is publicly available at www.systeMtb.bio. We will also explore the possibility of patenting our FACS methodology for protein-protein interactions. Active molecules against M. tuberculosis leucyl-tRNA synthetase are being further characterized. If the chemical families that these molecules represent prove to be of interest in forthcoming analyses, we will consider applying for a patent to protect this intellectual property. Omics technologies developed within SysteMtb are currently being exploited for new Mtb projects, in particular those that deal with patient-specific Mtb strains in the field. Such projects would not be feasible without the groundwork laid by SysteMtb. The SysteMtb project generated large-scale high-throughput datasets and methods that, following complete data analysis, will be deposited in public databases permanently available to the research community. Such resources are of high value as reference datasets but also act as seeds for new projects on TB drug discovery as well as vaccine and biomarker development. For example, the data from lipidomics and glycomics were deposited in the STB databases. To facilitate accessibility and distribution of the mass spectrometric resources developed here, we established an Mtb build in several established proteomic databases: (i) PeptideAtlas is a database that serves as a general reference in the proteomic field. Since we initiated the Mtb build in this database, it has come to include data from most of the major proteomic studies performed in the past four years.
(ii) The SRMAtlas and SWATHAtlas databases serve as repositories for the extensive assay libraries that we generated and allow easy access to and download of all relevant information. (iii) PASSEL is a database that provides detectability information for all assays in unfractionated whole-cell lysates of Mtb. All these databases have been described in the corresponding publications and are publicly accessible. Additional datasets and computational workflows have in part been published, or will soon be published, in peer-reviewed journals. Ten files of atomic coordinates of the different proteins we investigated have been deposited in the Protein Data Bank (www.rcsb.org), a public open-access repository of atomic coordinates that can be freely downloaded and used worldwide. All these atomic coordinates are therefore available for structure-based drug design. Indeed, we were able to design potent and specific inhibitors of ZmpI by adopting this approach. Moreover, protocols for protein expression, purification and crystallization are available for seven different targets, and biochemical assays for all of them have been developed and are ready to be used in any screening activity, including high-throughput screening, for the discovery of small molecules that interfere with the activity of the specific target. The database of protein-protein interactions in Mycobacterium tuberculosis is available. The data generated within the joint experiment will be made available to the community through direct download, through a database and through publication. We will deposit the novel protein interactions in a public database, most likely IntAct, maintained by the European Bioinformatics Institute. Additionally, our work on modelling central carbon metabolism revealed the need for a computational tool to ensure the consistency of dynamic models with genome-scale models. This tool will be provided through publication.
Our GSCB metabolic model has been included in the publication by Rienksma et al. (2014), Seminars in Immunology, in press. SBML versions 2 and 3 are provided, as well as an xls file. In addition, the model has been deposited in the BioModels database under accession number MODEL1411110000. The qualitative model of iron homeostasis and its link to pathogenesis, which is currently being developed, will be made publicly available as a WikiPathways version, and an SBML version will also be deposited at BioModels upon publication. A Cytoscape plugin with the main characteristics of our Data Integration, Visualization and Analysis tool (Diva) is currently being developed in close collaboration with the German-based company LifeGlimmer GmbH, and we plan to make it publicly available during the next 12 months. The dynamic model of the DevS/DevR regulator, which describes how Mtb enters dormancy under oxidative stress or hypoxic conditions, is in progress and will be published in 2015.
List of Websites:
www.SysteMTb.bio

Centre for Genomic Regulation
PRBB Building
International and Scientific Affairs
C. Doctor Aiguader, 88
E-08003 Barcelona
Spain

E-mail: isa@crg.eu
Phone: +34933160374
Web: www.crg.eu
final1-20141203-1-ok-finalreport-figures.pdf