Next Generation Genome Based High Resolution Tracing of Pathogens

Periodic Report Summary 2 - PATHONGEN-TRACE (Next Generation Genome Based High Resolution Tracing of Pathogens)

Project Context and Objectives:
Next Generation Sequencing (NGS) has fundamentally altered genomic research. The rapid development of this technology has enhanced performance and reduced DNA sequencing costs, widening the spectrum of possible cost-effective applications. The high potential for ultra-fast and accurate molecular typing and diagnostics that NGS promises is being exploited in the medical diagnosis of pathogens. In this context, PathoNGen-Trace (PNGT) aims to develop Next Generation Sequencing and next generation DNA analysis tools into a highly efficient and effective technology for diagnostics and the high-resolution typing of microbial pathogens. An international consortium of leading experts in the field of clinical microbiology, including Muenster University, the University of Oxford, and the Research Center Borstel (scientific project coordinator), is working together in the project with well-known commercial enterprises in this area: Applied Maths NV, Genoscreen SAS, Piext BV, and Ridom GmbH.
Three pathogens are being used as models – Mycobacterium tuberculosis complex, methicillin-resistant Staphylococcus aureus (MRSA), and human-pathogenic Campylobacter species. All three pathogens are major causes of human disease worldwide, posing serious medical threats and important challenges for treatment and public health. One third of the world’s population is currently infected with tuberculosis (TB), with new infections occurring at a rate of about one per second globally. Although it remains latent in most cases, over nine million cases of active TB were reported in 2009, with about 1.7 million fatal cases. MRSA is the major cause of hospital-acquired infections, often affecting surgical intensive care units, burn centres, maternity wards and special care baby units. Finally, the pathogens of the genus Campylobacter (C. jejuni and C. coli) are the major causes of bacterial gastroenteritis in humans worldwide and are transmitted from animals to humans especially, but not exclusively, in retail food products. In Europe they are one of the main causes of bacterial intestinal infections, rising above the number of Salmonella infections for the first time in 2005. While classical genotyping has been applied to understand transmission and population structure of these pathogens, recent whole genome sequence (WGS) investigations showed the conventional approaches lack discriminatory power and do not reflect the full level of genome diversity.
The PathoNGen-Trace project aims to overcome drawbacks in conventional typing and diagnosis with NGS-based genotyping for discrimination of clinical isolates at genome wide level. In parallel to the development of new tools, the consortium will also determine yet unknown quantitative parameters of genome evolution on a population basis, in order to calibrate and validate NGS for pathogen genotyping and epidemiological tracing. The main objectives of our project are: (i) to develop new, completely integrated bioinformatics tools for fast and easy quality-controlled data extraction and interpretation for general diagnostics (e.g. drug-resistance) and public health applications (genomics-based molecular epidemiology); (ii) to streamline and implement new internal quality control procedures of the whole NGS process, from sample preparation protocols to final sequence assemblies or mapping; and (iii) to test and validate the performances of NGS for ultra-sensitive/early diagnostics and to monitor the spread of major microbial pathogens.

Project Results:
The major objectives for the project are: (i) the development of easy-to-use software tools for microbial whole genome comparison; (ii) the improvement of NGS-based whole genome analysis workflows and the development of improved NGS kits; (iii) the comparison of different NGS platforms and optical mapping for genotyping of major human pathogens; and (iv) the determination of optimal parameters and in-depth evaluation of NGS genotyping and optical mapping to monitor the spread of Mycobacterium tuberculosis complex (MTBC), MRSA and Campylobacter (CAMPY).
The major results of the five scientific work packages are:

WP1:
Easy-to-use NGS analysis tools have been implemented in the two software packages SeqSphere+® (Ridom GmbH) and BioNumerics® (Applied Maths NV). Different bioinformatics pipelines have been tested to compare their performance in genome assembly/mapping and SNP/InDel detection. A bacterial core genome definer and a genome-wide gene-by-gene typing method were developed, integrated into the SeqSphere+® software, and tested on MTBC, MRSA and CAMPY. Furthermore, an assembly pipeline and a plain-language reporting system from WGS data for physicians, epidemiologists and researchers have been implemented in SeqSphere+®. A command-line-free bioinformatics pipeline for the assembly and analysis of NGS data was implemented in BioNumerics® (quality control, assembly, wgMLST allele analysis). A high-throughput BioNumerics Calculation Engine was developed and tested in a real setting, able to analyze any number of genomes within one hour. Seven different ways to analyze whole genome sequences have been compared using the same platform.
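To illustrate the idea behind gene-by-gene typing, the following minimal Python sketch compares allele profiles locus by locus and counts the loci at which two isolates differ. It is not the SeqSphere+® or BioNumerics® implementation; the isolate names, locus names, allele numbers and the choice to ignore loci missing in either isolate are all assumptions made for illustration.

```python
from itertools import combinations


def allelic_distance(profile_a, profile_b):
    """Count loci whose allele numbers differ between two isolates.

    Loci that are absent or uncalled (None) in either profile are ignored;
    this pairwise-ignore rule is an assumption of this sketch.
    """
    differences = 0
    for locus, allele_a in profile_a.items():
        allele_b = profile_b.get(locus)
        if allele_a is None or allele_b is None:
            continue  # skip loci without a call in both isolates
        if allele_a != allele_b:
            differences += 1
    return differences


def distance_matrix(profiles):
    """Return allelic distances for every pair of isolates, keyed by name pair."""
    names = sorted(profiles)
    return {
        (a, b): allelic_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }


# Hypothetical allele profiles: locus name -> allele number (None = no call).
profiles = {
    "isolate_A": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
    "isolate_B": {"locus1": 1, "locus2": 4, "locus3": 3, "locus4": 7},
    "isolate_C": {"locus1": 2, "locus2": None, "locus3": 3, "locus4": 9},
}

if __name__ == "__main__":
    for pair, distance in distance_matrix(profiles).items():
        print(pair, distance)
```

In practice a core genome scheme contains hundreds to thousands of loci, and the resulting distances are typically used to build clusters or minimum spanning trees for outbreak investigation.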

WP2:
Work performed in WP2 comprised the assessment of the sequencing platform and sample preparation as well as the set-up of quality control. Comparison of benchtop platforms led to the selection of the MiSeq platform. Nextera XT kits have been selected as library preparation kits allowing high multiplexing levels (96 for CAMPY, 48 for MRSA and MTB). Using these optimized settings, large collections have been sequenced (see below) for analysis in WP3-5. Evaluation and optimization of whole bacterial genome amplification protocols and selective genome target capture technologies are ongoing.
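The relationship between multiplexing level and genome size can be sketched with a simple coverage estimate: mean depth is roughly the run yield divided by the number of samples times the genome size. The Python sketch below uses approximate, commonly cited genome sizes and a purely hypothetical run yield; neither figure comes from the project, and real runs lose additional reads to quality filtering and uneven sample pooling.

```python
# Approximate genome sizes in bases (rounded literature values, not project data).
GENOME_SIZE = {
    "CAMPY": 1_700_000,  # Campylobacter jejuni, about 1.7 Mb
    "MRSA": 2_800_000,   # Staphylococcus aureus, about 2.8 Mb
    "MTB": 4_400_000,    # Mycobacterium tuberculosis, about 4.4 Mb
}

# Multiplexing levels per run as stated in the report.
SAMPLES_PER_RUN = {"CAMPY": 96, "MRSA": 48, "MTB": 48}

# Hypothetical usable yield of one sequencing run, in bases (assumption).
ASSUMED_RUN_YIELD = 8_000_000_000


def mean_coverage(run_yield, samples, genome_size):
    """Estimate mean sequencing depth per sample, assuming equal pooling."""
    return run_yield / (samples * genome_size)


def coverage_table(run_yield=ASSUMED_RUN_YIELD):
    """Return the estimated mean coverage for each model pathogen."""
    return {
        name: mean_coverage(run_yield, SAMPLES_PER_RUN[name], GENOME_SIZE[name])
        for name in GENOME_SIZE
    }


if __name__ == "__main__":
    for name, depth in coverage_table().items():
        print(f"{name}: about {depth:.0f}x mean coverage")
```

Under these assumptions, the smaller Campylobacter genome tolerates twice the multiplexing level of the other two pathogens while keeping a comparable estimated depth.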

WP3:
As a key result, the potential of, and parameters for, whole genome sequencing (WGS) analysis in improved molecular-guided tuberculosis (TB) surveillance were defined by NGS analyses of M. tuberculosis (Mtb) strains from a longitudinal outbreak and a study on the global spread of Mtb Beijing strains. To further define optimal parameters for WGS-based pathogen tracing and micro-epidemic studies, genome sequencing of a large retrospective collection from a longitudinal molecular epidemiological study was performed (947 strains from 8 consecutive years), and the data obtained are currently being analysed. Finally, a genome-wide gene-by-gene approach using the Ridom SeqSphere+® software was developed and successfully implemented.

WP4:
GSR and WWU have compared bench-top NGS machines for S. aureus, M. tuberculosis and E. coli, which allows the consortium to choose the most appropriate platform for future work. Furthermore, de novo assemblers have been systematically evaluated. 200 MRSA isolates with good epidemiological data (blood culture and zoonotic MRSA from North-Rhine Westphalia in 2011) were sequenced retrospectively. In addition, all multi-resistant bacteria isolated between 15 October 2013 and 14 April 2014 (six months in total) at the University Hospital Münster, Germany (partner WWU) were sequenced prospectively (667 isolates in total).

WP5:
A rapid high-throughput sequencing pipeline has been developed for the generation, assembly, analysis and dissemination of Campylobacter WGS data and used to characterise large numbers of isolates from a well-defined isolate collection. Ongoing validation of the data from the 726 isolates investigated to date demonstrates very high levels of accuracy and effectiveness. There is good concordance of WGS data with optical mapping technologies, although the latter are most effective in determining genome rearrangements and are relatively insensitive for the detection of nucleotide sequence changes.

Potential Impact:
PathoNGen-Trace will foster the development of new and widespread applications of NGS in clinical microbiology and disease surveillance, ranging from basic research to medical research, diagnostics, and pathogen genotyping. The NGS kits, methodologies, and software developed for highly effective diagnostics and genotyping of major pathogens will directly impact patient management and disease surveillance, and thus reduce health-related costs. In line with foreseen future progress in NGS technologies, this project will promote a major technological shift towards the replacement of the current capillary sequencing by NGS for these applications.
The development of new tools/technologies in this SME-targeted project will overcome existing obstacles for large-scale use of NGS by European clinical microbiologists and scientists, thus fostering competitiveness of Europe in biomedical applications. Importantly, the tools will be developed in formats as generic as possible, to be applicable to a wide variety of micro-organisms. In particular, the proposed new NGS research tools will significantly enhance data generation (e.g. better NGS workflow, new technologies) and improve standardisation (ontology, APIs, and kits), quality control (algorithms) and analysis (new bioinformatics tools). Moreover, the development of the necessary tools for a much wider application of NGS technology will enable the generation of an enormous amount of new knowledge in the European region, and is crucial for increasing the competitiveness of Europe in the areas of "-omics" research and systems biology. Beyond these new key applications and validation of NGS for medical microbiology, we anticipate that the tools developed will also be usable for other relevant NGS-based applications such as comprehensive microbial genomic characterization for bio-banking, bio-processing and synthetic biology.

The tools and knowledge developed by PathoNGen-Trace will bring modern molecular surveillance to the ultimate level of resolution by using whole genome data for pathogen tracing. It will provide software solutions for easy quality-controlled strain classification based on NGS data. Rule-based systems will help to extract information relevant for clinical microbiology, e.g. resistance or virulence determinants. Based on and linked to existing strain typing platforms (e.g. PubMLST.org, SpaServer), quality-controlled nomenclature servers for NGS pathogen tracing will be established. Due to the development of generic tools and databases, PathoNGen-Trace is likely to continue and extend the success of current typing databases (PubMLST.org, SpaServer) and link them to future European surveillance programs from the European Centre for Disease Prevention and Control (ECDC).
Development of such new NGS tools for studying population structure/genome evolution of a variety of pathogens by a wide user community will generate an explosion of knowledge on transmission dynamics, population structure and genome evolution. This will promote improved European disease control measures and a better understanding of virulence and resistance traits of major pathogens.
Therefore, the added value for European citizens directly resulting from the project results will be: (i) more comprehensive detection (and therefore better treatment) of relevant pathogen infection parameters at patient level; and (ii) more specific and sensitive early-warning detection of microbial outbreaks at public health level, and therefore more effective combating and prevention of pathogen epidemics (especially multi-resistant ones). Thus, PathoNGen-Trace is likely to create a significant long-lasting "added value" in the European health and bio-industrial research area.

List of Websites:

http://www.patho-ngen-trace.eu/