The standard commercially available systems cannot meet the time and cost goals of large Genome projects. The proposed project will demonstrate the performance of a novel device design that achieves high accuracy, improves throughput by a factor of 5 to 10, and lowers cost by the same factor compared to the standard systems. The modular design will in future allow a further increase in throughput to 1 megabase per device per day, since it is already feasible to increase the density of the array detector modules by a factor of 2 to 4. To meet the demands of the Genome projects, high throughput DNA sequencing systems are required that deliver highly accurate long reads, allowing low redundancy, together with optimised biochemistry, resulting in much lower sequencing cost. The main objectives of the proposed project are the realisation of an ARAKIS production sequencer for up to 200 samples per run and the demonstration of its unmatched performance in large scale routine applications. This demonstration project cuts across all biotechnology areas that require the use of genetic information (in particular new methods for Genome analysis, 2.1 Sequencing and 2.2 Function search).
The ARAKIS is based on the concept of the novel EMBL Doublex automated DNA sequencing prototype, which is already in routine operation (its design principle and DNA sequencing biochemistry are covered worldwide by several patent applications). The system was developed recently at EMBL with funding from the EU Biomed Programme. At present it gives the longest reported readouts (up to 1400 bases per sample), the highest throughput (up to 500 kilobases per device per day) and the highest accuracy, resulting in low redundancy (around 2). The multiplex system allows the simultaneous on-line sequencing of two to five DNA templates in a single sequencing reaction. Up to 7000 bases are thus obtained per sequencing reaction, 5 to 10 times more than with the commercially available systems. The major bottleneck of automated DNA sequencing, namely sample loading, has been solved by the porous comb technique developed recently at EMBL, which enables simultaneous loading of 200 samples. Compared to standard commercial devices, the technology demonstrated in the course of this project will reduce DNA sequencing costs by a factor of 5 to 10 and increase throughput by the same factor, with significantly higher sequencing accuracy (99.8% up to base 500 and 99% up to base 1000 per single read for double stranded templates). Other systems with such performance have not been reported and are not commercially available.
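The per-reaction figure quoted above follows from simple arithmetic, which can be sketched as below. This is an illustrative calculation only, not part of the project's software. The function names are hypothetical, and the reading of "redundancy" as the number of raw bases sequenced per finished base is an assumption made for the example.

```python
# Illustrative arithmetic only; function names are hypothetical.

MAX_READ_LENGTH = 1400         # bases per template (upper figure from the text)
RAW_BASES_PER_DAY = 500_000    # bases per device per day (upper figure from the text)
REDUNDANCY = 2                 # "around 2" in the text


def bases_per_reaction(templates: int, read_length: int) -> int:
    """Bases obtained from one reaction that carries several templates."""
    return templates * read_length


def finished_bases_per_day(raw_bases_per_day: float, redundancy: float) -> float:
    """Finished sequence per day, ASSUMING redundancy means raw bases
    sequenced per finished base (an assumption, not stated in the text)."""
    return raw_bases_per_day / redundancy


if __name__ == "__main__":
    # The text states two to five templates per sequencing reaction.
    for templates in range(2, 6):
        print(templates, "templates:", bases_per_reaction(templates, MAX_READ_LENGTH), "bases")
    print("finished bases/day:", finished_bases_per_day(RAW_BASES_PER_DAY, REDUNDANCY))
```

With five templates at the maximum read length, the product is the 7000 bases per reaction stated above; the two-template case gives 2800 bases.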
To demonstrate the power and advantages of the new sequencing system in biotechnology, it will be applied to several large scale sequencing projects, e.g. partial sequencing of microbial genomes, full length sequencing of cDNAs, genome mapping by end-sequencing of large vectors (BACs, PACs) and gene expression analysis (expression profiling by Serial Analysis of Gene Expression, SAGE). In these applications 5 to 10 times more information is obtained per gel, shortening the projects correspondingly. Of the four multidisciplinary partners, one is the technology producer (EMBL, W. Ansorge). The system will be applied to routine projects at the sequencing facility Genoscope (J. Weissenbach), Paris, which is well established in biotechnology and genome analysis, and at the University of Amsterdam (H.F. Tabak), which is experienced in gene expression analysis in yeast and human with the SAGE technique. The start-up company LION will use the system for large scale sequencing in an industrial environment.
The system demonstrated in this project will make this novel technology more attractive for industrial commercialisation and for use in service companies, and would give European industry a competitive advantage over US and Japanese companies. The partners in this project form the only European collaboration that at present competes successfully with US and Japanese industry in the development of automated DNA sequencing technology.
Keywords: Automated Doublex DNA Sequencing, high throughput sequencing, SAGE
Funding Scheme: CSC - Cost-sharing contracts