Array based sequencing-by-synthesis

Final Report Summary - ARRAYSBS (Array based sequencing-by-synthesis)

Improved insight into the structure and function of the human genome requires comparative whole-genome analysis. Research today is based on analysing and comparing different portions of the human genome and draft versions of the whole genome, and there is an obvious need for much more complete data sets to identify the minor variations that make us all individuals. These variations have been shown to be linked to different diseases and to the response to drug treatment. The cost of current sequencing methods will continue to severely limit the amount of data that can be produced for clarifying the genetics of human health and disease. Current methods are therefore likely to miss rare differences and will have limited ability to determine long-range information. Sequence analysis of microbial genomes, such as those of bacteria, viruses and fungi, has become a powerful tool for the identification and characterisation of these organisms. With the increasing outbreaks of infectious diseases like salmonella, tuberculosis and HIV, and the high frequency of multi-drug resistance among several of the causative organisms, a fast, cost-effective sequencing tool will help the medical community in diagnosis, containment and treatment.

Future demands for understanding, diagnosis, treatment and prevention of diseases will create a need for DNA sequencing platforms that are faster and significantly more cost-effective than the alternatives currently available. The aim of this project was to develop methods and components for a sequencing-by-synthesis approach with the potential to fulfil these needs. The core group of three biotech SMEs, located in Sweden, Estonia and Lithuania, has essential and unique competencies for developing these new DNA methods and components, but some key elements were missing and had to be found outside these companies. The steps involve the development of four dNTPs with a blocked 3'-end, the isolation and selection of a DNA polymerase that accepts them, a microfluidic device as the reaction chamber, bioinformatics, and validation of the system. The SMEs have extensive experience in DNA arrays with spotted primers for mutation detection, in the organic chemistry of modified nucleotides, and in isolating and selecting DNA-modifying enzymes. The areas where additional expertise was needed are: (i) the interaction of DNA polymerases with modified dNTPs, (ii) microfluidics, and (iii) bioinformatics, specifically related to primer selection and handling of the DNA sequence information generated. Three RTD performers from Germany, Sweden and Estonia with excellent knowledge in these three areas took part in the project. It was expected that, with the support of this project, the SME group would succeed in developing the new approach, based on an array platform with thousands or more oligonucleotide features. The major technical risk concerned the identification of a DNA polymerase that accepts the modified nucleotides, and the possibilities for improving its performance. A positive outcome of the project must be followed by commercialisation, which will involve the development of a fully functional and marketable product. For this phase the SME group may choose to involve other parties.

Studies were made on how DNA primer properties influence the results from primer extension microarrays. The analysis was done on experimental data from genotyping microarrays, which are likely to behave very similarly to the developed sequencing-by-synthesis platform. Advanced statistical methods were used to evaluate which primer properties are responsible for a strong fluorescent signal and a low failure rate. Based on these tests, statistical models could be developed for predicting the call rate and signal intensities of extension primers on microarrays. A microarray normalisation method was developed which can reduce variation between microarrays and helps in designing base-calling algorithms during the second year of the project. A web-based prototype has been created for the automatic design of re-sequencing primers with strong signal and low failure rates. After an initial software prototype had been developed, the work continued with adapting the prototype for re-sequencing. For this purpose several algorithms were tested, and it was found that the most appropriate algorithm for genotyping is the EM (expectation-maximisation) algorithm with genotype clusters constrained to equal shape and volume.
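The clustering idea behind EM-based genotype calling can be illustrated with a minimal sketch. It is not the project's software: it assumes a one-dimensional signal per sample (here a hypothetical allele-signal ratio between 0 and 1) and three genotype clusters, and it models "equal shape and volume" in one dimension as a single variance shared by all clusters. The function name, the starting means and the example values are illustrative assumptions.

```python
import math


def em_shared_variance(values, start_means, n_iter=200):
    """EM for a 1-D Gaussian mixture whose clusters share one variance.

    Sharing the variance is the 1-D analogue of constraining all
    genotype clusters to equal shape and volume (an assumption of
    this sketch, not a detail taken from the report).
    """
    n, k = len(values), len(start_means)
    means = list(start_means)
    weights = [1.0 / k] * k
    grand_mean = sum(values) / n
    var = max(sum((v - grand_mean) ** 2 for v in values) / n, 1e-6)
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each sample.
        # The Gaussian normalising constant is identical for all
        # clusters (shared variance), so it cancels and is omitted.
        resp = []
        for v in values:
            dens = [w * math.exp(-((v - m) ** 2) / (2.0 * var))
                    for w, m in zip(weights, means)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update mixing weights, cluster means, shared variance.
        sizes = [sum(r[j] for r in resp) for j in range(k)]
        weights = [s / n for s in sizes]
        means = [
            sum(r[j] * v for r, v in zip(resp, values)) / sizes[j]
            if sizes[j] > 0 else means[j]
            for j in range(k)
        ]
        var = sum(r[j] * (v - means[j]) ** 2
                  for r, v in zip(resp, values) for j in range(k)) / n
        var = max(var, 1e-6)  # guard against a collapsing variance
    # Call each sample as the cluster with the highest responsibility.
    calls = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, var, calls


# Hypothetical allele-signal ratios for nine samples.
ratios = [0.04, 0.06, 0.05, 0.48, 0.52, 0.50, 0.94, 0.96, 0.95]
means, var, calls = em_shared_variance(ratios, [0.0, 0.5, 1.0])
genotypes = [("AA", "AB", "BB")[c] for c in calls]
```

Constraining the clusters to a common spread keeps a sparsely populated genotype cluster from being fitted with an implausibly wide or narrow distribution, which is one plausible reason such a model suits genotyping data.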

The intensive collaboration among the researchers led to three complete reversible terminators, all of which are incorporated by several mutants of the DNA polymerase. The 3'-O-cyanoethyl function of these terminators is fluoride-cleavable, and the dye-linker can be removed under the same conditions. This demonstrates the required reversibility of these terminators.
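The role of reversibility in the sequencing cycle can be sketched as a toy simulation. It is an idealised illustration, not the project's protocol: it assumes error-free incorporation, a full set of four labelled terminators, and a template written 3'→5' so that the primer extends left to right. The function name and the example template are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, primer_length):
    """Read the template downstream of the primer, one base per cycle."""
    reads = []
    for position in range(primer_length, len(template)):
        # 1. Incorporation: the polymerase adds the one complementary
        #    nucleotide; its blocked 3'-end prevents further extension.
        incorporated = COMPLEMENT[template[position]]
        # 2. Imaging: the dye on the terminator identifies the base.
        reads.append(incorporated)
        # 3. Cleavage: removing the 3'-block and the dye-linker
        #    (fluoride treatment in the report) frees the 3'-end,
        #    so the loop can proceed to the next cycle.
    return "".join(reads)


read = sequence_by_synthesis("TACGGATC", primer_length=3)
```

Without the cleavage step the first incorporated terminator would end the read after a single base, which is why removal of both the 3'-block and the dye under one set of conditions matters for the approach.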