High Throughput Systematic Single Cell Genomics using Micro/Nano-Fluidic Chips for Extracting, Pre-analysing, Selecting and Preparing Sequence-ready DNA

Final Report Summary - CELL-O-MATIC (High Throughput Systematic Single Cell Genomics using Micro/Nano-Fluidic Chips for Extracting, Pre-analysing, Selecting and Preparing Sequence-ready DNA)

Executive Summary:
The Cell-O-Matic consortium has established a lab-on-a-chip-based technology for the analysis of DNA from single cells. Using the lab-on-a-chip platform, we established and optimized a fast, simple protocol for extraction and amplification of DNA from single cells resulting in state-of-the-art next generation whole genome sequencing. We also demonstrated the optical mapping of genomic DNA extracted from a single cell. Furthermore, we demonstrated that the same type of lab-on-a-chip platform can be used to separate cancer cells and white blood cells using pinched flow fractionation at efficiencies over 90% [Pødenphant et al Lab Chip 15, 2015].
Moreover, Cell-O-Matic demonstrates that complex lab-on-a-chip devices for single cell and single molecule analysis can be produced by scalable industrial processes (WP4) that can compete with existing production technologies such as silicon fabrication, used in state-of-the-art commercial instruments (BioNanoGenomics, optical mapping), or elastomer casting (Fluidigm, single cell genomics). Cell-O-Matic has also delivered a prototype instrument (WP7) that uses the single-use lab-on-a-chip device and provides epi-fluorescence and bright field microscopy, microfluidic pressure control and temperature control to allow trapping of single cells, extraction of DNA and subsequent amplification of the genetic material.
In one application (WP3), we used the lab-on-a-chip to efficiently stretch megabase lengths of genomic DNA from a mixture of chromosomes, to map AT/GC composition by denaturation-renaturation and identify structural variations by comparison to a reference [Marie et al PNAS 110.13 2013]. The approach was automated [Sørensen et al. Review of Scientific Instruments 86, 2015] and mega-base lengths of DNA extracted from single cells were analysed and mapped to the reference genome (WP6).
In our main application the lab-on-a-chip instrument was used to prepare DNA from single cells for next-generation sequencing (WP2) or single DNA mapping (WP3). The consortium successfully developed new on-chip DNA amplification protocols compatible with the very low amounts of DNA from a single cell (6 pg) (WP5). We found higher coverage when single cell sample material was produced on-chip compared to off-chip.
In total, over 100 single-cell samples were processed, with 97% producing a library that was suitable for next-generation sequencing (WP8). For 30% of these, 85% or more of the genome was covered 15x. Mapping data showed an average of 98% of reads mapped to the genome, a significant achievement compared to the literature, which reports a substantial amount of read contamination. Allelic dropout was shown to be highly dependent on the sample’s average coverage. We obtained allelic dropout values as low as 29%, with potential for further improvement by increasing average coverage. These values compare favourably to the current state-of-the-art for comparable lab-on-a-chip devices. Cloud resources were developed for NGS analytical purposes, and an efficient tiling method was created for fast reading of bulk NGS data (WP6).
A pathway modelling approach previously applied to microarray data was adapted to both bulk and single cell data for interpreting cancer expression (WP9).
At the close of the project on December 31, 2015, the consortium's efforts had resulted in 21 scientific publications on subjects ranging from instrumentation development to personalized medicine, thirty talks at conferences, workshops and symposia, four filed patents and nine instances of media exposure (WP10). Training was provided in the use of the instruments, fluidic devices and protocols developed in the project. Discussion, monitoring and planning for the exploitation of the results of Cell-O-Matic were undertaken, which facilitated integration of various efforts within the project to produce a system for single cell analysis and sequencing as well as a number of ancillary products and services; indeed, software and cloud resources are already being commercially exploited. A document that analysed the state-of-the-art and prospects for the relevant markets and mapped out a roadmap for exploitation of Cell-O-Matic results was produced.

Project Context and Objectives:
Cell-O-Matic intends to fill the gap that exists in single cell genomics between collecting a subpopulation of cells, for example a biopsy of a tumour, and characterising the genome of individual cells via sequencing. This gap exists because there is no straightforward way of extracting cells of interest out of a heterogeneous population, extracting the genomic material and preparing it for sequencing. Cell-O-Matic intends to implement, in a lab-on-a-chip device, the isolation of cells of interest and the preparation of DNA from individual cells as a readily usable sample for a sequencing instrument. Moreover, in order for the Cell-O-Matic technology to replace traditional sample preparation methods in research labs and hospitals, the lab-on-a-chip device has to be mass produced and single use, which led the Cell-O-Matic consortium to use an industrial production process, injection moulding.

Project Results:
WP2:
The main objective of Work Package 2 was to provide single-cell DNA samples of sufficient quality to enable whole genome sequencing (WGS). Using our biological cancer resources, initially the LS174T colorectal cancer (CRC) cell line, with known mutations and mRNA profiles, and injection moulded microfluidic devices designed and manufactured by DTU and NILT, we developed and optimized procedures for on-chip single cell capture and lysis, and DNA extraction and amplification. Several batches of trapping devices varying in channel dimensions and trap design were tested. TRAP8 chips were chosen as the most efficient in trapping single cells.

Extensive tests using several biological samples (colorectal cell lines, primary cell cultures and fresh tumour sample), differing in cell size and tendency to form aggregates, showed that routinely 6 to 8 cells were captured. Over 55% of trapped cells were observed as single cells. Cells were lysed to obtain only DNA. On-chip alkaline lysis of nuclei was carried out, followed by Whole Genome Amplification (WGA) using Multiple Displacement Amplification (MDA) technology. The quality of WGA products was confirmed on agarose gel after multiplex PCR targeting five positions on five chromosomes. Only samples showing at least four bands on the gel were sent for WGS. Finally, the whole procedure of obtaining sequence-ready DNA samples from the same single cell was applied to the Cell-O-Matic instrument. In total, 88 on-chip DNA.
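The band-count acceptance rule described above can be written as a simple filter. This is a minimal sketch: the sample identifiers and band counts are invented for illustration, and only the threshold (at least four of five bands) comes from the report.

```python
# Acceptance threshold stated in the report: at least four of the five
# multiplex PCR bands must be visible on the gel.
MIN_BANDS = 4

def passes_wga_qc(bands_observed: int, min_bands: int = MIN_BANDS) -> bool:
    """Return True if the WGA product shows enough PCR bands to be sent for WGS."""
    return bands_observed >= min_bands

# Hypothetical gel read-outs (sample id -> number of bands); illustrative only.
band_counts = {"cell_A": 5, "cell_B": 3, "cell_C": 4}

to_sequence = [sample for sample, n in band_counts.items() if passes_wga_qc(n)]
print(to_sequence)
```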

The second area of work for WP2 was pinch-flow sorting of cancer cells from blood cells. To achieve this we designed pinched flow fractionation (PFF) devices (Figure 2.1) [Pødenphant et al Lab Chip 15: 4598-606 (2015)].

We chose LS174T cells to model CTCs. We observed that there was no overlap in size between the red blood cells (RBCs; 6-8 μm) and LS174T cells (11-16 μm), but there was a significant overlap between the white blood cells (WBCs; 7-15 μm) and the cancer cells. We anticipated that this overlap could prevent pinched flow from separating the WBCs from cancer cells, but that it should be efficient in sorting cancer cells from a mixture with RBCs. To test this, we ran mixtures of cancer cells with either red or white blood cells, where each cell type was stained with a different fluorescent dye, through a PFF device. As expected, LS174T cells could be efficiently sorted from RBCs, with enrichment of approximately 1:10,000. In the case of a mixture with WBCs, we showed that there is a significant difference in critical diameter between WBCs and cancer cells (Figure 2.2), which is probably due to different cell deformability and/or shape of the two cell types.

This difference in critical diameter is an advantage and resulted in a recovery of 96% of cancer cells together with a removal of 93.6% of WBCs. This is a good separation that should allow for isolation of CTCs. However, the highest separation efficiencies were obtained at a sample flow rate of 10 µl/h, which is too low for applications where at least 10 mL of sample must be sorted. At higher flow rates the cancer cell recovery was unaffected but the WBC removal was decreased.
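To see why this flow rate is limiting, the processing time for a 10 mL sample can be worked out directly. The short sketch below uses only the two figures quoted above and assumes a single device running continuously.

```python
def hours_to_process(sample_volume_ml: float, flow_rate_ul_per_h: float) -> float:
    """Time needed to push a sample through one device at a constant flow rate."""
    volume_ul = sample_volume_ml * 1000.0  # 1 mL = 1000 µl
    return volume_ul / flow_rate_ul_per_h

hours = hours_to_process(sample_volume_ml=10.0, flow_rate_ul_per_h=10.0)
days = hours / 24
print(f"{hours:.0f} h, i.e. about {days:.1f} days")
```

At 10 µl/h a 10 mL sample would take on the order of a thousand hours, which is why higher flow rates or parallel devices would be needed in practice.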

WP3:
The overall goal of work package 3 was to develop methods for mapping single molecules from single cells, using an injection moulded micro-/nano-fluidic device, and connecting the map with data from single cell sequencing. A range of developments have been made according to the Work Package 3 work plan in order to achieve this goal.
• The principle of using a cross-flow nanofluidic chip [Pedersen et al. Phys Rev E (2016)] to efficiently stretch megabase lengths of genomic DNA from a mixture of chromosomes, to map AT/GC composition by denaturation-renaturation and identify structural variation by comparison to a reference was established and published [Marie et al. PNAS 110.13 4893–4898 (2013)] (Figure 3.1).
• On-chip extraction of DNA from trapped cells was developed and implemented
• Methods for on-chip amplification were developed and implemented
• A novel method for partitioning extracted DNA into lengths of around the megabase scale was developed
• Methods for tagging sequences and methylation sites along the length of genomic DNA were developed
• Automation was developed to selectively move large fragments of extracted DNA into nanofluidic channels; this work was published [Sørensen et al. Review of Scientific Instruments 86, 063702 (2015)]
• Genomic DNA was moved into nanofluidic channels via a combination of pressure- and electrical-driven flow
• A substantial fraction of the genome from a single cell was mapped according to AT/GC composition
• The map pattern from single-DNA molecules extracted from single cells was compared to reference and structural variants were identified
• The regions of the single cell genome where structural variants were identified via mapping were analysed in single cell sequencing data and the verity of the mapping data was validated.

WP4:
Cell-O-Matic has advanced the state-of-the-art of combined micro- and nano-fabrication by injection moulding. WP4 established a production-ready lab-on-a-chip platform combining single cell microfluidics and single molecule nanofluidics (Figure 4.1). In particular, the platform can combine up to 3 different fluidic levels, such as single cell microfluidics (up to 35 µm deep), low aspect ratio nanofluidics (80 nm deep, 20 µm wide) and square cross section nano-wells (below 100 nm wide), thus demonstrating the platform's capability to integrate the functionalities necessary for a lab-on-a-chip to process single cells and single molecules. Figure 4.2 shows such a nickel shim combining 3 levels and the chip replicated in polymer.

The shim is commercialised by project partner NILT at a typical selling price of 10,000 EUR. Replication in polymer is made by injection moulding in TOPAS 3015 and the final chip is made by sealing the injection moulded fluidic channels with a transparent foil lid, which is suitable for high NA imaging using oil immersion optics.

WP5:
Work package 5 focused on off-chip processing of DNA from the very low input a single cell can provide (6 pg). The work performed covered sample amplification, fragmentation and sequencing library preparation. Finally, second generation sequencing methods were used to assess the quality of the established workflow, and the performance obtained with the two sequencing technologies was compared.

Sample amplification
6 pg of genomic DNA was successfully amplified in order to obtain enough DNA material for library preparation. We have demonstrated that multiple displacement amplification protocols are compatible with on-chip amplification and can be used in this context with the buffer constraints required for on-chip amplification. Several methods were investigated to reach this conclusion, covering both PCR-based and MDA-based technologies. PCR-based technologies showed great amplification potential. However, in a lab-on-a-chip application, the required temperature cycling increases the complexity of the platform holding the chip. Moreover, PCR-based whole genome amplification proved to be biased when sequenced, with few parts of the genome being readable at sequencing. The MDA technology showed more moderate amplification power, with the additional advantage that it operates at constant temperature (isothermal amplification). This characteristic is important in our lab-on-a-chip project because it simplifies the requirements on the instrument. The whole genome amplified products were sequenced and showed a threefold superiority of MDA-amplified samples over PCR-amplified samples for genomic DNA coverage. These observations led the consortium to select MDA-based DNA amplification methods for integration in the final workflow.

DNA fragmentation
As part of the sample preparation protocol, the consortium investigated ways to efficiently fragment very small amounts of DNA while remaining compatible with a lab-on-a-chip setup. The investigations led to the successful fragmentation of genomic DNA isolated from a single cell in a microfluidic chip. We investigated different methods for DNA fragmentation, such as sonication and microfluidic shearing; however, they were either incompatible with the lab-on-a-chip design or showed low efficiency. We therefore opted for a photochemical procedure that was easily integrated into the protocols implemented in the fluidic devices.

Developing DNA library protocols for second generation sequencing
After the amplification and the fragmentation of single-cell genomes, protocols for second generation sequencing library preparation were developed. Different solutions were benchmarked based on their performance with low amounts of starting material, compatibility with the envisioned workflow and the capacity to generate close to unbiased libraries. Technologies based on enzymatic methods to break the DNA, such as “fragmentase”, were not considered as they are known to induce bias in the library. Transposon-like technology was investigated as it limits the damage done to DNA and can be applied to small amounts of starting gDNA. Our results show that second generation sequencing libraries can be produced off-chip from input samples as low as 1 ng of human whole genome amplified DNA. The quality of the generated library was good enough to perform sequencing. For most of the sequenced samples the mapping results show good alignment of the reads on the human genome and a good quality of the aligned reads. However, for these off-chip single cell DNA extractions the coverage results show a high proportion of non-covered bases, which reaffirms the need for on-chip extraction and amplification when the starting material is from a single cell.

Test of second generation sequencing
To test the second generation sequencing of DNA samples, the consortium used an oncogene panel to perform locus-specific DNA sequencing from single-cell material. This task also used single-cell DNA material to perform an initial test of whole genome sequencing using the Illumina technology. The goal of these activities was to obtain libraries of sufficient quality to perform sequencing and data analysis. The consortium worked together to achieve this goal and demonstrated reliable library preparation from single-cell amplified DNA extracts. The sequencing showed moderate quality of data obtained from off-chip single-cell experiments, while on-chip single-cell experiments performed in WP8 showed a promising improvement in data quality. In view of these results, the partners of the Cell-O-Matic consortium decided not to perform deeper sequencing from off-chip experiments, as initially planned in the description of work, and to focus efforts on on-chip experiments.

Cross-platform compatibility
The consortium compared Thermo Fisher Scientific Ion Torrent sequencing datasets from single cell off-chip and on-chip prepared genomic DNA samples, in order to assess the performance of an alternative sequencing technology with the workflow determined for single cell genome analysis. All the biological material originated from the LS174T colorectal cell line. The Cell-O-Matic consortium analysed the cross-platform compatibility of the developed microfluidic chip-based DNA isolation instrument and the developed protocols. Ion Torrent Cancer Hotspot Panel v2 sequencing datasets from single cell DNA prepared on-chip and off-chip versus bulk DNA, all from the LS174T cell line, were analysed and compared. This provided the opportunity to investigate the performance of the DNA isolation platform using another type of sample processing and sequencing technology. The outcome of this analysis shows equivocal technical feasibility of using this technology with single cell on-chip isolation, as the results show lower coverage of the gene panel and a higher unspecific read ratio than for off-chip extracted DNA. Nevertheless, the very low number of on-chip datasets available allows only an ambiguous interpretation of performance, and further single cell on-chip DNA sequencing datasets need to be analysed to strengthen our conclusions.

WP6:
A software/ database suite BC|Genome has been commercialized by Cell-O-Matic partner BCP and was used as a platform for software development relevant to the project.

Single DNA molecule denaturation-renaturation (D-R) mapping
Cell-O-Matic has established a nanofluidic device to stretch and image Mb-long genomic DNA molecules (see WP3). By using denaturation-renaturation patterns representing AT/GC composition, single DNA molecules can be mapped to a reference whole human genome and structural variation detected (WP3). WP6 has developed a software platform to process the experimental data. Construction of theoretical denaturation-renaturation (D-R) maps was done with the software bubblyhelix (http://www.bubblyhelix.org). An algorithm was developed for extracting experimental D-R maps from movies of DNA molecules stretched in the nanofluidic device. The algorithm takes a number of overlapping fields-of-view and localizes the DNA on the image.
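As a rough illustration of what a theoretical map encodes, the sketch below computes the GC fraction in consecutive windows of a sequence. This is only a crude proxy and is an assumption made for illustration: bubblyhelix computes melting maps with its own model of helix stability, which this code does not reproduce. The function name and toy sequence are hypothetical.

```python
from typing import List

def gc_profile(sequence: str, window: int) -> List[float]:
    """GC fraction in consecutive, non-overlapping windows of the sequence.

    AT-rich windows (low values) melt first and would appear dark in a
    D-R barcode; GC-rich windows (high values) stay double-stranded.
    A trailing partial window is ignored.
    """
    seq = sequence.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(gc / window)
    return profile

# Toy sequence: an AT-rich block, a GC-rich block, then a mixed block.
toy = "ATATATATGCGCGCGCATGCATGC"
print(gc_profile(toy, window=8))
```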

The developed software includes various controls, which allow the user to judge whether the DNA is localized correctly in the image and whether the stitching is done properly. The experimental D-R map is localized in the reference genome by either a) considering a single barcode (e.g. 125 kilo-base pairs) and calculating the best match, or b) using the Sliding Window Method. For the latter, the whole DNA molecule in the nanoslit (~1.3 mega base-pairs) is used for the localization of the DNA in a reference genome; this method has proven to be more robust than method a).
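The idea of localizing an experimental barcode in a reference map can be sketched as a sliding comparison. The report does not state the scoring function used, so the Pearson correlation below is an assumption, and the function names and toy profiles are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple

def _zscore(values: Sequence[float]) -> List[float]:
    """Centre and scale a profile so that intensity offset and gain cancel out."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]

def locate_barcode(barcode: Sequence[float],
                   reference: Sequence[float]) -> Tuple[int, float]:
    """Slide the barcode along the reference map.

    Returns (best offset, Pearson correlation at that offset).
    """
    m = len(barcode)
    zb = _zscore(barcode)
    best_offset, best_score = -1, float("-inf")
    for offset in range(len(reference) - m + 1):
        zw = _zscore(reference[offset:offset + m])
        score = sum(a * b for a, b in zip(zb, zw)) / m
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score

# Toy theoretical map and an "experimental" barcode: a copy of
# reference[3:7] with a constant intensity offset added.
reference = [0.2, 0.4, 0.9, 0.1, 0.5, 0.8, 0.3, 0.7, 0.6, 0.2]
barcode = [0.15, 0.55, 0.85, 0.35]
offset, score = locate_barcode(barcode, reference)
print(offset, round(score, 3))
```

A longer barcode contains more features to match, which is consistent with the observation above that using the whole molecule is more robust than using a single short barcode.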

The algorithms for analysing D-R maps were developed and tested. The algorithms were integrated into the BC|Genome software suite commercialized by Cell-O-Matic partner BCP. Within BC|Genome, the barcode analysis workflow consists of the following steps:
1. Uploading sets of DNA denaturation barcode images to the BC|Genome database
2. Creating DNA barcodes from selected sets of uploaded images, using algorithm developed in work package 6.2
3. Creating databases of theoretical melting maps from reference genome sequences, using the publicly available program Bubblyhelix (http://www.bubblyhelix.org/)
4. Graphical User Interface (GUI) for localizing a DNA barcode within a selected reference sequence, using algorithm developed in work package 6.2.
All of these functionalities are made available through GUI implemented within the BC|Genome software platform. Barcode localization and reference database generation are parallelized to be run as multiple processes to enable faster run times.

In BC|Genome, all data and results (here barcode images, barcodes computed from the images, theoretical melting maps and barcode localization results) are stored into a database on the BC|Genome server. Data stored to the database is analysed in runs. A run is started from GUI that enables the selection of the input data (barcode images, barcodes, reference sequence, melting map database), and required parameters. Based on the data and the parameters, the system then splits each run into multiple background jobs to be executed in parallel in the server (and possibly any other available calculation servers). The user can monitor the progress of jobs and access their results from a centralized place.
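The run-and-jobs pattern described above can be sketched with Python's standard library. This is not BC|Genome code: the worker function is a placeholder, the splitting rule is an assumption, and threads are used here only to keep the example portable (CPU-bound work would normally use separate processes or servers).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Job = Tuple[int, List[int]]

def localize_chunk(job: Job) -> Tuple[int, int]:
    """Placeholder for one background job; here it simply sums its slice."""
    job_id, payload = job
    return job_id, sum(payload)

def run_in_parallel(items: List[int], n_jobs: int,
                    worker: Callable[[Job], Tuple[int, int]],
                    max_workers: int = 4) -> Dict[int, int]:
    """Split one run into n_jobs slices and process them concurrently."""
    size = -(-len(items) // n_jobs)  # ceiling division
    jobs = [(i, items[i * size:(i + 1) * size]) for i in range(n_jobs)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Results are keyed by job id so they can be collected in one place.
        return dict(pool.map(worker, jobs))

results = run_in_parallel(list(range(10)), n_jobs=3, worker=localize_chunk)
print(results)
```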

NGS data analysis
BWA/GATK is a pipeline for aligning short read NGS sequence data against a reference genome, and for calling genotypes from the aligned data. The software of Cell-O-Matic partner BCP implements interfaces for carrying out these analyses in three steps: BWA alignment, BAM file post-processing using GATK and Picard, and variant calling using GATK.
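The three stages can be illustrated by assembling typical command lines. This sketch is an assumption in several respects: it uses present-day GATK4-style invocations rather than whatever versions were used in the project, it adds a samtools sorting step that the report does not mention, and it omits indexing and recalibration. The file names are hypothetical and nothing is executed.

```python
from typing import Dict, List

def build_pipeline(sample: str, ref: str, fq1: str, fq2: str) -> Dict[str, List[List[str]]]:
    """Return command lines for the three stages; the caller decides how to run them."""
    # Read group passed to bwa mem; the "\t" must reach bwa as a literal backslash-t.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sam = f"{sample}.sam"            # bwa mem writes SAM to stdout; redirect it here
    bam = f"{sample}.sorted.bam"
    dedup = f"{sample}.dedup.bam"
    vcf = f"{sample}.vcf.gz"
    return {
        "align": [["bwa", "mem", "-R", read_group, ref, fq1, fq2]],
        "postprocess": [
            ["samtools", "sort", "-o", bam, sam],
            ["gatk", "MarkDuplicates", "-I", bam, "-O", dedup,
             "-M", f"{sample}.metrics.txt"],
        ],
        "call": [["gatk", "HaplotypeCaller", "-R", ref, "-I", dedup, "-O", vcf]],
    }

pipeline = build_pipeline("cell01", "ref.fa", "cell01_R1.fq.gz", "cell01_R2.fq.gz")
for stage, commands in pipeline.items():
    for cmd in commands:
        print(stage, ":", " ".join(cmd))
```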

The BC Platforms analysis and data management system needs to deal with large amounts of NGS data, both in raw (FASTQ) and in alignment and variation formats (BAM and VCF). For an NGS lab-on-a-chip concept to be feasible, a very fast NGS analysis pipeline is required for receiving reliable results about variations in individual samples. In Cell-O-Matic the partners have constructed prototype modules for various stages of such an NGS lab-on-a-chip for single cells. In other deliverables of this project the partners have concentrated on the physical chip construct, fluidics control and cell capture, and extraction of NGS-quality single cell DNA. As the prototype aims for a fast analysis of multiple single cells in a blood or suspension sample, it is important to be able to perform the usually time-consuming computer analysis as quickly as possible.

An NGS data management construct called 'tiling' was created, which allows very rapid read-access to bulk NGS data, potentially including hundreds or thousands of individuals with millions of variation points per individual. In order to analyse extracted variants in NGS data, fast parallel support for various tertiary analysis tools was implemented. These tools can be run in parallel on multiple calculation nodes to make variant analysis faster and to allow multiple samples to be analysed simultaneously.
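The report does not describe how 'tiling' is implemented. One generic reading, shown below purely as an assumption, is to bin variant records by genomic position so that a region query only touches the tiles it overlaps. The class name, tile size and records are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[int, str, str]  # (position, sample id, genotype)

class TiledVariantStore:
    """Variant records binned into fixed-size genomic tiles (illustrative only)."""

    def __init__(self, tile_size: int = 1_000_000) -> None:
        self.tile_size = tile_size
        self._tiles: Dict[Tuple[str, int], List[Record]] = defaultdict(list)

    def add(self, chrom: str, pos: int, sample: str, genotype: str) -> None:
        self._tiles[(chrom, pos // self.tile_size)].append((pos, sample, genotype))

    def query(self, chrom: str, start: int, end: int) -> List[Record]:
        """Records with start <= pos <= end, reading only the overlapping tiles."""
        hits: List[Record] = []
        for tile in range(start // self.tile_size, end // self.tile_size + 1):
            for rec in self._tiles.get((chrom, tile), []):
                if start <= rec[0] <= end:
                    hits.append(rec)
        return sorted(hits)

store = TiledVariantStore()
store.add("chr1", 150_000, "cell01", "0/1")
store.add("chr1", 2_400_000, "cell01", "1/1")
store.add("chr1", 2_450_000, "cell02", "0/1")
store.add("chr2", 2_400_000, "cell01", "0/1")
print(store.query("chr1", 2_000_000, 2_999_999))
```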

Sample data transfer
A metadata file structure that was generated automatically from the preliminary version of the chip-analysis controlling software was drafted. The content of the data file consisted of cell sample details provided by the user, and details about the user choices made during the imaging phase of cell capture. The sample management metadata and file structures were assumed to concern single-cell analysis only. Although one sample would consist of multiple cells, and the workflow includes steps for separating and identifying single cells within one sample, the actual end workflow deals with one single cell at a time.

The metadata structure had to accommodate the decision points in the on-chip sample workflow. The decisions are based on visual inspection of images from the traps. Each decision point introduced either a splitting event for the sample or an elimination event for obsolete samples. These decision points create a sample ancestry for the final analysis, and this ancestry is recorded in the BC|SAMPLE software.

The BC|SAMPLE sample management module was extended to receive and handle sample processing data from multiple split events caused by the user's decisions to either accept or reject cell traps on the chip. These structures are called Process tables, and they store sample-processing data in chronological order. The accumulating data can thus be viewed on a timescale, and special views can be created to combine the original sample data with the latest recorded sample-processing data in these tables.
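A chronological process table with sample ancestry can be sketched as follows. This is not the BC|SAMPLE data model: the record fields, event names and sample identifiers are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ProcessRecord:
    step: int                 # position in chronological order
    sample_id: str
    parent_id: Optional[str]  # None for the original sample
    event: str                # e.g. "received", "split", "eliminated"
    note: str = ""

class ProcessTable:
    """Append-only log of sample-processing events (illustrative only)."""

    def __init__(self) -> None:
        self.records: List[ProcessRecord] = []

    def log(self, sample_id: str, parent_id: Optional[str],
            event: str, note: str = "") -> None:
        self.records.append(
            ProcessRecord(len(self.records), sample_id, parent_id, event, note))

    def ancestry(self, sample_id: str) -> List[str]:
        """Follow parent links from a sample back to the original sample."""
        parents = {r.sample_id: r.parent_id for r in self.records}
        chain = [sample_id]
        while parents.get(chain[-1]) is not None:
            chain.append(parents[chain[-1]])
        return chain

table = ProcessTable()
table.log("S1", None, "received", "cell suspension loaded on chip")
table.log("S1-trap3", "S1", "split", "single cell accepted by user")
table.log("S1-trap5", "S1", "eliminated", "aggregate rejected by user")
print(table.ancestry("S1-trap3"))
```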

The BC|SAMPLE sample processing API was developed based on a simple object library that defines the structure of most of the processing system. Specific implementations exist for simple tasks like sample dilutions, aliquots, and replacing a sample type with another according to an arbitrary processing instruction.

Prediction algorithms and cloud computing
By studying the literature and some published disease prediction models (including BOADICEA, IMI-SUMMIT and models by 23andMe.com), it became clear that very few models can achieve proper disease prediction using only sequence data. By adding other data, including clinical, biomarker, -omics and genealogy data, to these models, significantly better prediction results can be achieved. Models based on sequence data alone work for some monogenic disease diagnostics, but not for preventive medicine. On the other hand, many models using rich input data can already provide clinical value.

To facilitate integration of various prediction models into our platform, we implemented 1) support for various data integration tools, 2) an application programming interface (API) to access all data required by the model, and 3) the ability to utilise cloud resources for analytical capacity.

We conducted functionality and performance tests comparing Amazon AWS (EC2 Ireland) and IBM SmartCloud (Germany and USA). We also made measurements using an encrypted file system, which is required for storing health data in some countries. Our biggest concerns about using the cloud for running our software were the performance of the file system and bandwidth issues with bulk data like NGS.

We developed command line tools for automatically launching new calculation node instances based on preinstalled AMIs (virtual server images in Amazon EC2). These AMIs contain the required data analysis tools preinstalled.

WP7:
The outcome of the work performed in WP7 is two duplicate Cell-O-Matic prototype instruments, which have proven to be well suited to control and monitor on-chip single cell processes: cell trapping and lysis, DNA extraction and DNA amplification.

The instrument also forms a suitable platform for developing a further instrument for single molecule analysis. The instrument has been designed around the microfluidic platform and the on-chip sample processes that have been developed during the project.

Instrument design and development
The work in WP7 has been broken down into design and development of individual instrument modules which, together with an instrument platform, form the complete instrument. An ultrasound module was included in the development as one of several possible methods to disrupt cell walls in order to extract DNA, aid disentanglement and split the DNA, but it was not implemented in the final instrument.

A Peltier-based temperature control module, which is able to control the chip temperature during the entire sample process, has been developed. The design includes a chip adaptor and a tube connector and has been refined through several steps, in parallel with the progress in development of the chip and on-chip sample processes and the requirements derived from them. The design is shown in Figure 7.2.

The temperature module has been optimized and calibrated against the well temperatures of the chip to ensure accurate control of the sample process temperatures (Figure 7.3). With the final instrument it has been demonstrated that the design supports running the selected amplification process.

Fluorescence imaging module
A low-cost epi-fluorescence and bright field imaging module for working with single cells and large amounts of DNA has been developed. This module focuses on single cell applications but is also suitable for single molecule mapping applications, provided the camera is replaced by an EMCCD camera and a higher-power microscope objective is used.
The development of the fluorescence imaging module has included building a complete fluorescence microscope from an instrument platform, including illumination control, scanning the sample in the x, y and z directions, etc. The design is shown in Figure 7.4.

Fluid manipulation system
An air-based flow control module, including both hardware and software, has been developed by Fluigent. An industrial version (OEM) of the pressure regulator (Figure 7.5) has been developed including a Software Development Kit. A script editor has been developed to be able to run experiments in an automated way. The basic software functionalities for pressure control that can be used for the automation have been developed. The scripting tool thus enables the simultaneous control of all flow control and fluid handling tools.

The pressure control module (Fluigent, Figure 7.5) is externally connected to the Cell-O-Matic instrument, with the user control integrated into the GUI.

Ultrasound module
Sonication has been suggested as one of several possible solutions for disruption of cells, aiding DNA disentanglement and splitting DNA into smaller pieces as preparation for DNA sequencing.
An ultrasonic module has been designed with the aim of shearing DNA in low volumes and meeting the main requirement of targeting ultrasound efficiently onto specific areas of the chip without affecting other parts of the chip.
The module developed comprises a piezoelectric element and an ultrasound transducer having a high electro-acoustical efficiency with low heat generation. The transducer is tuneable and can operate within a frequency range of 56-60 kHz, thus enabling the level of energy to be adapted to the process it is to be used for.

The beam tip has a tailored design that ensures optimal connection to the chip. The small dimensions (the beam tip is less than 1 mm) combined with the low heat formation make it possible to target the ultrasound onto specific areas of the chip without affecting other areas of the chip, which was one of the main issues to be solved. The ultrasound module is shown in Figure 7.6.

Tests of genomic DNA fragmentation have been performed. By varying frequency and power, the possibility of fragmenting DNA into small (<1 kb) fragments has been tested. It has been shown that fragmentation of DNA was induced at the frequency where the ultrasonic module delivers its maximum power, but also at higher frequency/lower power. Some results are shown in Figure 7.6.

However, it has been shown that fragmenting DNA using ultrasonication on the polymer platform was very challenging due to acoustophoresis. Therefore, one of the alternative methods proposed in the DoW, based on a photochemical process, was applied to achieve fragmentation.

Control Software
A stand-alone software application has been developed for image control and control of on-chip sample processes. Both a live view and a workflow view have been made part of the software, which enables the user to control and monitor on-chip processes independently or in a more controlled manner by using a step-by-step process reflecting the specified sample process. An example of the user interface can be seen in Figure 7.7.

WP8:
WP8 was divided into 4 tasks: initial run of chip generated next-generation sequencing libraries, bioinformatics analysis to compare with traditional methods, fine-tuning of chip-based sample preparation system, and final run of chip-generated libraries in next-generation sequencing platform.

The first task “initial run of chip generated NGS libraries” aimed at characterization of the first single-cell data using basic bioinformatics analyses. In this task we performed on-chip single-cell capture, DNA extraction, and whole genome amplification. This material was then sequenced using next-generation technology (Illumina). Out of 8 samples received, libraries could be obtained from 4 samples. The prepared libraries were then sequenced using one lane of HiSeq 2x100bp paired-end sequencing for each sample. Bioinformatics analyses showed a significant improvement of data quality for on-chip single-cell DNA samples compared to off-chip single-cell DNA samples isolated and prepared using conventional techniques, with as much as 87% of genome covered.

For the next task, “bioinformatics analysis to compare with traditional methods”, we compared our on-chip single cell sequencing results to those reported in the literature. We achieved better coverage of the single cell genomes than that reported by Xie and co-workers (C. Zong, S. Lu, A.R. Chapman, X.S. Xie: Genome-wide detection of single-nucleotide and copy-number variations of a single human cell, Science 338 (2012) 1622-1626), with typically only 9-15 % non-covered bases for on-chip MDA amplification with our Cell-O-Matic device. Fluidigm reports similar results, e.g. 10 % of non-covered bases, but needs an average coverage of 23-27x to reach this level, whereas we achieve it at around 10x (Figure 8.1).

In addition, methods have been developed to assess the allelic dropout in our on-chip single-cell sequencing data. For this we selected over 1.5 million high-confidence heterozygous variants identified from our bulk reference. We found that the allelic dropout rate depended strongly on the average coverage of the sample, and obtained rates as low as 24%.
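
The allelic dropout assessment described above can be illustrated with a minimal sketch. It assumes that allelic dropout is counted at known heterozygous sites (taken from the bulk reference) that are covered in the single-cell data but show reads for only one of the two alleles. The names `SiteCounts`, `allelic_dropout_rate`, `min_depth` and `min_allele_reads` are hypothetical and are not taken from the project's pipeline, whose exact definition and thresholds are not given in this report.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Read counts at one heterozygous site known from the bulk reference."""
    ref_reads: int  # single-cell reads supporting the reference allele
    alt_reads: int  # single-cell reads supporting the alternative allele


def allelic_dropout_rate(sites, min_depth=1, min_allele_reads=1):
    """Fraction of covered heterozygous sites where only one allele is seen.

    Assumption: sites with fewer than `min_depth` reads are treated as not
    covered and are excluded, so they do not count as allelic dropout.
    """
    covered = 0
    dropped = 0
    for site in sites:
        depth = site.ref_reads + site.alt_reads
        if depth < min_depth:
            continue  # site not covered in the single cell; skip it
        covered += 1
        # One allele is missing if either count is below the threshold.
        if site.ref_reads < min_allele_reads or site.alt_reads < min_allele_reads:
            dropped += 1
    if covered == 0:
        raise ValueError("no covered heterozygous sites to evaluate")
    return dropped / covered


# Toy example: five heterozygous sites, one of them without any reads.
example_sites = [
    SiteCounts(5, 4),   # both alleles seen
    SiteCounts(10, 0),  # alternative allele dropped
    SiteCounts(0, 0),   # not covered, excluded
    SiteCounts(0, 7),   # reference allele dropped
    SiteCounts(3, 3),   # both alleles seen
]
print(allelic_dropout_rate(example_sites))
```

Because uncovered sites are excluded from the denominator, a rate computed this way is expected to fall as the average coverage increases, which is consistent with the coverage dependence noted above.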

The task “fine-tuning of chip-based sample preparation system” aimed at improving the ease of use and the performance of the Cell-O-Matic system. For this task, several improvements to chip design, fabrication, instrument controls and software were made and then assessed in the final task. In addition, a new library preparation method, “Illumina TruSeq Nano”, was tested. Initially, the kit selected for library preparation was the “Illumina Nextera DNA” kit. This library preparation allowed us to reduce the necessary starting DNA material to only 10 ng. However, it requires enzymatic shearing of the DNA, a process that is thought to induce sequencing bias in the generated data. The “Illumina TruSeq Nano” kit requires 100-200 ng of starting material but is supposed to reduce this sequencing bias because it uses mechanical rather than enzymatic shearing of the DNA. We prepared samples in parallel using the two library preparation strategies and found no significant differences in sequencing data quality between them.

The final task of WP8 consisted of processing single cells using the Cell-O-Matic device after it had been optimized at the different levels described above. Fasteris received a total of 76 single-cell samples, all isolated, lysed and amplified using the Cell-O-Matic instrument. DNA amplified from single cells from cell lines (LS174T, LS180) as well as from primary cell cultures was used as starting material. Library preparation was successful for 97% of the samples received, showing a significant improvement of the DNA extraction and amplification procedure.

All sequenced libraries were then mapped against the human genome GRCh37. The mapping data show an average fraction of mapped reads of 98%, indicating that the samples are not contaminated from non-human sources. This is a significant achievement, as several cases of reagent-induced contamination are reported in the literature.

In terms of coverage of the human genome, we found that libraries could be separated into 3 groups: good coverage (9-15 % of non-covered bases), moderate coverage (15-90 % of non-covered bases), and bad coverage (90-100 % of non-covered bases). Bad representation of the human genome is thought to be caused by the loss of genetic information during the DNA extraction process in the Cell-O-Matic instrument. This is supported by the observation that increasing the sequencing power for a subset of samples did not result in significant coverage improvements. However, about 30% of the samples sequenced had fewer than 15% non-covered bases, indicating that the on-chip process met the best standards and that most of the genetic material was recovered following whole genome amplification. These results also show an improvement over other lab-on-a-chip devices, with similar performance reached at lower sequencing power.
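
The grouping of libraries by coverage can be written down as a short sketch. The thresholds of 15 % and 90 % of non-covered bases come from the text above. How the boundary values themselves are assigned (here 15 % counts as good and 90 % as bad) is an assumption, and the function names `non_covered_percent` and `coverage_group` are hypothetical.

```python
def non_covered_percent(covered_bases, genome_length):
    """Percentage of reference bases without any mapped read."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    if not 0 <= covered_bases <= genome_length:
        raise ValueError("covered_bases must lie between 0 and genome_length")
    return 100.0 * (genome_length - covered_bases) / genome_length


def coverage_group(non_covered_pct):
    """Assign a library to the good, moderate or bad coverage group.

    Assumption: the boundary values 15 % and 90 % are assigned to the
    good and bad groups respectively; the report does not specify this.
    """
    if not 0.0 <= non_covered_pct <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    if non_covered_pct <= 15.0:
        return "good"
    if non_covered_pct < 90.0:
        return "moderate"
    return "bad"


# Toy example with a 1000-base reference of which 870 bases are covered.
pct = non_covered_percent(covered_bases=870, genome_length=1000)
print(pct, coverage_group(pct))
```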

Finally, we analysed the allelic dropout. We were able to obtain an allelic dropout rate as low as 30% at an average coverage of 15-20x. The allelic dropout is also expected to decrease significantly when the sequencing power for these samples is increased.

WP9:
Thus far the multitude of mutations (SNPs and indels) found in DNA sequencing of tumours has made it very hard to distinguish between relevant mutations and mere passenger mutations. This creates an enormous barrier to using sequencing in practice. To aid Medical Doctors (MDs) in clinical practice through clinical decision support (CDS), Philips Research has developed a new approach to interpret sequencing data.

The approach adopted by Philips is to determine which of the oncogenic pathways are activated and involved in the cancerous growth, based on which target genes are expressed rather than on which potential SNPs/indels are present in the DNA of the cell. A Bayesian computational model using expression levels of target genes to determine pathway activation was first constructed for the Wnt pathway (see Figure 9.1), building on the world-renowned expertise of the Clevers group (Hubrecht Institute, Utrecht) (W. Verhaegh et al. Cancer Res 74 (2014) 2936). We now have Bayesian models to assess pathway activity for 7 out of the approx. 10 oncogenic pathways. For 6 of these, the pathway models have been ported (calibrated) to work from RNA sequencing data.
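
The idea of inferring pathway activity from target gene expression can be illustrated with a deliberately simplified sketch. It is not the Philips model: it assumes a naive Bayes structure in which each target gene is independently observed as up-regulated or not, and all probabilities (`p_up_active`, `p_up_inactive`, `prior_active`) are hypothetical placeholders rather than calibrated values.

```python
import math


def pathway_posterior(gene_up, p_up_active=0.8, p_up_inactive=0.2,
                      prior_active=0.5):
    """Posterior probability that a pathway is active.

    gene_up: iterable of booleans, True if a target gene is up-regulated.
    Assumption: genes are conditionally independent given pathway state.
    """
    # Accumulate log-probabilities to avoid underflow with many genes.
    log_active = math.log(prior_active)
    log_inactive = math.log(1.0 - prior_active)
    for up in gene_up:
        log_active += math.log(p_up_active if up else 1.0 - p_up_active)
        log_inactive += math.log(p_up_inactive if up else 1.0 - p_up_inactive)
    # Normalise the two hypotheses in a numerically stable way.
    top = max(log_active, log_inactive)
    w_active = math.exp(log_active - top)
    w_inactive = math.exp(log_inactive - top)
    return w_active / (w_active + w_inactive)


# Toy example: four of five hypothetical target genes are up-regulated.
print(pathway_posterior([True, True, True, True, False]))
```

With no observations the function returns the prior, and each additional up-regulated target gene shifts the posterior towards an active pathway. A published model would also need to account for measurement noise and for dependencies between genes, which this sketch ignores.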

Very interesting results were obtained when using the same Bayesian approach for the Estrogen Receptor (ER) pathway and applying this model to a series of breast cancer samples judged to be ER positive based on receptor protein presence. The Bayesian pathway model predicted that, despite the presence of the receptor, the ER pathway was not activated in all cases; see Figure 9.2. Significantly, this led to a reclassification of some of these patients and to a separation into two groups with a significant difference in the 5-year survival rates (W. Verhaegh et al. Cancer Res 74 (2014) 2936). This indicates that these patients, when classified on pathway activity (based on expression data), should have been given a different therapy. This result could thus have direct clinical relevance.

The first benefit is that MDs will have a tool to assess from sequencing which pathway is relevant, to interpret sequencing data, and thus to avoid giving the wrong drugs. When deciding which specific drug to give, these data should be complemented by DNA sequencing data to determine at which point in the pathway the disruption occurs, so that a suitable targeted drug can be given to the patient. Furthermore, data from single cells processed in the Cell-O-Matic polymer devices operated on the Cell-O-Matic prototype are being evaluated.

WP10:
This work package focused on dissemination and exploitation issues for the Cell-O-Matic. See more below.

Potential Impact:
WP2:
The Cell-O-Matic project will have a clear beneficial impact in several areas, including basic research, the nation’s health and the commercial private sector. The proposed microfluidic method for sample preparation from single cells is simple and fast, which might be of significance in future genome-based healthcare applications, most obviously in the cancer field. Single cell genomic analysis at a manageable cost to detect tumour heterogeneity and for genetic characterisation will be an important aid to diagnosis and prediction of drug response, and subsequently to improving targeted cancer treatments. Combining pinched flow fractionation devices with trapping chips may lead to the development of a novel technology: single cell genomic analysis of circulating tumour cells, for which there is a clear need in the clinical management of cancer patient treatments. Furthermore, analysis of the cellular composition of the sputum from asthma patients may help to identify patients who do not respond well to steroid treatment and enable other treatments to be offered. Similarly, analysis of the response to immune therapies and vaccination at the single cell level will help in understanding the nature of the immune responses and so, for example, help to counteract the inhibitions that can limit therapeutic success. Another example, related to the study of cellular differentiation, would be the analysis of single cells during the differentiation of induced pluripotent stem (iPS) cells. These techniques are being widely studied in the search for novel cell and tissue therapies.

Main dissemination activities:
• Pødenphant M, Ashley N, Koprowska K, Mir KU, Zalkovskij M, Bilenberg B, Bodmer W, Kristensen A, Marie R. Separation of cancer cells from white blood cells by pinched flow fractionation. Lab Chip. 2015; 15: 4598-606
• Ashley N, Jones M, Ouaret D, Wilding J, Bodmer WF. Rapidly derived colorectal cancer cultures recapitulate parental cancer characteristics and enable personalized therapeutic assays. J Pathol. 2014; 234: 34-45
• Mouradov D, Sloggett C, Jorissen RN, Love CG, Li S, Burgess AW, Arango D, Strausberg RL, Buchanan D, Wormald S, O'Connor L, Wilding JL, Bicknell D, Tomlinson IP, Bodmer WF, Mariadason JM, Sieber OM. Colorectal cancer cell lines are representative models of the main molecular subtypes of primary cancer. Cancer Res. 2014; 74: 3238-47
• Ashley N, Yeung TM, Bodmer WF. Stem cell differentiation and lumen formation in colorectal cancer cell lines and primary tumors. Cancer Res. 2013; 73: 5798-809
• Bodmer WF. Series of talks on ‘Genetics and biology of colorectal cancer: drug responses, cell lines, stem cells and differentiation’ in the UK, USA, China, Greece, 2014-2015
• Ashley N, Jones M, Bodmer W. Polarized expression of the multidrug efflux protein P-glycoprotein is inversed in primary colorectal spheroid cultures. EMBO Workshop: Unravelling Biological Secrets by Single-Cell Expression Profiling, 25-26 September 2014, Heidelberg, Germany
• Pødenphant M, Ashley N, Bodmer W, Beckett J, Mir K, Marie R, Kristensen A. Injection moulded pinched flow fractionation device for cell separation. 39th International Conference on Micro and Nano Engineering, 16-19 September 2013, London, UK

WP3:
The potential impact of the single DNA molecule mapping work is in genomics in general and in single cell genomics in particular. The work will have particular relevance to cancer, because tumour tissue is known to be heterogeneous in cellular genomic composition. The methods developed in this work package will enable the identification of the different structural variants present in such tissue, as well as providing long-range haplotype information.

Providing single cell genomic analysis at a manageable cost to detect haplotype resolved genome structural heterogeneity within tumours may become an important aid to managing cancer in terms of diagnosis, staging, prognosis, and personalized treatments. Machine vision-based automation of the handling of single megabase-scale genomic molecules in microfluidic devices has the potential to impact synthetic biology as well as DNA mapping and sequencing.

The socioeconomic impact of the work will be in the development and sale of products, which will employ technical and business personnel and will bring revenue to the companies selling the products and the universities licensing IP. The wider societal impact will be in healthcare, where the advancement of highly sensitive and highly informative assays based on single molecule analysis will impact a variety of diseases, particularly cancer.

Some of the results of the work package have already been disseminated by publication and by talks at conferences and symposia, including at the mapDNA workshop organised by Cell-O-Matic. The final results of the work package will be disseminated via publications that are currently in preparation, and through talks, lectures and media interviews.

The results of the work package will be exploited by the academic and commercial partners within the consortium, who will use them as a basis for developing a platform to carry out highly effective studies on structural variation and its impact on phenotype. The results will also be exploited in the development of single cell genomics and DNA mapping products which, once commercially released, will be used by the research community.

Main academic dissemination activities:
• MapDNA workshop in Denmark, October 2015.
• Marie, R. et al. “Integrated View of Genome Structure and Sequence of a Single DNA Molecule in a Nanofluidic Device.” PNAS 110.13 (2013): 4893–4898
• Pedersen JN, Marie R, Kristensen A, Flyvbjerg H, Physical Review E, accepted (2016)
• Sørensen, KT et al. “Automation of a Single-DNA Molecule Stretching Device.” Review of Scientific Instruments 86.6 (2015): 063702
• Pedersen, JN. et al. “Thermophoretic Forces on DNA Measured with a Single-Molecule Spring Balance.” Physical Review Letters 113.26 (2014): 268301
• Rodolphe Marie, series of invited talks on Single molecule D-R mapping in the cross-flow nanofluidic device.
• Jonas N. Pedersen, R. Marie, D. L.V. Bauer, K.H. Rasmussen, M. Yusuf, E.V. Volpi, A. Kristensen, K.U. Mir, and H. Flyvbjerg, ‘Fully stretched single DNA molecules in a nanofluidic chip show large scale structural variation’, (poster presentation) Biophysical Society Annual Meeting 2013 (Philadelphia, PA, US)
• Jonas N. Pedersen, Rodolphe Marie, A. Kristensen, and H. Flyvbjerg, ‘How to determine local stretching and tension in a flow-stretched DNA molecule’, (oral presentation) APS March Meeting 2016 (Baltimore, MD, US)

WP4:
The final result of WP4 in Cell-O-Matic is a demonstration that complex lab-on-a-chip devices for single cell analysis can be produced by scalable industrial processes that can compete with existing production technologies such as silicon fabrication used in state-of-the-art commercial instruments (BioNanoGenomics – optical mapping) or elastomer casting (Fluidigm – single cell genomics). The potential impact is the adoption of such mass production methods by European industry for the production of micro- and nanofluidic devices for single cell analysis as polymer consumables for research and diagnostic use.

Dissemination activities relevant to WP4 include NILT and DTU hosting the Microfluidic network meeting MF5/6 in 2015. This network is gathering companies with an interest in microfluidics.

Main academic dissemination activities:
• N. Hansen et al, ‘Multifunctional nickel shims with sub-micrometer resolution for microfluidics and lab-on-a-chip’, Poster presentation at NanoBioTech Montreux 2015
• T. Nielsen et al, ‘Injection molding of all-plastic lab-on-a-chip’, NILT exhibition at MRS 2015 fall meeting, booth 1209

WP5:
The work performed in WP5 played a supportive role in the development of single cell genome sequencing involved in the other parts of the Cell-O-Matic project. Single cell analysis offers great promise in efficient assessment of cancer tissue genetic variability. This variability is crucial to understand the mechanisms underlying disease progression and response to treatment.

WP5 also contributed to dissemination activities, for instance the publication “Passive Lab-in-a-Foil Device for Multiple Displacement Amplification of DNA” by Eriksen et al. (manuscript under peer review), in which isothermal genome amplification methods benchmarked during Cell-O-Matic were used.

WP6:
The ability to analyse single cell samples fast and precisely is one key element in developing tools for early diagnosis of various conditions such as cancer. The precision of the single cell analysis methods is also crucial for identifying tumour heterogeneity at various time points during cancer treatment. Attention to single-cell and lab-on-chip technologies has therefore been on the rise, and continues to be so. This requires databases and software, and WP6 will have impact by providing software modules to BC|Genome.

The main focus was on DNA analysis, mainly the fast identification of structural variations in the target cell's genome and the determination of the genetic identity of the trapped cells. We demonstrated that it is feasible to capture single cells and identify them based on their genetic make-up, and that accurate data on genomic restructurings (copy number variations and other reorganisations) can be produced reliably and fast.
It is completely possible to combine a lab-on-chip instrument interface with a fast-delivering analytical platform, to receive results within tens of minutes after processing the samples on the chip. The chain of information from laboratory to the analytical platform and back can be continuous, and thus preserve vital data integrity.
The same sample pipeline can be used for more elaborate analytics. In case of tumour identification it is possible to append the analytical data from the chip with other data available for the patient, and combine these successfully into a predictive algorithm, which could be directly used to aid in diagnostics and decision making.

The algorithms developed in D6.1 are the core of the data analysis in [Marie et al PNAS 110, 4893 (2013)].

Several modules were developed for commercial exploitation by Cell-O-Matic partner BCP, based on the tasks in this project. These include BC|ALIGN for NGS alignment and variant calling, BC|MERGE for integration of biomarker and other data for use by prediction algorithms in a clinical context, BC|TILING for more efficient and faster analysis of NGS and other genomic data, and additions to the BC|RESOURCE module in the form of new cloud-based computation and other resources from various cloud service providers. These modules will have impact once they are taken up by the research community.

WP7:
Training workshops concerning single cell sample preparation and use of the instrument have been held with personnel from internal partners, as well as the whole consortium.
The system and the single cell and single molecule applications have been presented to professional target groups during workshops and conferences.

The training in the use of the Cell-O-Matic system has covered:
• Use of the instrument and the controlling software
• Use of the chip
• Sample preparation protocols (lysis, staining, extraction)
• Amplification protocols
Training materials:
• System guide (manual)
• Protocols

A major output of the Cell-O-Matic project is the combination of the fluidic chips developed within Cell-O-Matic with the instrument developed in WP7, which has proven to be very suitable for controlling and monitoring the on-chip processes. Together they form a relatively simple and inexpensive system that can perform single cell DNA sample preparation, for instance for subsequent sequencing, at relatively modest cost.

The technologies developed, combined with the interesting results, have significantly added to our knowledge and to our interest in exploiting this field further. The outcome of the project has formed a starting point for Philips BioCell (former UNI) to explore possibilities for the development of commercial products within single cell research and personalized cancer treatment. During the project a significant amount of knowledge has been built and transferred between partners, which has provided valuable input for our continuing work. We expect to continue the fruitful cooperation in various projects.
In addition, we see opportunities in the commercial use of the low-cost fluorescence imaging module developed, and the technology is already included in our considerations for the future development of our product portfolio.

WP8:
WP8 has delivered outstanding results considering Cell-O-Matic is still at the stage of a prototype. We have shown that we were able to deliver results at least as good as the current state-of-the-art in single-cell sequencing and data analysis. Accordingly, a first application is reported through WP9 activities. It can be foreseen that many other applications could be developed with more efforts put into making the prototype into a commercial product that could answer a need from both research and diagnostic clients.

One impact is that the results generated within WP8 are being used as part of the know-how of Fasteris for the service they provide to clients. We have been able to increase our knowledge of the limits imposed by the library preparation kits from Illumina in terms of the minimum amount of starting material. For example, Illumina states that 50 ng of starting material is needed for their Nextera DNA library preparation, while we found that starting with only 10 ng did not significantly affect the sequencing data. We also have a better understanding of the limitations of single-cell sequencing (coverage breadth, coverage uniformity, allelic dropout rate). We have also put in place bioinformatics pipelines that can be re-used and improved for potential future projects. All of these elements allow Cell-O-Matic partner Fasteris to take on projects involving single cells with strong internal know-how.

WP9:
Thus far the multitude of SNPs found in DNA sequencing of tumours has made it very hard, if not impossible, for MDs to distinguish between relevant mutations and mere passenger mutations. To aid MDs in clinical practice through clinical decision support (CDS), Philips Research has developed a new approach to interpret sequencing data, in which the analysis is based on expression of the relevant target genes to assess which biological pathway is activated and driving tumour growth. Using a computational Bayesian model, the expression of a set of target genes is used to assess or predict which of the developmental biological pathways is driving tumour growth (W. Verhaegh et al. Cancer Res 74 (2014) 2936).

This approach may have a first (clinical) success in explaining why some 30% of breast cancer patients do not respond to ER drugs despite expressing the Estrogen Receptor (ER): in those cases, as our analysis shows, the ER pathway is not the tumour-driving pathway (see W. Verhaegh et al. Cancer Res 74 (2014) 2936). Philips Research is further developing this approach to cover all relevant oncogenic pathways, so that MDs will have a model/tool to assess from sequencing which pathway is relevant and to interpret sequencing data. Note that these data should be complemented by DNA sequencing data to determine at which point in the pathway the disruption occurs, so that a suitable targeted drug can be given to the patient. Furthermore, data from single cells processed in the Cell-O-Matic polymer devices on the Cell-O-Matic prototype could lead to a better understanding of cancer mechanisms and better prediction of outcomes.

Philips is currently working on further clinical trials to test the Bayesian computational pathway modelling approach, with the aim of bringing these tests to the market.

As a result of this project, Philips Research and Philips BioCell in particular are exploring this option further for development and potential market introduction. At Philips BioCell, this activity has led to the hiring of a person who obtained her Ph.D. in the project.

In summary, the work in this project has led to:
• Two papers being published and a further manuscript currently being written.
• One patent application
• Two possible products that Philips is investigating further for potential market introduction (pathway models and CTC diagnostics).
• A tool for doctors to at least better assess, rather than just relying on the mere presence of a receptor, whether the related pathway is actually active.

WP10:
The Cell-O-Matic project has made several individual innovations which together have taken shape to provide the project’s output as a coherent whole. The consortium decided early in the project to focus application of the technologies, as they developed, on sample material relevant to colorectal cancer. Although the project’s initial remit was to focus on DNA analysis, the project evolved with developments in the field and its own developments to extend its remit to the analysis of RNA as well.

The main dissemination activities during the project have included:
1. Two workshop/symposium events: one on single cell analysis and applications and another on DNA mapping.
2. Twenty-one scientific publications, ranging from instrumentation development for DNA nanofluidics to computer models based on single cell analysis for personalized therapy.
3. Thirty talks at conferences, workshops, symposia.
4. Nine instances of media exposure
5. Four Patents filed

A number of original research articles were prepared for publication, so that the results of the work can be measured by peer-reviewed acceptance into journals and by future citations.

The technology developed and protected by IP is highly responsive to the needs of single-cell analysis, a research methodology that is becoming increasingly important in biology and medicine, and in the cancer field in particular. Providing single cell genomic analysis at a manageable cost to detect tumour heterogeneity and to characterize cancers genomically will be an important aid to diagnosis and prediction of drug response, and to improving targeted cancer treatments. Any analysis of differentiation processes using cellular systems will be greatly helped by the single cell genomic analysis enabled by the Cell-O-Matic instrument. This will, for example, include the analysis of differentiation induced in iPS cells. It will also aid the characterization of cell types present in mixed sources of cells from human biopsies, for example from cancers, in sputum from asthma patients or in biopsies from patients with various types of irritable bowel disease. There will also be impact on pre-implantation diagnostics/screening and, in addition to the analysis of circulating tumour cells (CTCs), on the analysis of maternally-derived foetal cells (MDFCs) in blood. In addition, there are specific markets where only one or a few cells are available for analysis, where the tools and know-how of Cell-O-Matic can be productively applied.

The expertise gained in micro-nanofluidic device manufacture will enable other similar products to be developed. The expertise gained in bioinformatics and sequencing of material extracted from single cells can be applied to a range of similar projects.

As such, the technical developments within Cell-O-Matic will have deep and wide-ranging impact. There will be immediate and medium-term economic impact on a number of products and services for the burgeoning market surrounding next generation sequencing of single cells and DNA mapping; with further development, the Cell-O-Matic outputs are also poised to have an impact on the treatment of cancer. Figure 10.1 shows a roadmap for the publications, patents, services and products envisaged by Cell-O-Matic. Moreover, the Cell-O-Matic project will benefit Europe’s citizens by contributing to future healthcare systems based on single cell genome sequencing and mapping of individuals and periodic single cell monitoring, ushering in a new age of preventative, predictive and personalized medicine.

List of Websites:
www.cellomatic.eu
Contact: Professor Anders Kristensen, DTU Nanotech, Oersteds Plads 345e, 2800 Kgs. Lyngby
anders.kristensen@nanotech.dtu.dk
