Content archived on 2024-06-18

Statistical methods for 3D imaging mass spectrometry in proteomics and metabolomics

Final Report Summary - 3D-MASSOMICS (Statistical methods for 3D imaging mass spectrometry in proteomics and metabolomics)

Executive Summary:
Three-dimensional (3D) imaging MS is a technique for 3D molecular analysis of tissue specimens and agar samples that is gaining popularity and recognition. However, analysis of the big and complex data generated by 3D imaging MS requires new statistical methods. The European FP7 project 3D-MASSOMICS, uniting 9 partners from 6 countries, was conducted to bridge this gap, with the objectives: (1) to develop statistical methods for analysing variation in imaging MS data, which is crucial for improving the reproducibility of data and results, and (2) to create novel methods for unsupervised and supervised statistical analysis. The project vision was to make 3D label-free proteomics and metabolomics possible and practical by providing the necessary statistical methods to a biochemist, biologist, or clinical chemist, and through this to enable the use of 3D imaging MS in biology and medicine. This vision, formulated in 2011, was by and large fulfilled during the project in 2013-2015.
The project was highly successful, with 29 journal publications, including 3 in PNAS, one of the most influential journals. Our publications were among the top-viewed and were highlighted in Nature, PNAS, CNN, Der Spiegel, Wired, Scientific American, NIH Director’s Blog, ScienceWorld, Health, FOCUS Online, Die Welt, Chemical & Engineering News (ACS), Business Insider, MedGadget, Profil, Boston Globe, Shape Magazine, Eos magazine, Refinery29, CosmeticsDesign-Europe.com, Fast Co.Exist, Business Standard (India), and ZeeNews (India). Our visualization was selected as an Image of the Year by the Nature journal, alongside photos of Pluto and California burning. We have published over 300 GB of open-access data at the public repository MetaboLights. We have released `ili, an open-source Google Chrome app for 3D surface imaging visualization already used by 98 users, and are working on a roadmap to integrate the methods into the commercially available software package SCiLS Lab 3D. We have involved 97 experts in imaging MS in a crowdsourcing study on the quality of imaging MS data. We have organized 2 workshops and 1 conference. For the latter, we teamed up with the European BMBS COST action BM1104 on Mass Spectrometry Imaging and attracted 197 participants from 20 countries. On the formal side, we carried out 6 project meetings, delivered 8 deliverables, achieved 3 milestones, and performed 5 internal and 10 external trainings.
For more information, please visit our website http://3D-MASSOMICS.eu or contact the coordinator Theodore Alexandrov at theodore.alexandrov@embl.de or at EMBL, Meyerhofstr. 1, 69117 Heidelberg, Germany.

Project Context and Objectives:
Concept and project objectives
3D imaging. Since biology is by and large a 3-dimensional phenomenon, it is hardly surprising that 3D imaging has a significant impact on many challenges in life sciences. Current 3D imaging technologies (various types of CT, MRI, ultrasound, autoradiography, PET, SPECT) are labelled, i.e. they trace the localisation of a specific compound in the body. They are capable of either displaying body or organ anatomy by tracing water or other fluid molecules, or localizing a particular labelled compound. Unfortunately, they are much less useful in proteomics or metabolomics discovery studies aimed at finding new biomarkers, drugs, and disease pathways.
2D MALDI-IMS. Mass spectrometry (MS) represents in one spectrum hundreds of compounds ranging from metabolites to proteins; hence, it is a perfect discovery tool. On the other hand, conventional MS, as well as GC/MS and LC/MS, requires sample homogenization, which discards the spatial information of the sample. 2D gel electrophoresis, which is the standard technique in proteomics, suffers from the same problem. This problem has been solved by the development of matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI-IMS), an emerging MS-based label-free imaging technology. Several proof-of-principle studies have shown its high potential in -omics discovery studies; see recent reviews. 2D MALDI-IMS has proven its value in the last decade in metabolomics, glycomics, lipidomics, peptidomics, and proteomics. MALDI-IMS reveals the spatial localization of molecular ions and serves as a superior discovery tool in addition to existing MS-based techniques, and as a label-free method to image the spatial distribution of molecular compounds, thereby complementing labelled imaging methods, immunohistochemistry, and genetics-based methods such as in situ hybridization.
3D MALDI-IMS and the project motivation. 3D MALDI-IMS is based on 2D MALDI-IMS and inherits its advantages over other imaging modalities. It is a label-free, highly sensitive, semi-quantitative technique for a wide range of biomolecules, from endogenous and exogenous metabolites and drugs to proteins, and can be combined with MS/MS for subsequent identification of biomolecular species. However, 3D MALDI-IMS cannot reach its full potential due to the lack of statistical methods for analysis of the large and complex 3D data generated.

The 3D-MASSOMICS project was proposed and conducted to fulfil the following key objectives: (1) to introduce statistical methods for studying variation of MALDI-IMS data in order to develop reproducible data acquisition protocols and (2) to develop and evaluate statistical methods for un- and supervised statistical analysis of 3D MALDI-IMS data.
The developed methods were planned to be evaluated using synthetic data and data from the following biomedical applications: diabetes, surgical metabolomics, and microbial metabolomics.
The project vision was to enable 3D label-free proteomics and metabolomics by providing statistical methods for analysis and interpretation of 3D MALDI-IMS data.

Key issue. By 2012, 3D MALDI-IMS had reached the ideal state of maturity for the proposed project. In 2000-2010, 2D MALDI-IMS made rapid progress and many technological problems were solved. Shortly before the project, several research groups addressed the critical issue of preparing fragile serial sections and proposed cheap, robust, fast, and IMS-compatible sectioning protocols. The acquisition time was significantly improved by introducing new mechanics and fast lasers. By 2012, the key issue was the lack of statistical methods for analysis and interpretation of the large data generated by 3D MALDI-IMS. Even a 2D MALDI-IMS dataset easily exceeds 1 gigabyte in size, typically comprising 5,000-50,000 spectra of approximately 10,000 bins each. A 3D MALDI-IMS dataset consists of a few tens of 2D datasets of serial sections, which increases its size by an order of magnitude or more, to the order of 100 gigabytes per dataset.
Evaluation. The evaluation of the developed methods was considered an important part of 3D-MASSOMICS. We proposed to combine statistical evaluation using synthetic data with expert evaluation using real-life data. Firstly, we planned to develop a statistical simulator of MALDI-IMS data and to simulate synthetic data to be used as a gold standard for statistical evaluation. Secondly, the co-applicants with biomedical expertise were to provide support during methods development, to perform relevant data collection, and to carry out expert evaluation. This would ensure relevance of the posed problems, immediate feedback on the value of the developed methods in applied areas, and rapid uptake of the developed methods in practice.
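The size figures above can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes each intensity is stored as a 4-byte value in a dense spectra-by-bins matrix and that a 3D dataset has 30 serial sections; neither number is stated in the report, and real file formats add metadata and may compress or sparsify the data.

```python
def dataset_size_gb(n_spectra, n_bins, bytes_per_value=4):
    """Raw size in gigabytes (10**9 bytes) of a dense spectra-by-bins intensity matrix.

    bytes_per_value=4 is an assumption (single-precision floats), not a figure
    taken from the report.
    """
    return n_spectra * n_bins * bytes_per_value / 1e9


small_2d = dataset_size_gb(5_000, 10_000)    # 0.2 GB
large_2d = dataset_size_gb(50_000, 10_000)   # 2.0 GB

# Hypothetical 3D dataset: 30 serial sections, each as large as the larger 2D case.
n_sections = 30
stack_3d = n_sections * large_2d             # 60.0 GB
```

Under these assumptions a large 2D dataset occupies about 2 GB and a stack of a few tens of such sections lands in the range of tens to a hundred gigabytes, consistent with the orders of magnitude quoted above.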
Impact. In summary, the proposed work was expected to enable and improve applications of 3D MALDI-IMS in biochemistry, biomedicine, and molecular biology. In a broader sense, all fields where 3D MALDI-IMS can be of use would benefit from 3D-MASSOMICS.
Other statistical developments. By 2012, to the best of our knowledge, there were no other projects dedicated to the development of statistical methods for 3D MALDI-IMS. The 2D MALDI-IMS software packages existing at that time (AB SciEx: TissueView, Bruker: FlexImaging, Thermo Scientific: ImageQuest, Waters: HDI Software) did not support 3D imaging.
Vision. It was our vision to enable 3D label-free proteomics and metabolomics by providing statistical methods for analysis and interpretation of 3D MALDI-IMS data.

Detailed project objectives
In order to reach the main goals of the project, we planned to fulfil the following five objectives:
Objective 1: Development of preprocessing, unsupervised & supervised statistical methods
• Development of preprocessing methods → algorithms for 3D m/z-image denoising and highly sensitive peak picking
• Development of unsupervised methods → algorithms for spatially-aware data representation and selection of relevant 3D m/z-images
• Development of classification methods → algorithms of 3D spatially-aware classification of spectra and search for discriminative m/z-values
Objective 2: Synthetic data simulation & statistical evaluation
• Development of 3D MALDI-IMS statistical simulator → simulator, synthetic data
• Development of methods for evaluation of statistical methods → evaluation strategies
• Integration of evaluation methods into development workpackages → helpdesk
Objective 3: Efficient implementation
• Efficient GPU implementation of developed statistical methods → GPU algorithms
Objective 4: Statistical analysis of data variation
• Development of statistical approach to analysis of data variation using linear mixed-effects model → methods for analysis of variation
• Development of MALDI-IMS protocols preventing proteomics degradation → new sample preparation protocols
Objective 5: Data collection, expert evaluation, and proof-of-principle applications
• Data collection for reproducibility studies → data acquired with different settings
• Data collection for the selected biomedical challenges → data
• Expert evaluation of biomedical data analysis results → expert evaluation
• Proof of principle applications to diabetes, natural products, and surgical metabolomics → analysis results

Overview of the state of the art by 2012
2D MALDI-IMS is a label-free bioanalytical technique that can capture the spatial distribution of hundreds of molecular ions in a single measurement while maintaining the molecular integrity of the sample. By 2012, 2D MALDI-IMS had become one of the most promising innovative measurement techniques in biochemistry. In -omics studies, it served as a powerful tool for spatial chemical analysis of diverse sample types ranging from biological and plant tissues to bio and polymer thin films. For reviews of technological principles and protocols used in MALDI-IMS by 2012, see the special issue of Methods in Molecular Biology and recent surveys.
Statistical methods for 2D MALDI-IMS. An ultimate aim of a statistical analysis for MALDI-IMS is to find m/z-values corresponding to ions of interest. These ions can be discriminative for a spatial region or express differences between two spatial regions of one sample or between two different samples, e.g. be discriminative for a tumor region. Once ions of interest are revealed with MALDI-IMS, they can be identified using MS/MS-based metabolomics or proteomics identification methods.
Numerous statistical methods for 2D MALDI-IMS were developed to solve important problems such as spatial segmentation, component analysis, and classification. Nevertheless, several key issues, such as statistical analysis of data variation, statistical data simulation, and objective evaluation of results, were left largely open by 2012.
3D MALDI-IMS was an emerging 3D IMS technology based on 2D MALDI-IMS. The most common protocol for 3D MALDI-IMS was to (1) cut a sample into serial sections, (2) measure each individual section with 2D MALDI-IMS, and (3) merge the individual 2D datasets into one 3D dataset respecting the original spatial relations between serial sections. The first proof-of-principle was reported in 2005. Then two technological papers were published in 2008 in Nature Methods and one in PNAS in 2009. Since then, 3D IMS has attracted attention at conferences (ASMS, IMSC, DGMS) and was recently called “A New Frontier” in mass spectrometry. The methodological papers on 3D MALDI-IMS unanimously state that a key issue in the current state of the art is the lack of statistical methods for mining huge and complex 3D IMS data.
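Step (3) of the protocol above, merging registered 2D sections into one 3D dataset, can be sketched as follows. This is a minimal illustration, not the consortium's implementation: it assumes the sections are already co-registered in the x/y plane and equally spaced, and the function name, tuple layout, and `section_spacing_um` parameter are hypothetical.

```python
def stack_sections(sections, section_spacing_um):
    """Merge co-registered 2D sections into one list of 3D spots.

    sections: list ordered along the cutting axis; each item is a list of
        (x_um, y_um, spectrum_id) tuples for one section (hypothetical layout).
    section_spacing_um: assumed constant distance between consecutive sections.

    Returns a list of (x_um, y_um, z_um, section_index, spectrum_id) tuples.
    """
    spots_3d = []
    for index, spots in enumerate(sections):
        z_um = index * section_spacing_um  # depth of this section in the stack
        for x_um, y_um, spectrum_id in spots:
            spots_3d.append((x_um, y_um, z_um, index, spectrum_id))
    return spots_3d


# Two tiny sections; registration may leave spots off a regular grid.
sections = [
    [(0.0, 0.0, "s0")],
    [(3.5, -1.2, "s1"), (103.5, -1.2, "s2")],
]
spots = stack_sections(sections, section_spacing_um=20.0)
```

Keeping the section index alongside the coordinates makes it possible to trace every 3D spot back to the 2D measurement it came from.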

Project Results:
The most significant results achieved in the project:
Several innovative scientific methods were developed. Among the many methods developed and published in the 29 journal publications, we selected two to highlight here. The first one is the measure of spatial chaos, developed and applied for automated detection of spatially-informative mass-to-charge images. We consider this method especially innovative because it is the first unsupervised method for univariate analysis: it considers only one mass-to-charge image at a time, which makes it applicable to a dataset of any size. This is especially important for high-mass-resolution imaging MS, where a dataset contains millions of mass-to-charge channels. The second method is a new workflow for 3D surface analysis that was published in PNAS by Bouslimani et al. (2015). The paper attracted great attention: it was cited 15 times between Apr 2015 and Dec 2015 and was highlighted in more than 20 media outlets including Nature, PNAS, CNN, Der Spiegel etc.; a figure from the paper was selected by the Nature journal as an Image of the Year.
Open-source software package was released. EMBL has developed `ili, an open-source Google Chrome App for molecular mapping in 2D and 3D (https://github.com/ili-toolbox/ili) that is freely available on Google Web Store and already has 98 users.
Roadmap for commercial software. The project partner Steinbeis Center SCiLS Research will work together with SCiLS GmbH (Bremen, Germany) to provide the developed methods and implementations to potential users as a part of commercially available package SCiLS Lab 3D.
Involvement of community / open-access data. OurCon3, the joint final symposium of 3D-MASSOMICS and BMBS COST action BM1104 on Mass Spectrometry Imaging, attracted a record number of 197 participants from 20 countries. As a part of the paper Palmer et al. (2015), we have involved 97 experts in imaging MS to create a gold standard for quality of imaging MS data. The results as well as the constructed gold standard were published open-access in the paper. In Oetjen et al. (2015), the consortium published the key selected 3D datasets in the GigaScience journal. Since its publication in May 2015, the paper has already been cited several times and viewed more than 4500 times (as of Dec 2015). We have deposited almost 200 DESI-Orbitrap datasets (200 GB) to MetaboLights (accession numbers MTBLS273, MTBLS289, MTBLS282). This release represents the largest known public collection of high-resolution imaging MS data.
Another European project was enabled by the results of 3D-MASSOMICS. A new European H2020 project METASPACE (http://metaspace2020.eu, coordinated by EMBL) was enabled through the results obtained in 3D-MASSOMICS and the collaboration established in this project.

We have structured the rest of this section according to the work packages. For each work package, we provide a brief summary that includes Summary of progress and Significant results.

WORK PACKAGE 1 “Synthetic data simulation & statistical evaluation”
Summary of progress towards objectives
WP1 concerns data collection for the development of a statistical simulator and statistical evaluation of computational methods using synthetic data. In Task 1.1 we developed a statistical simulator of MALDI-TOF imaging MS data from an anatomical model taken from the Allen Rat Brain atlas and converted it into an internal format. The development was based on a previously published time-of-flight model and on previously described spectrum features. In Task 1.2 we simulated a large 3D MALDI-TOF imaging MS data set. A resolution of 100 µm was chosen because it gives a sufficiently large number of voxels to discriminate anatomical details while producing datasets small enough for exchange between partners, storage, and processing. The simulation model was adjusted to better fit proteomics applications, and we produced nine datasets that correspond to different simulated instrument parameters. The synthetic datasets were uploaded to the project FTP server for sharing with other partners. They will be used in the development work packages (WP4 - WP7). In Task 1.3 we developed statistical evaluation using simulated data, covering both unsupervised and supervised analyses. Spatially aware supervised classification methods were developed and applied to 2D DESI MS imaging data (see WP6). For cross-validation, a leave-region-out methodology was also developed. Interestingly, a stratified cross-validation method is now available in the new version of Matlab and will be used further in the project. Methods were also developed for unsupervised analysis. We considered several characteristics for comparing the results of spatial segmentation to the original annotation. We selected ‘Balanced accuracy’ (the mean of the sensitivity and specificity), as it is not biased for imbalanced datasets. Indeed, in imaging MS, the number of pixels is high and the sizes of the clusters resulting from annotation of spatial regions or from spatial segmentation can be disproportionate.
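To illustrate why balanced accuracy suits imbalanced pixel counts, the sketch below computes it for a two-class case exactly as defined above (the mean of sensitivity and specificity). The function name and the toy labels are illustrative and do not come from the project code.

```python
def balanced_accuracy(true_labels, predicted_labels, positive=1):
    """Mean of sensitivity and specificity for a two-class labelling."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in true_labels")
    sensitivity = tp / (tp + fn)  # fraction of region pixels recovered
    specificity = tn / (tn + fp)  # fraction of background pixels recovered
    return (sensitivity + specificity) / 2


# Toy annotation: 5 pixels belong to a small region, 95 to the background.
truth = [1] * 5 + [0] * 95
all_background = [0] * 100

plain_accuracy = sum(t == p for t, p in zip(truth, all_background)) / len(truth)
print(plain_accuracy)                            # 0.95
print(balanced_accuracy(truth, all_background))  # 0.5
```

A segmentation that misses the small region entirely still scores 0.95 on plain accuracy, whereas balanced accuracy drops to 0.5, the value expected from an uninformative result.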
Significant results
We succeeded in developing the first statistical simulator of MALDI-TOF imaging mass spectrometry data on a rat brain model. Simulated datasets are now available to all partners through the project FTP server, and we expect these datasets to be used by a large community of scientists in the field of imaging MS. A manuscript is in preparation that should strengthen our dissemination on this topic. We also succeeded in developing statistical evaluation methods using our simulated datasets. Our spatially-aware supervised classification method, applied to 2D DESI MS imaging data so far, is designed to be easily extended to 3D imaging MS data. The balanced accuracy-based method developed for unsupervised analysis is rather original and innovative, and it could be considered a milestone for spatial segmentation and further selection of molecules of interest in 3D MS imaging. A manuscript is in preparation to support the uptake of this approach by the community.
WORK PACKAGE 2 “Data collection & expert evaluation”
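The leave-region-out cross-validation mentioned in the WP1 summary is not specified in detail in this report. The following is a generic leave-one-group-out sketch under the assumption that every pixel carries a region identifier; the function and variable names are hypothetical.

```python
def leave_region_out_splits(region_ids):
    """Yield (held_out_region, train_indices, test_indices), one split per region.

    region_ids: one region identifier per pixel/spectrum. All pixels of the
    held-out region go to the test set, so no region contributes pixels to
    both the training and the test set of the same split.
    """
    for held_out in sorted(set(region_ids)):
        test = [i for i, region in enumerate(region_ids) if region == held_out]
        train = [i for i, region in enumerate(region_ids) if region != held_out]
        yield held_out, train, test


# Six pixels annotated as belonging to three regions.
region_ids = ["tumour_a", "tumour_a", "stroma", "stroma", "tumour_b", "tumour_b"]
splits = list(leave_region_out_splits(region_ids))
```

Splitting by region rather than by individual pixel keeps spatially adjacent, and therefore similar, spectra on the same side of each split.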
Summary of progress towards objectives
WP2 aims, first, at providing the data necessary for evaluation of the methods to be developed in other work packages and, second, at involving experts from the application fields in evaluating the methods and the results of their application. Task 2.1 is concerned with collecting specific data for a reproducibility study. Task 2.2 is concerned with collecting data from different fields to be provided to all project participants for general methods evaluation. Task 2.3 is concerned with expert evaluation, with the main activities planned in the second half of the project.
In Task 2.2 the project partners were collecting data in particular by using novel workflows developed in the project. Overall, this part of the project was successful with the key selected datasets published by Oetjen et al. in GigaScience. The publication raised considerable attention in the field (see the next section, Significant results). Moreover, seeing the growing need for the emerging high-mass-resolution imaging MS, we put more attention towards collecting high-resolution data. ICL collected and publicly provided through MetaboLights the following DESI-Orbitrap imaging datasets: 2D datasets of breast samples (MetaboLights accession number MTBLS273, 44.35 GB), 109 2D colorectal cancer sections (MTBLS289, 146.34 GB), and a 3D dataset of 51 liver sections (MTBLS282, 28.47 GB). This submission is currently the largest public collection of imaging MS data coming from one lab or project. For the first time, data for 3D MALDI-Orbitrap was collected by EMBL. In Task 2.3 the project partners were evaluating the methods, algorithms, and software developed in other WPs. This certainly challenging task was completed with great success.
First, all project partners, and in particular UoB, EMBL, UR1, and UCSD, were using the software SCiLS Lab 3D from SCiLS, which already during the course of the project provided algorithms and methods developed in the project. This allowed SCiLS to collect user feedback and the other partners to obtain biologically relevant results. UCSD and UoB studied the interactions between the microbes Candida albicans and Pseudomonas aeruginosa in 3D and discovered mass signals that were not observable using conventional 2D MALDI-IMS. It was determined that the presence of C. albicans triggered increased rhamnolipid production by P. aeruginosa, which in turn was capable of inhibiting embedded hyphal growth produced beneath the C. albicans colony at ambient temperature. This discovery can lead to new therapies for cystic fibrosis (CF) because both microbes are key species forming biofilms in the lungs of cystic fibrosis patients. The developed methodology was used in another publication. UCSD collected data for analysis of lichen (the publication Garg, Yi et al. is in the final stage of preparation) and microbial colonies (the data was published as a part of Oetjen et al. (2015)). UR1 collected one of the first reported 3D MALDI-FTICR datasets of epididymis. Second, as a creative spinoff of the project, demonstrating in particular the capacity developed in this consortium, UCSD, UoB, and EMBL developed a new workflow for 3D surface analysis that led to a publication in PNAS. The publication attracted great attention and was highlighted in more than 20 media outlets including Nature, PNAS, CNN, Der Spiegel etc. (see the next section for more details); a figure from the paper was selected by the Nature journal as an Image of the Year. The data from the PNAS paper is published open-access at GNPS, the data repository developed by UCSD.
Significant results
Open-access 3D imaging MS data published by the consortium. We published the key selected 3D datasets from the consortium in the GigaScience journal; UoB, EMBL, ICL, and UCSD contributed. Since its publication in May, the paper has already been cited several times and viewed more than 4500 times. The data from this paper is the first imaging MS submission to MetaboLights, the official metabolomics repository at EBI, and was used in third-party presentations and posters at OurCon’15 and MetaboMeeting’15.

Largest open-access collection of 2D high-mass-resolution imaging MS data coming from one lab or project. ICL has deposited almost 200 DESI-Orbitrap datasets (200 GB) onto MetaboLights (http://www.ebi.ac.uk/metabolights/MTBLS273 http://www.ebi.ac.uk/metabolights/MTBLS289 http://www.ebi.ac.uk/metabolights/MTBLS282). Although initially not planned in the project, during discussions with the community, we realized that, still, 2D imaging MS datasets and in particular high-resolution imaging MS datasets, are hardly available to statisticians and bioinformaticians. This release represents the largest known public collection of such data.
3D surface analysis, aka 3D molecular cartography. The paper Bouslimani et al. (2015) coming from the consortium was highlighted in various journals, popular science magazines and other media, such as Nature, PNAS, CNN, Der Spiegel, Wired, Health, Chemical & Engineering News, Focus Online, NIH Director’s blog, MedGadget, Business Insider, etc. Figures from this paper were selected for various art and science exhibitions in Europe and for the Image of the Year at Nature, alongside photos of Pluto and California burning.

SCiLS Lab 3D for FT-ICR-imaging. In the original proposal, we planned to work only with medium-mass-resolution (TOF) data. However, the use of the FT-ICR mass analyzer for MALDI imaging has been rapidly gaining popularity in recent years. This is motivated by the fact that high mass accuracy and mass resolving power enable accurate determination of exact masses and, consequently, a more confident identification of molecular species. Notably, MALDI FT-ICR imaging datasets are much larger than TOF imaging datasets. They are up to 100 times larger due to a higher number of m/z-bins in FT-ICR and can reach up to 1 TB for one dataset. Mining such data is a considerable challenge. Seeing the increasing use of FT-ICR analyzers for imaging, SCiLS performed additional work to adapt the unsupervised methods developed in WP5 (spatial segmentation, component analysis methods, and the measure of spatial chaos (cf. Task 5.2)) for use in SCiLS Lab and in particular in SCiLS Lab 3D. This work was performed in close collaboration with UR1, who evaluated the new capabilities provided in SCiLS Lab 3D.

WORK PACKAGE 3 “Analysis of variation & reproducibility”
Summary of progress towards objectives
WP3 focuses on the analysis of variation within imaging mass spectrometry datasets with an aim of selecting conditions to maximize data reproducibility.
To algorithmically evaluate the quality of an imaging dataset, a quantitative measure for the quality of imaging mass spectrometry data was developed. We proposed and implemented within MATLAB eight families of potential measures with a range of parameters and statistical descriptors, giving 143 metrics in total. In particular, we implemented a novel measure based on spatial chaos within ion images, as developed by us earlier in the project. The metrics were then scored for their ability to choose, as human test subjects did, the better image from a pair of images. To provide a ‘gold-standard’ set of validated human judgments for comparing the metrics against, we engaged the IMS community and recruited 85 imaging experts to rank a set of images (from datasets collected within WP2). This confirmed that our measure of spatial chaos performed well compared to human judgments, as we have shown in the dedicated publication.
It was discovered that sectioning of heat-stabilized tissue is challenging, and a number of parameters affecting the sectioning results have been found. It has been shown that with the help of CryoJane, a tape-transfer-assisted sectioning technique by Leica, cryo sections of heat-stabilized tissue can be made with good quality. The evaluation of heat inactivation using the CryoJane-assisted sectioning protocol has been initiated but not yet completed.
A statistical model was developed based on the datasets collected within WP2, which indicated that the laser parameters had the most significant effect on the quality of imaging mass spectrometry data. A follow-up study based on the predicted optimal parameters was performed, but the results were not found to be reproducible.
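The scoring of a candidate metric against paired expert judgments can be sketched as a simple agreement rate. This is an illustration of the idea only: the data layout, the function name, and the convention that a higher score means a better image are assumptions, and the published study used its own (MATLAB) implementation.

```python
def pairwise_agreement(metric_scores, expert_choices):
    """Fraction of image pairs in which the metric prefers the expert-preferred image.

    metric_scores: dict mapping image id -> metric value (assumed: higher is better).
    expert_choices: list of (image_a, image_b, preferred) tuples, where
        preferred is the id the experts judged to be the better image.
    """
    if not expert_choices:
        raise ValueError("at least one judged pair is required")
    agreements = 0
    for image_a, image_b, preferred in expert_choices:
        # Ties are resolved in favour of the first image of the pair.
        metric_pick = image_a if metric_scores[image_a] >= metric_scores[image_b] else image_b
        agreements += metric_pick == preferred
    return agreements / len(expert_choices)


scores = {"img1": 0.9, "img2": 0.2, "img3": 0.5}
judged_pairs = [
    ("img1", "img2", "img1"),
    ("img2", "img3", "img3"),
    ("img1", "img3", "img3"),
]
agreement = pairwise_agreement(scores, judged_pairs)  # 2 of 3 pairs agree
```

For a measure where lower values indicate better images, as with a chaos-type measure, the scores would be negated before calling the function.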
Significant results
Quality measure of mass spectrometry images. The results of this WP were published as a paper, “Using collective expert judgements to evaluate quality measures of mass spectrometry images” by A Palmer, E Ovchinnikova, M Thune, R Lavigne, B Guevel, A Dyatlov, O Vitek, C Pineau, M Boren and T Alexandrov in Bioinformatics, 31, 2015, i375–i384, in the ISMB’15 special issue of Bioinformatics, which normally has a lower acceptance rate than regular publications in Bioinformatics. The essence of the paper is the results from Tasks 3.1 and 3.2, as partly reported in the 18-month report, integrated into a novel approach for defining and evaluating potential measures of quality of imaging MS data. Of the 143 metrics evaluated, both signal-to-noise and spatial chaos-based measures performed well, with a correlation of 0.7 to 0.9 with the gold standard ratings. Moreover, we showed that a composite measure with linear coefficients (trained on the gold standard with regularized least squares optimization and lasso) showed a strong linear correlation of 0.94 and an accuracy of 0.98 in predicting which image in a pair was of higher quality.
Involvement of the community in the first-in-the-field crowdsourcing study. As a part of the paper Palmer et al. (2015) Bioinformatics, we have involved 97 experts in imaging MS to create a gold standard for quality of imaging MS data. The results as well as the constructed gold standard were published open-access in the paper and at https://github.com/alexandrovteam/IMS_quality; the expert with the closest consensus to the median results received a printed poster as a memento.

Optimization of sectioning of heat-stabilized tissue. The CryoJane tape-transfer-assisted sectioning technique by Leica has been found to enable sections of good quality from heat-stabilized tissue.

WORK PACKAGE 4 “Development of preprocessing methods”
Summary of progress towards objectives
In WP4 we have developed preprocessing methods for 3D MALDI-IMS data, which were delivered to other partners for efficient implementation (in WP7) and are now in evaluation for pre-processing 3D MALDI-IMS data prior to unsupervised analysis (in WP1). In Task 4.1 we developed an algorithm for 3D edge-preserving denoising. Although many 3D denoising algorithms are already available, specific developments were necessary for our project because the voxels of 3D imaging MS data do not necessarily form a regular 3D grid: after alignment of serial sections (registration), their pixels can be displaced and form a so-called cloud of spots in 3D. The development was performed by modifying a standard edge-preserving algorithm so that it can operate on a cloud of spots. In Task 4.2 we developed a conceptually new method for analysis of imaging MS data, which selects those m/z-values whose m/z-images are informative. This simulates the visual examination of m/z-images by a mass spectrometrist, but the examination can now be performed automatically. For quantifying the information content of an m/z-image, we developed a novel measure of spatial chaos, which has high values for chaotic (non-informative) images and low values for informative images. The measure was statistically evaluated on test sets of m/z-images. In Task 4.3 we applied this measure for highly sensitive peak picking in imaging MS data, since the values of the measure do not depend on the average intensity of an m/z-image, and compared the new method with a conventional method for peak picking, which applies spectrum-wise peak picking to individual spectra and then selects consensus peaks for the full imaging MS dataset.
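The definition of the project's measure of spatial chaos is given in the cited publication and is not reproduced in this report. To make the idea of scoring one m/z-image at a time concrete, the sketch below uses a different, much simpler stand-in in the spirit of Geary's C: the mean squared difference between neighbouring pixels divided by twice the image variance. Like the measure described above, it is high for noise-like images, low for spatially structured ones, and unchanged when all intensities are rescaled; the function name is hypothetical.

```python
import numpy as np


def neighbour_contrast_ratio(image):
    """Stand-in chaos score for one m/z-image (NOT the project's measure).

    Close to 1 for spatially uncorrelated noise, close to 0 for images made of
    large homogeneous regions. Invariant to multiplying the image by a constant.
    """
    image = np.asarray(image, dtype=float)
    variance = image.var()
    if variance == 0:
        raise ValueError("constant image has no spatial information to score")
    dx = np.diff(image, axis=1)  # differences between horizontal neighbours
    dy = np.diff(image, axis=0)  # differences between vertical neighbours
    mean_sq_diff = (np.sum(dx ** 2) + np.sum(dy ** 2)) / (dx.size + dy.size)
    return mean_sq_diff / (2.0 * variance)


rng = np.random.default_rng(0)
noise_image = rng.random((32, 32))      # no spatial structure
structured_image = np.zeros((32, 32))
structured_image[:, 16:] = 1.0          # two homogeneous halves

noise_score = neighbour_contrast_ratio(noise_image)
structured_score = neighbour_contrast_ratio(structured_image)
```

Ranking all m/z-images of a dataset by such a score and keeping the lowest-scoring ones mimics the selection of informative images; because each image is scored independently, the loop scales linearly with the number of m/z-channels.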
Significant results
We expect that the method of selecting informative mass-to-charge channels by calculating the measure of spatial chaos of their images (developed in Task 4.2, evaluated in Task 4.3, and published in 2013) will have a considerable impact in the field of imaging MS, because this is the first attempt to select mass spectrometry signals in an unsupervised way by individually going over mass-to-charge images. Moreover, as we highlighted in the publication, the measure of spatial chaos can be used to test for the presence of known molecules in imaging MS data. Although it does not provide full confidence on whether a molecule is in the sample, because MS/MS confirmation is necessary, it can help by filtering out hypothetical molecules, leaving a reduced number of molecules for follow-up evaluation using MS/MS. To support the uptake of this method by the community, we released the source code of the developed algorithm calculating the measure of spatial chaos. The development of the measure of spatial chaos was an enabling step towards the European H2020 project METASPACE on molecular annotation of high-resolution imaging MS data (coordinated by Theodore Alexandrov, started in July 2015).

WORK PACKAGE 5 “Development of unsupervised methods”
Summary of progress towards objectives
Work Package 5 deals with the development of unsupervised methods. New methods were developed during the first reporting period and applied to real and simulated 2D and 3D MALDI-imaging data sets. In Task 5.1 two noise models were investigated, namely an additive Gaussian noise model and a Poisson noise model. They were implemented by using different discrepancy terms (the Euclidean norm and the Kullback-Leibler divergence, respectively). In Task 5.2 the results of Task 5.1 were collected and the methods were adapted to be spatially and spectrally aware. Prior knowledge of spatial and spectral properties was modelled by adding so-called penalties to the discrepancy functionals developed in Task 5.1. In the second reporting period, only Task 5.3 was completed. In this Task, component analysis methods and methods for spatial segmentation were compared and applied to simulated data, and the methods were generalized for MALDI FT-ICR data. The unsupervised approaches were applied to several real-life 3D data sets.
Significant results
From the completed Tasks 5.1 and 5.2 we draw the following conclusions:
1. The Gaussian noise model and the Poisson noise model yield similar results. Since the Poisson NMF models are computationally much more expensive (empirically by a factor of about 10 for large data sets, as is typical for 3D MALDI-imaging), the results of the study on the Poisson noise model are disappointing: the additional expense of the more complicated Poisson noise model cannot be justified, and we recommend using a Gaussian noise model.
2. Incorporating knowledge about “sparse spectra” (l1-penalty) and “sharp image edges” (TV-penalty for taking into account spatial properties) is promising. The results for NMF with combined l1- and TV-penalties are convincing and considerably outperform NMF without prior knowledge. The results were published in 2013 by SCiLS and University of Bremen.
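The difference between the two noise models can be made concrete with a small sketch of NMF under the two discrepancy terms. The code below uses the standard multiplicative update rules for the Euclidean and Kullback-Leibler discrepancies; it is an illustration under stated assumptions and not the project's implementation. The l1- and TV-penalties of Task 5.2 are not included, and all function and parameter names are hypothetical.

```python
import numpy as np

def nmf(X, rank, loss="euclidean", n_iter=200, seed=0, eps=1e-9):
    """Factorise a non-negative matrix X (spectra x m/z bins) as W @ H.

    loss="euclidean" corresponds to an additive Gaussian noise model,
    loss="kl" to a Poisson noise model (Kullback-Leibler divergence).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        if loss == "euclidean":
            H *= (W.T @ X) / (W.T @ W @ H + eps)
            W *= (X @ H.T) / (W @ H @ H.T + eps)
        elif loss == "kl":
            H *= (W.T @ (X / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
            W *= ((X / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        else:
            raise ValueError("loss must be 'euclidean' or 'kl'")
    return W, H

def euclidean_discrepancy(X, W, H):
    """Half the squared Euclidean distance between X and W @ H."""
    return 0.5 * np.sum((X - W @ H) ** 2)

def kl_discrepancy(X, W, H, eps=1e-9):
    """Generalised Kullback-Leibler divergence between X and W @ H."""
    R = W @ H + eps
    return np.sum(X * np.log((X + eps) / R) - X + R)
```

The KL updates need an element-wise division by the current reconstruction in every iteration, which is one reason the Poisson variant costs more per iteration than the Euclidean one.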
In the second reporting period, after comparing the component analysis methods with spatial segmentation and applying them to both simulated and real-life data, we concluded that spatial segmentation is well suited to automatic spatial annotation of data with non-overlapping segments. In contrast, the component analysis methods are able to un-mix the complex composition of the entire data set by describing potentially overlapping distributions.
WORK PACKAGE 6 “Development of supervised methods”
Summary of progress towards objectives
Work package 6 is primarily concerned with the development and application of supervised statistical methods to 3D imaging datasets. In the first reporting period, we developed the necessary spectral pre-processing routines. Such routines were incorporated within Task 6.1, with the development and testing of supervised methods being performed in Tasks 6.2 and 6.3. Within Task 6.1 various normalization and transformation methods were applied to DESI imaging data, with the effectiveness of each technique clearly demonstrated. In the second reporting period, as part of Task 6.3, we compared existing supervised methods for pixel prediction, as well as the application of the spatial smoothing developed by Alexandrov and Kobarg. Whilst this method is excellent for the high-spatial-resolution technique of MALDI, its application to DESI data, which have a considerably lower spatial resolution, results in poor pixel classification. Having been initially trialled on 2D samples, the supervised techniques were applied to the processed 3D liver and colorectal samples. The high section-to-section consistency demonstrates not only a good supervised analysis but also a robust processing workflow that successfully reduces the effects of non-biological variation likely to be introduced during the multi-day sample acquisition.

Significant results
Various normalization methods were tested using imaging data from a single homogeneous tumour, where median fold change normalization was found to provide the most robust method in reducing the coefficients of variation of multiple lipid species. Log-transformation of the data was found to effectively standardise the variance across low- and high-intensity variables such that the resultant multivariate statistical approaches provided more conclusive between-tissue type scattering. A novel approach to registration of the optical and MS images has been developed such that histological features identified on the optical image can be easily mapped onto the MS image for use in multivariate statistical methods.
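As an illustration of the two pre-processing steps named above, the following is a minimal sketch of median fold change normalization followed by a log-transformation. It assumes that the reference spectrum is the variable-wise median over all spectra and that the log offset is 1; both choices, as well as the function names, are assumptions and may differ from the routines used in the project.

```python
import numpy as np

def median_fold_change_normalize(X, eps=1e-12):
    """Scale each spectrum (row of X) by its median fold change to a reference.

    The reference spectrum is taken as the variable-wise median (an assumption).
    Returns the normalised matrix and the per-spectrum scaling factors.
    """
    X = np.asarray(X, dtype=float)
    reference = np.median(X, axis=0)
    factors = np.ones(X.shape[0])
    for i, spectrum in enumerate(X):
        # Use only variables that are present in both spectrum and reference.
        valid = (spectrum > 0) & (reference > 0)
        if valid.any():
            factors[i] = np.median(spectrum[valid] / reference[valid])
    return X / np.maximum(factors, eps)[:, None], factors

def log_transform(X, offset=1.0):
    """Log-transform intensities to stabilise variance; the offset avoids log(0)."""
    return np.log(np.asarray(X, dtype=float) + offset)
```

Using the median of the fold changes, rather than the total ion count, makes the scaling factor robust to a few variables that change strongly between spectra.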
As the main result of the project, we published a paper in PNAS, one of the highest-impact journals. The key approach of the paper was originally applied to 2D tissue sections and operates with limited user interaction, thus producing highly objective co-registrations. The approach is crucial for the application of statistical methods, as it permits pixel-wise annotation of a training set of different tissue types, thus allowing other MS data to be predicted from it. The 3D datasets generated as part of the project have been widely disseminated at conferences and seminars. The data processing workflow is currently being prepared for publication, along with consideration of the heterogeneity of the tissue itself. This is an exciting application of 3D MS imaging, as it can be used to demonstrate the changing nature of tissue types throughout the depth of a sample. This can subsequently be used in the context of the analysis of single tissue sections and what can be concluded from them without reference to the surrounding tissue.

WORK PACKAGE 7 “Efficient GPU implementation”

Summary of progress towards objectives
The two main computational challenges for MALDI-imaging are the large size of the data, which can reach dozens of GB, and the complexity of the algorithms required to analyse them. These challenges motivated the work done in WP7, where the main aim was to develop GPU implementations of the data- and compute-intensive algorithms applied to MALDI images.
In order to develop and evaluate statistical methods for analysing 3D MALDI imaging, SCiLS, UoB, EMBL and ICL defined and delivered to SagivTech selected algorithms developed in WP4, WP5, WP6 (along with the data sets) to implement and test on GPUs.

During the first half of the 3D-MASSOMICS project the GPU infrastructure was developed and work was done for the highly sensitive image peak picking algorithm (Task 7.2). In addition, GPU implementation of unsupervised methods was developed (Task 7.3) and preliminary work was done on GPU implementation of supervised methods (Task 7.4).
During the second half of the project there was a focused effort towards Milestone 3 in month 36 of the project: Efficient GPU Implementations. There were several discussions among the partners: SagivTech, SCiLS, UoB, EMBL and ICL to decide on the algorithms that require compute acceleration and how to efficiently handle the data within the MALDI analysis pipeline. SagivTech has collaborated with the other partners to understand their compute needs and assess the suitability of the algorithms and use cases to the GPU environment.
Relying on the work already done in the first half of the project, we could evaluate where further GPU acceleration was needed, where the bottlenecks in the data processing lay, and where the use of more than a single GPU made sense. Following the discussions, the partners agreed that the most important tasks with respect to efficient GPU implementation and towards achieving the goal of Milestone 3 were as follows:
1. Further explore efficient GPU implementations for unsupervised methods, such as: Hierarchical Clustering, PCA using SVD and PLSA.
2. Further explore efficient GPU implementations for supervised methods, such as: NIPALS and SVM.
3. Assess the capability for further compute acceleration by moving to a multi GPU system for specific use cases, e.g. MOC.
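For readers unfamiliar with the phrase “PCA using SVD” in the list above, the following is a minimal CPU sketch in NumPy. It only shows the linear algebra that a GPU implementation would accelerate; it is not the implementation delivered in WP7, and the function name and return values are assumptions.

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA of a data matrix X (spectra x m/z bins) via singular value decomposition."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean  # centre each variable
    # Thin SVD: Xc = U @ diag(S) @ Vt, singular values in descending order.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # principal axes (loadings)
    scores = U[:, :n_components] * S[:n_components]  # projections of the spectra
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance, mean
```

The decomposition itself dominates the cost for large imaging datasets, which is why it is a natural candidate for GPU acceleration.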
As ICL informed SagivTech that GPU acceleration of further supervised methods (beyond NIPALS and SVM) was not required, SagivTech discussed with the partners SCiLS, EMBL, UoB and ICL other directions involving GPU computing that could be of interest to the 3D-MASSOMICS consortium. At the consortium meeting in Heidelberg in November 2014, SagivTech suggested exploring Deep Learning methods for MALDI imaging. It was agreed at that meeting that SagivTech, SCiLS and EMBL would explore use cases for evaluating Deep Learning methods for MALDI imaging.

Deep Learning is an emerging field in machine learning in general, and in signal and image analysis specifically. It involves the use of complex, multi-level “deep” neural networks for both supervised and unsupervised machine learning tasks. For example, voice recognition (speech to text) in mobile devices today is accomplished via Deep Learning networks. It is also the technology behind most image search engines today and is at the forefront of current research both in academia and in large corporations such as Facebook, Google and Microsoft.

It is important to state that in many fields there is still a shortage of training data that makes application of Deep Learning difficult. This is also true for MALDI imaging.

SagivTech identified Deep Learning as an important tool for further innovation in 3D-MASSOMICS research, where it has a competitive advantage as a knowledge center for GPU computing, machine learning and computer vision. Moreover, this technique is relatively new to the field of MALDI imaging: a literature survey found no published works on the application of Deep Learning methodologies to MS.

Therefore, in addition to the task of migrating traditional machine learning methods to GPU platforms, SagivTech, together with SCiLS and EMBL, did research on the application of Deep Learning to MALDI images where the main work was done in the following directions:
a. Building Deep Learning know-how and infrastructure.
b. Assessing Deep Learning methods for MALDI imaging with a use case of spectra selection.

In addition, it was agreed to carry out exploratory research in two directions: assessing Deep Learning for unsupervised analysis of MALDI imaging data, with an autoencoder as an alternative approach to dimensionality reduction, and assessing Deep Learning for supervised analysis, with classification as an alternative to SVM.

Significant results
MALDI-imaging offers a rich source of information for analysis and research. As advances in acquisition methods provide more information and analysis algorithms become more sophisticated, fast computation methods become critical. MALDI-imaging has become a Big Data problem in the broad sense, and data transfer and analysis threaten to become major bottlenecks. In this work package we worked to offer biologists, algorithm developers and clinicians fast implementations of MALDI-imaging algorithms, and to offer a solution to the Big Data bottleneck. The main result of the work done in WP7 is the GPU implementation of machine learning algorithms that run on MALDI data. Milestone 3 was accomplished with the submission of Deliverable 7.1 at the end of the project. This is an important result since the amounts of MALDI data are expected to continue to increase.

We brought to the attention of the chemists and biologists in the project the immense compute capabilities that GPUs can offer and, together with SCiLS, UoB, EMBL and ICL, enabled a faster working environment that is also user-friendly for scientists who are not experts in computer science.

In addition to providing the GPU implementations for Milestone 3, we conducted state-of-the-art research on the application of Deep Learning methodologies to MALDI imaging. We obtained very promising results for a Deep Learning-based measure of structure. It seems worthwhile to further explore Deep Learning in the context of MALDI imaging and to generate appropriate databases for this purpose.

WORK PACKAGE 8 “Scientific Coordination”
Summary of progress towards objectives
The project started on 01.11.2012, but because the notification was delayed and arrived unexpectedly only on 05.11.2012 (see Task 8.1), not all partners could hire staff in time. To minimize the negative effects of this unsynchronized start, channels for intensive communication were established at the beginning, including small-group meetings by phone. These allowed the consortium to integrate the activities in the first reporting period, achieve all objectives, and complete the work as planned with only minor deviations.
The scientific coordination in the second reporting period was targeted towards successful achievement of the project objectives. Particular attention was paid towards organisation of continuous scientific coordination during the process of the move of the Coordinator from UoB to EMBL in October 2014 and towards organisation of scientific work of the Coordinator team at the new place at EMBL.
The Work Package Leaders coordinated the activities and monitored progress within their work packages. The Steering Board met every six months during project meetings, in Bremen (kick-off meeting), Uppsala (2nd meeting), San Diego (3rd meeting), Heidelberg (4th meeting), Rennes (5th meeting), and Pisa (concluding meeting after the final symposium). To prepare the consortium for the end of the project, ERS provided a roadmap and the necessary templates. The Advisory Board was closely involved in the project. The Coordinator took care of the communication with the European Union.
Significant results
All three milestones (“Synthetic data, statistical analysis of variation, preprocessing methods”, month 12; “New methods for statistical analysis”, month 24; “Efficient GPU implementation and expert evaluation”, month 36) were successfully achieved. For the first milestone, the first-level evaluation of its achievement was performed by two members of the Advisory Board (Olga Vitek, Christian Barillot). The second evaluation was performed by the scientific community through publications, presentations, and posters resulting from the work on the project. Altogether, 13 journal publications were published in 2013. The third evaluation was performed by project members during the third project meeting in San Diego (in the 15th month of the project). For the 2nd and 3rd milestones, the first-level evaluation was performed by the Advisory Board experts Axel Walch and Olga Vitek. For the 2nd milestone, the Coordinator met with Axel Walch at the MSACL EU’14 conference in Salzburg in September 2014 and with Olga Vitek at the US HUPO’14 conference in March 2015. For the 3rd milestone, both Axel Walch and Olga Vitek attended OurCon3, the final symposium of the project. The second evaluation was performed by the scientific community through publications, presentations, and posters resulting from the work on the project. Altogether, 14 journal publications were published in the second reporting period. The results of the project were presented at the final symposium of the project, organized together with the European BMBS COST action BM1104 on Mass Spectrometry Imaging (coordinated by Liam McDonnell) as the OurCon3 conference and attended by 197 participants.

The third evaluation was performed by the project partners during the final project meeting in Pisa after the final symposium (Oct 2015).
Specific activities that contributed to the achievement of the 1st milestone:
WP1: A simulator was developed; synthetic datasets were simulated and uploaded to the project FTP. Evaluation strategies were developed.
WP2: Repeated measurements were collected. The samples for applications (microbial imaging, diabetes, and surgical metabolomics) were obtained and the data were collected.
WP3: Statistical approach to analysis of MALDI-IMS variation was developed.
WP4: Pre-processing methods were developed. The highly sensitive peak picking was compared with conventional peak picking; the results were published in the Bioinformatics journal.
WP5: Comparison of component analysis methods was performed.
WP6: Supervised methods from chemometrics were adapted to MALDI-IMS.
WP7: A basic GPU library was delivered to all partners; the preprocessing methods were implemented on GPUs.
Specific activities that contributed to the achievement of the 2nd and 3rd milestones:
WP2: 3D MALDI- and DESI- imaging data was collected and published open-access at MetaboLights and in the GigaScience journal. A new workflow for 3D surface imaging was developed, presented at many conferences and published in PNAS, highlighted in more than 20 media, with new software `ili published open-source. SCiLS Lab was evaluated by many partners, in particular for the new type of data, MALDI-FTICR-imaging. Several publications are in preparation.
WP3: Sources of variation were studied. Quality measures of mass spectrometry images were developed, evaluated by the community (97 experts), and published open-access in Bioinformatics.
WP5: Spatial segmentation was compared to the component analysis methods, implemented in the SCiLS Lab 3D software, both for the MALDI-TOF and the new MALDI-FTICR imaging data.
WP6: New supervised methods were developed. The implementations from Tasks 6.1 and 6.2 were optimized. The results were published in PNAS and Chem Comm.
WP7: GPU implementations were developed and provided to project partners. In addition, Deep Learning, a machine learning technique of high interest and potential, was evaluated.
WP8: Scientific coordination was performed continuously. Project meetings were organized as planned. The second period report and the final report were prepared.
WP9: 14 journal publications were published, with numerous conference presentations and posters. The final symposium attracted 197 participants. An intermediate open workshop attracted 50 participants. The results of 3D-MASSOMICS enabled a new H2020 European project. The open-source software app `ili was released.
WP10: Non-scientific management was carried out to support all WPs.

WORK PACKAGE 9 “Dissemination & Training”
Summary of progress towards objectives
Work package 9 deals with the dissemination and training activities of 3D-MASSOMICS.
The aim of these activities is twofold: the first involves transfer of knowledge between the project’s partners. 3D-MASSOMICS involves deep expertise in several domains, and each partner has a strong knowledge base in different aspects of the project. To enhance the professional communication between the partners and advance the execution of the project, the second and third project meetings were accompanied by dedicated workshops on topics such as heat-stabilized tissue in MALDI-imaging, GPU computing and the parallel way of thinking, usage of software for analysis, and a hands-on workshop on 3D imaging of microbial colonies.
The first reporting period
The project partners organized the EuPA 2013 3D Imaging Mass Spectrometry workshop. 16 journal publications were published, and 3D-MASSOMICS was disseminated on numerous occasions. There is an open channel of communication with end users and clinicians, and the coordinator is leading the effort to discuss use and exploitation of the results. The project’s website was set up and maintained on a weekly basis.
The second reporting period
External training aims at clinical labs. Two open workshops were held during 2014. A 1-day workshop on MALDI Imaging Lab Core Facility was conducted by UoB in Bremen, and a 1-day seminar was organized by UoB together with UR1 and SCiLS just prior to the 2014 HUPO conference in Madrid. Denator gave a well-attended presentation about heat stabilization and its application to MALDI imaging during the Advanced Imaging Mass Spectrometry (AIMS) workshop at Vanderbilt University in Nashville, TN, USA. SCiLS organized a workshop at the Bruker Day in Kassel (March 2015) and gave training on using SCiLS Lab software at various research institutes in Europe. In addition, as a follow-up activity to the 3D-MASSOMICS project, UR1 is organizing the 1st Workshop on Imaging Mass Spectrometry (WIMS) - a practical course and hands-on training for experts and beginners. This event is scheduled for March 2016.
3D-MASSOMICS was well disseminated on several diverse occasions, from MS-focused events to bioinformatics and computational conferences. The OurCon3 conference took place as the joint final symposium of 3D-MASSOMICS and the European BMBS COST action BM1104 on Mass Spectrometry Imaging. OurCon3 attracted considerable attention within the imaging MS community, especially its European part, with 197 participants, two special sessions organized by 3D-MASSOMICS, speakers from the project and the AB, and a dedicated talk summarizing the results of the project.
EMBL has developed `ili, an open-source Google Chrome App for molecular mapping in 2D and 3D (https://github.com/ili-toolbox/ili) that is freely available on Google Web Store and as of 15.12.2015 already had 98 users. The novel NMF approach incorporating spatial and/or spectral priors, developed in Task 5.2 and published in [Bartels et al., Inverse Problems (2013)], is scheduled for integration into future versions of SCiLS Lab 3D. The project partner Steinbeis Center SCiLS Research has started discussions and negotiations with SCiLS GmbH (Bremen, Germany) on the roadmap to achieve this.
SagivTech, together with SCiLS, is assessing the commercial potential of the GPU-based library (Deliverable 7.1) and is further exploring the commercial potential of Deep Learning-based approaches for MALDI-imaging.

Significant results
1. The final project symposium OurCon3 (http://ourcon.org) was organized jointly with the European BMBS COST action BM1104 and attracted 197 participants.
2. Our workshop on 3D imaging MS at EuPA’13 attracted 40 participants.
3. Our workshop on 3D imaging MS at HUPO’14 attracted 50 participants.
4. 29 peer-reviewed journal publications were published, 10 of them open-access.
5. 1 proceedings publication was published.
6. 75 oral presentations were given and 28 posters were presented; 2 oral presentations and 1 poster presentation are expected after the project ends.
7. A new European H2020 project METASPACE (coordinated by EMBL) was enabled in particular through results and collaboration established in 3D-MASSOMICS; the results of 3D-MASSOMICS will be exploited in METASPACE.
8. EMBL released `ili, an open-source Google Chrome App freely available on the Google Web Store with 98 users since the release in October 2015.
9. The project partner Steinbeis Center SCiLS Research started negotiations with SCiLS GmbH (Bremen, Germany) to provide the developed methods and implementations to potential users as a part of commercially available package SCiLS Lab 3D.
10. The industrial partners (SCiLS, Denator, SagivTech) are actively exploring ways of exploiting the results of 3D-MASSOMICS.

Peer-reviewed journal publications
Of the 29 publications overall (see the list in the report on WP9), 5 received special recognition or were highlighted:
• Watrous et al. (2013) ISME J was awarded “F1000 Prime Recommended” title by F1000Research, an open access scientific journal covering the life sciences
• Veselkov et al. (2014) ChemComm was a feature and cover article of the journal
• Veselkov et al. (2014) PNAS was highlighted in Practical Patient Care.
• Oetjen et al. (2015) GigaScience was in the list of the Top-20 Most Viewed publications in the journal between June and October 2015.
• Bouslimani et al. (2015) PNAS was highlighted in Nature, PNAS, CNN, Der Spiegel, Wired, Scientific American, NIH Director’s Blog, ScienceWorld, Health, FOCUS Online, Die Welt, Chemical & Engineering News (ACS), Business Insider, MedGadget, Profil, Boston Globe, Shape Magazine, Eos magazine, Refinery29, CosmeticsDesign-Europe.com, Fast Co.Exist, Business Standard (India), and ZeeNews (India); figures from this paper were selected for various art and science exhibitions in Europe and for the Image of the Year at Nature.

[1] Song C, Mazzola M, Cheng X, Oetjen J, Alexandrov T, Dorrestein P, Watrous J, van der Voort M & Raaijmakers JM (2015) Molecular and chemical dialogues in bacteria-protozoa interactions. Scientific Reports, 5, 12837 [open-access]
[2] Bouslimani A, Porto C, Rath CM, Wang M, Guo Y, Gonzalez A, Berg-Lyon D, Ackermann G, Christensen GJM, Teruaki N, Zhang L, Borkowski AW, Meehan MJ, Dorrestein K, Gallo RL, Bandeira N, Knight R, Alexandrov T, Dorrestein P (2015) 3D molecular cartography of the human skin surface. PNAS, E2120-E2129 [open-access]
[3] Oetjen J, Veselkov K, Watrous J, McKenzie JS, Becker M, Hauberg-Lotte L, Strittmatter N, Mróz AK, Hoffmann F, Trede D, Kobarg JH, Palmer A, Schiffler S, Steinhorst K, Aichler M, Goldin R, Guntinas-Lichius O, von Eggeling F, Thiele H, Maedler K, Walch A, Maass P, Dorrestein P, Takats Z, Alexandrov T (2015) Benchmark datasets for 3D MALDI- and DESI-Imaging Mass Spectrometry. GigaScience, accepted [open-access]
[4] Palmer A, Ovchinnikova E, Thune M, Lavigne R, Guevel B, Dyatlov A, Vitek O, Pineau C, Boren M, Alexandrov T (2015) Using collective expert judgements to evaluate quality measures of mass spectrometry images. Bioinformatics, 31(12), i375-i384 [open-access]
[5] Palmer A, Alexandrov T (2015) Serial 3D imaging mass spectrometry at its tipping point. Analytical Chemistry, 87(8), 4055-4062
[6] Klein O, Strohschein K, Nebrich G, Oetjen J, Trede D, Thiele H, Alexandrov T, Giavalisco P, Duda GN, von Roth P, Geissler S, Klose J, Winkler T. MALDI-imaging mass spectrometry: Discrimination of pathophysiological regions in traumatized skeletal muscle by characteristic peptide signatures. Proteomics, Vol. 14 (21-22), 2630, 2014
[7] Krasny Lukas, Hoffmann Franziska, Ernst Guenther, Trede Dennis, Alexandrov Theodore, Havlicek Vladimir, Guntinas-Lichius Orlando, von Eggeling Ferdinand, Crecelius Anna C., Spatial segmentation of MALDI FT-ICR MSI data: A powerful tool to explore head and neck tumor in situ lipidome, Journal of The American Society for Mass Spectrometry, 26(1), 36-43, January 2015.
[8] Diehl Hanna C., Beine Birte, Elm Julian, Trede Dennis, Ahrens Maike, Eisenacher Martin, Marcus Katrin, Meyer Helmut E. and Henkel Corinna, The challenge of on-tissue digestion for MALDI-MSI – a comparison of different protocols to improve imaging experiments, Analytical and Bioanalytical Chemistry, 407(8), 2223-2243, March 2015.
[9] Lagarrigue et al., Localization and in situ absolute quantification of chlordecone in the mouse liver by MALDI Imaging, Analytical Chemistry, 86, 5775-5783, 2014.
[10] Golf O, Muirhead LJ, Speller A, Balog J, Abbassi-Ghadi N, Kumar S, Mroz A, Takats Z and Veselkov K *, 2015, XMS: Cross-Platform Normalization Method for Multimodal Mass Spectrometric Tissue Profiling, Journal of American Society of Mass Spectrometry, Vol: 26, Pages: 44-54, ISSN: 1044-0305 * Corresponding Author.
[11] Guenther S, Muirhead LJ, Speller AVM, Golf O, Strittmatter N, Ramakrishnan R, Goldin RD, Jones E, Veselkov K, Nicholson J, Darzi A, Takats Z, 2015, Spatially Resolved Metabolic Phenotyping of Breast Cancer by Desorption Electrospray Ionization Mass Spectrometry, Cancer Research, Vol: 75, 1828-1837, ISSN: 0008-5472
[12] Kumar S, Huang J, Abbassi-Ghadi N, Mackenzie HA, Veselkov KA, Hoare JM, Lovat LB, Španěl P, Smith D, Hanna GB, 2015, Mass Spectrometric Analysis of Exhaled Breath for the Identification of Volatile Organic Compound Biomarkers in Esophageal and Gastric Adenocarcinoma, Annals of Surgery.
[13] Veselkov KA* and Abbassi-Ghadi N*, Kumar S, Huang J, Jones E, Strittmatter N, Kudo H, Goldin R, Takats Z and Hanna GB (2014), Discrimination of lymph node metastases using desorption electrospray ionisation-mass spectrometry imaging, Chemical Communications, 50(28), 3661-3664.
[14] Faouder JF, Laouirem S, Alexandrov T, Ben-Harzallah S, Albuquerque M, Bedossa P, Paradis V (2014) Tumoral heterogeneity of intrahepatic cholangiocarcinomas revealed by MALDI imaging mass spectrometry. Proteomics, 14(7-8), 965-972, 2014.
[15] Ernst G, Guntinas-Lichius O, Hauberg-Lotte L, Trede D, Becker M, Alexandrov T, von Eggeling F (2014) Histomolecular interpretation of pleomorphic adenomas of the salivary gland by MALDI imaging and spatial segmentation. Head & Neck, 37(7), 1014-1021, 2015.
[16] Hoelscher D, Dhakshinamoorthy S, Alexandrov A, Becker M, Bretschneider T, Buerkert A, Crecelius AC, De Waele D, Elsen A, Heckel DG, Heklau H, Hertweck H, Kai M, Knop K, Krafft C, Maddula RK, Matthaeus C, Popp J, Schneider B, Schubert US, Sikora RA, Svatos A, Swennen RL (2013) Phenalenone-type phytoalexins mediate resistance of banana plants (Musa spp.) to the burrowing nematode Radopholus similis. PNAS, 111(1),105-110 [open-access]
[17] De Ridder J, Bromberg, Y, Michaut M, Satagopam VP, Corpas M, Macintyre G, Alexandrov T (2013) The young PI buzz: Learning from the organizers of the Junior Principal Investigator Meeting at ISMB-ECCB 2013. PLoS Computational Biology 9(11): e1003350 [open-access]
[18] Alexandrov T, Bourne PE (2013) Learning how to run a lab: Interviews with principal investigators. PLoS Computational Biology 9(11): e1003349 [open-access]
[19] Bartels A, Duelk P, Trede D, Alexandrov T, Maass P (2013) Compressed sensing in imaging mass spectrometry. Inverse Problems, 29(12), 125015-125039
[20] Alexandrov T, Chernyavsky I, Becker M, von Eggeling F, Nikolenko S (2013) Analysis and interpretation of imaging mass spectrometry data by clustering mass-to-charge images according to their spatial similarity. Analytical Chemistry, 85(23), 11189-11195.
[21] Alexandrov T, and Bartels A (2013) Testing for presence of known and unknown molecules in imaging mass spectrometry. Bioinformatics, 29(18), 2335-2342 [open-access]
[22] Alexandrov T, and Lasch P (2013) Segmentation of Confocal Raman Microspectroscopic Imaging Data using Edge-Preserving Denoising and Clustering. Analytical Chemistry, 85(12), 5676–5683
[23] Rath CM, Yang JY, Alexandrov T, and Dorrestein PC (2013) Data-independent microbial metabolomics with ambient ionization mass spectrometry. Journal of American Society for Mass Spectrometry, 24(8),1167-1176.
[24] Pote N, Alexandrov T, Lefaouder J, Laouirem S, Léger T, Mebarki M, Belghiti J, Camadro JM, Bedossa P, and Paradis V (2013) Imaging mass spectrometry reveals modified forms of Histone H4 as new biomarkers of microvascular invasion in hepatocellular carcinomas. Hepatology, 58(3), 983-994, September 2013.
[25] Watrous J, Roach P, Heath B, Alexandrov T, Laskin J, Dorrestein PC (2013) Metabolic profiling directly from the Petri dish using nanoDESI imaging mass spectrometry. Analytical Chemistry, 2013, 85(21), 10385-10391.
[26] Oetjen J, Aichler M, Trede D, Strehlow J, Berger J, Heldmann S, Becker M, Gottschalk M, Kobarg JH, Wirtz S, Schiffler S, Thiele H, Walch A, Maass P, and Alexandrov T (2013) MRI-compatible pipeline for three-dimensional MALDI imaging mass spectrometry using PAXgene fixation. Journal of Proteomics, 90, 2 September 2013, 52-60.
[27] Traxler MF, Watrous JD, Alexandrov T, Dorrestein PC, Kolter R (2013) Interspecies interactions stimulate diversification of the Streptomyces coelicolor secreted metabolome. mBio, 4(4):e00459-13, [open-access]
[28] Watrous JD, Phelan W, Hsu CC, Moree WJ, Duggan BM, Alexandrov T, and Dorrestein PC (2013) Microbial metabolic exchange in 3D. ISME J., 7(4),770-780 doi:10.1038/ismej.2012.155. [open-access]
[29] Bouslimani A, Sanchez LM, Garg N, Dorrestein PC (2014) Mass spectrometry of natural products: current, emerging and future technologies. Natural Product Reports, 31(6), 718-729.

WORK PACKAGE 10 “Non-Scientific Project Management”
Summary of progress towards objectives
The first reporting period was concluded successfully. The November start date caused a slow start of the project, but the involved beneficiaries have fulfilled their roles and commitments, and the project is very dynamic and well on track. The management structure and procedures worked well. There were no significant issues and no significant deviations from the work plan.
The second period was concluded successfully. The work plan was amended due to the addition of EMBL as a new beneficiary and the change of coordinator from UoB to EMBL. No other significant deviation in the project management can be reported. All objectives have been achieved.
Significant results
The management structure and procedures worked well. The involved beneficiaries have fulfilled their roles and commitments.

UoB, EMBL, and SCiLS have set up the project website http://3D-MASSOMICS.eu which has been kept up-to-date.

Potential Impact:
Summary on the project impact
The long-term overarching aim of the project was to make 3D label-free proteomics and metabolomics possible and practical by providing necessary computational tools to a biochemist, biologist, or clinical chemist to help them make sense out of big 3D imaging MS data.
We believe that we have achieved this ambitious overarching aim by developing algorithms, evaluating them, implementing the best ones in a form available to biologists through the software SCiLS Lab 3D and `ili, and demonstrating them in case studies. Whereas in 2011 only a few advanced labs had the expertise and resources to perform 3D imaging MS studies, we are now seeing a rapid increase in the number of such studies all across the world. During OurCon3, the final symposium of 3D-MASSOMICS, 3D imaging MS was one of the central topics. The results of the project were of the highest interest to many participants of the symposium.

Expected strategic impact of 3D-MASSOMICS, compared to our vision from 2011
In the following, we reflect on the aims for the strategic project impact as they were formulated in the original proposal in 2011. This puts our results and achievements in context and, by looking back to 2011, allows us to better see how 3D-MASSOMICS will change the future.
By filling the empty computational niche, the 3D-MASSOMICS project will establish a leading role of the European research community in computational 3D IMS, and will multiply the influence of European research on the fast-growing MALDI-IMS-loyal biomedical community. (From the proposal, 2011.)
This aim was achieved with great success. Through various dissemination activities, in particular 2 workshops on 3D imaging MS in St. Malo, France, and Madrid, Spain, and 2 dedicated sessions at the final symposium in Pisa, Italy, 3D-MASSOMICS has created a European center of gravity in the field of 3D imaging MS. The methods developed in the project will be integrated into the software SCiLS Lab 3D, made in Germany, which is currently the only complete professional solution for 3D imaging MS. The involvement of a partner from the USA (UCSD) has allowed us to disseminate our results in the USA and to involve a large part of the USA-centered community. The collaboration with the European Bioinformatics Institute and GigaScience on publishing the first open-access 3D imaging MS datasets has opened this type of data to bioinformaticians and statisticians.
We will provide methods for the analysis and interpretation of large 3D MALDI imaging mass spectrometry data enabling unlabelled metabolomics, lipidomics, and proteomics analysis in 3D. (From the proposal, 2011.)
We have developed or adapted, evaluated, and compared numerous computational methods for pre-processing (normalization), unsupervised analysis (including spatial segmentation, component analysis, spatially-aware and Poisson NMF and PLSA, measure of spatial chaos), and supervised analysis (SVM, spatially aware supervised classification). We have provided them to the community as open-source prototype implementations and as an open-source Google Chrome app (`ili, https://github.com/ili-toolbox/ili). The project partner Steinbeis Center SCiLS Research is working together with SCiLS GmbH (Bremen, Germany) on a roadmap to provide the developed methods and implementations to potential users as part of the commercially available package SCiLS Lab 3D. We demonstrated them in several case studies, including microbial metabolomics and proteomics. The methods were effectively disseminated, as confirmed by the high numbers of citations of our publications (among the most cited: 65 citations of Traxler et al. (2013) mBio, 27 citations of Veselkov et al. (2014) PNAS, 29 citations of Watrous et al. (2013) ISME J, 15 citations of Bouslimani et al. (2015) PNAS), the number of views of selected publications (the paper by Oetjen et al. was viewed more than 5000 times since its publication in May 2015), and the highlights our publications received in the media (Nature, PNAS, CNN, Der Spiegel, Wired, Scientific American, NIH Director’s Blog, ScienceWorld, Health, FOCUS Online, Die Welt, Chemical & Engineering News (ACS), Business Insider, MedGadget, Profil, Boston Globe, Shape Magazine, Eos magazine, Refinery29, CosmeticsDesign-Europe.com, Fast Co.Exist, Business Standard (India), and ZeeNews (India)).
We will introduce a statistical analysis of variation approach into MALDI-IMS field and will apply it for evaluation of mixed variation effects and for developing reproducible protocols for data acquisition by optimizing the sample preparation and acquisition steps in a statistically sound manner. (From the proposal, 2011.)
This aim was not achieved with the same level of success. On the one hand, we have developed the framework for studying the mixed variation effects as a part of the publication by Palmer et al. (2015), with the preliminary results shown by Oetjen et al. as a poster at OurCon3 in Pisa, Italy, in October 2015. On the other hand, application of this framework showed that the factors with the most impact on variation are sometimes not those we expect. In our study, the laser size had the most impact on variation, which was unexpected. This creates the need for a larger-scale data collection with more parameters varied before one can provide specific SOPs, which was beyond the scope of the project. However, the work by Oetjen et al. is going to be continued in a collaboration of UoB and SCiLS with Bruker Daltonics, since these organizations have a strong interest in applying the statistical framework developed in the project to formulate better protocols and ultimately the SOPs.
We will address the need created by proof-of-principle demonstrations of 3D MALDI-IMS and by the potential of 3D imaging in biomedical applications. (From the proposal, 2011.)
We have addressed this need by providing the methods and software packages. In the project, we had a relatively high percentage of publications in general journals (3; PNAS), medical journals (3; Cancer Research, Annals of Surgery, Head and Neck), and biological journals (2; mBio, ISME J). These publications show the scientific community avenues for applying the methods developed in this project in biomedical research.
We will address problems posed by our biomedical partners, develop easy-to-use methods, provide efficient GPU implementation for routine use, and integrate them into biomedical research. (From the proposal, 2011.)
The biomedical researchers, algorithm developers, and GPU computing experts jointly evaluated the need for compute acceleration of the machine learning algorithms developed in the project. The selected algorithms were submitted for GPU migration along with the accompanying data sets. These algorithms include the highly sensitive image peak picking algorithm, unsupervised methods such as hierarchical clustering, PCA using SVD, and PLSA, and supervised methods including NIPALS and SVM. The GPU code developed was delivered to the project partners, and the impact of the compute acceleration was evaluated. The GPU acceleration reached 1-2 orders of magnitude (10-100 times) depending on the algorithm and data structure. Research was also done on the usage of a multi-GPU system. The application programming interface (API) and data input/output were designed for ease of use by the project partners. In addition to migrating traditional machine learning methods to GPU platforms, the partners conducted state-of-the-art research on the application of deep learning methodologies to MALDI imaging.
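To illustrate one of the unsupervised methods named above, PCA using SVD, the following is a minimal CPU-only sketch in Python with NumPy. It is not the GPU code developed in the project: the function name pca_svd, the random toy data, and the layout of the input as a pixels-by-m/z-bins matrix are assumptions made purely for illustration.

```python
import numpy as np


def pca_svd(spectra, n_components):
    """PCA of a (pixels x m/z bins) intensity matrix via a thin SVD.

    Hypothetical helper for illustration; not part of any project library.
    """
    X = np.asarray(spectra, dtype=float)
    # Center each m/z bin (column) so that the SVD captures variance.
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Scores are the projections of the pixels onto the principal axes.
    scores = U[:, :n_components] * s[:n_components]
    # Loadings are the principal axes (rows of Vt), one per component.
    loadings = Vt[:n_components]
    # Fraction of total variance explained by each kept component.
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained


# Toy data (an assumption): 50 pixels, 8 m/z bins, random intensities.
rng = np.random.default_rng(0)
spectra = rng.random((50, 8))
scores, loadings, explained = pca_svd(spectra, n_components=2)
print(scores.shape, loadings.shape)
```

Each column of the scores can be reshaped back to the pixel grid to give a component image. For matrices with many pixels and m/z bins, the SVD step dominates the run time, which is one plausible reason why such routines are candidates for GPU acceleration.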

Summary of the main dissemination activities
1. The final project symposium OurCon3 (http://ourcon.org) was organized jointly with the European BMBS COST action BM1104 and attracted 197 participants.
2. Our workshop on 3D imaging MS at EuPA’13 attracted 40 participants.
3. Our workshop on 3D imaging MS at HUPO’14 attracted 50 participants.
4. 29 peer-reviewed journal publications were published, 10 of them open-access.
5. 1 proceedings publication was published.
6. 75 oral presentations were given and 28 posters were presented; 2 oral presentations and 1 poster presentation are expected after the project ends.
7. EMBL released `ili, an open-source Google Chrome app freely available through the Chrome Web Store.
8. The project partner Steinbeis Center SCiLS Research is working together with SCiLS GmbH (Bremen, Germany) on a roadmap to provide the developed methods and implementations to potential users as part of the commercially available package SCiLS Lab 3D.
9. The industrial partners (SCiLS, Denator, SagivTech) are actively exploring ways of exploiting the results of 3D-MASSOMICS.

Peer-reviewed journal publications
Of the 29 publications overall (see the list in the report on WP10), 5 received special recognition or were highlighted:
• Watrous et al. (2013) ISME J was awarded the “F1000 Prime Recommended” title by F1000Research, an open access scientific journal covering the life sciences.
• Veselkov et al. (2014) ChemComm was a feature and cover article of the journal.
• Veselkov et al. (2014) PNAS was highlighted in Practical Patient Care.
• Oetjen et al. (2015) GigaScience was in the list of the Top-20 Most Viewed publications in the journal between June and October 2015.
• Bouslimani et al. (2015) PNAS was highlighted in Nature, PNAS, CNN, Der Spiegel, Wired, Scientific American, NIH Director’s Blog, ScienceWorld, Health, FOCUS Online, Die Welt, Chemical & Engineering News (ACS), Business Insider, MedGadget, Profil, Boston Globe, Shape Magazine, Eos magazine, Refinery29, CosmeticsDesign-Europe.com, Fast Co.Exist, Business Standard (India), and ZeeNews (India); figures from this paper were selected for various art and science exhibitions in Europe and for the Image of the Year at Nature.

Exploitation of the project results
3D-MASSOMICS generated a large body of results, such as novel methods, reports on successful applications of these methods, and software that implements the methods and makes them available outside of the project. The following specific directions of exploitation of the project results are either already active or can be foreseen:
• A new European H2020 project METASPACE (coordinated by EMBL, started in July 2015) was enabled in particular through results and collaboration established in 3D-MASSOMICS; the results of 3D-MASSOMICS will be exploited in METASPACE.
• The project partner Steinbeis Center SCiLS Research is working together with SCiLS GmbH (Bremen, Germany) on a roadmap to provide the developed methods and implementations to potential users as part of the commercially available package SCiLS Lab 3D.
• The open-source Google Chrome app `ili will be further developed by EMBL as a tool for 3D surface visualization, as well as for other types of visualization. An open training on `ili is planned for 2016 by EMBL and UCSD together.
• UR1 is organizing the 1st Workshop on Imaging Mass Spectrometry (WIMS) in March 2016 in St. Malo, France, as a practical course and hands-on training on imaging MS.
• UoB and SCiLS started collaborating with Bruker Daltonics (a leading vendor of imaging MS equipment) along two directions: 1) using the framework for optimizing parameters of data acquisition based on the statistical analysis of variation, and 2) using the newly developed mass spectrometer rapiflex (Bruker) for 3D MALDI-imaging MS. The rapiflex has an innovative MALDI source allowing data acquisition 20 times faster than before (40 pixels per second), which makes data acquisition for 3D MALDI-imaging substantially faster (overnight instead of several weeks).
• SagivTech is actively seeking opportunities for exploiting the know-how gained in the project and for commercialization of the developed library of the GPU-optimized algorithms.

List of Websites:
For more information, please visit our website http://3D-MASSOMICS.eu or contact the coordinator Theodore Alexandrov at theodore.alexandrov@embl.de or at EMBL, Meyerhofstr. 1, 69117 Heidelberg, Germany.

final1-3dm_final-report-31012016-final_full.pdf