Cancer is the second leading cause of death in Europe. New techniques that combine data from the genome, transcriptome and proteome can be used for cancer detection and for evaluating therapy, resulting in better diagnosis and more efficient treatment. To become widespread, these techniques require fast and intelligent processing, together with easy control of, and access to, analysis data and results. The project proposes to build a platform integrating the most recent advances in data processing, which will be validated by 3 European oncology hospitals that own very rich but under-exploited tumour and clinical databases. The resulting web-based platform will allow complex chains of data mining and data morphing treatments to be formalized in a re-usable, simple-to-operate form. To reach this goal, the project is organized into a definition phase, followed by a realization phase and concluded by an evaluation phase. This last phase prepares for future redeployment.
Cancer is the second leading cause of death in Europe. Molecular classification is starting to be used for cancer detection and for studies of response to therapy. The long-term goal is individualised treatment based on the molecular profiling of each patient's tumour. The project's objective is to contribute to this goal by defining, developing and validating a platform that:
1) supports the formalisation and re-use of experimentation methodologies (i.e. the workflow);
2) supports intelligent processing of very large volumes of heterogeneous data (i.e. the data flow);
3) supports a bi-directional exchange over the web with established worldwide gene information databases (i.e. the information flow);
4) is validated on a set of oncology-related scenarios.
Functions of this platform include the integration and use of clinical and biological data, very fast data processing, and the integration of existing and new data mining algorithms.
DESCRIPTION OF WORK
The work has been organised in 3 major phases. The first phase consists of defining the platform functions, starting from the end-user requirements. This phase also includes the creation of the data model for the experimentation. The realisation phase then ends with a prototype of the platform, which is used in the experimentation activities of the last phase.
The definition phase will result in the list of data sources that the project will address for experimentation. A cancer data model based on the data available in the 3 oncology centres of the consortium will be built, and a detailed architecture definition of the platform prototype will be produced.
The implementation phase results in a platform prototype supporting:
1) the modelling of medical and biological data, which differ in format, nature (clinical data, images, biological data), origin (local data, published data, public data) and volume;
2) the design of strongly optimised data treatments able to process large volumes of data very quickly;
3) the integration, through a plug-in mechanism, of analysis methods that take into account the diversity of the data and combine existing data mining algorithms (commercial and public domain), including statistical analysis, data clustering and other classification tools;
4) the definition and establishment of a user-friendly web-based user interface allowing health professionals to retrieve the information they need interactively and quickly.
The experimentation phase has two goals. The first is to validate the technical integration of tools and databases in the platform. The second is to validate the approach in oncology studies covering various cancers (such as breast, bladder, pancreatic and colon cancer), leading to the proposal of 'predictive diagnostic genes'.
Funding Scheme: CSC - Cost-sharing contracts
75794 Paris Cedex 16
91405 Orsay Cedex