e-Laboratory for Interdisciplinary Collaborative Research in Data Mining and Data-Intensive Sciences

Project description

Intelligent Content and Semantics
e-LICO offers a data mining lab to scientists struggling to analyse massive data spawned by high-throughput technologies

The goal of the e-LICO project is to build a virtual laboratory for interdisciplinary collaborative research in data mining and data-intensive sciences. The proposed e-lab comprises three layers: the e-science layer and the data mining layer form a generic knowledge discovery platform that can be adapted to different scientific domains by customizing the application layer. The project's overall research strategy can be summarized as the bottom-up construction of this three-tiered architecture.

The foundation of the e-science layer is a suite of open-source components developed by the University of Manchester (e.g. myGrid e-science platform, Taverna workflow editor); these components will be extended with tools for content creation (e.g. semantic annotation, ontology engineering) as well as mechanisms for multiple levels and modes of collaboration in experimental research.

The data mining layer is the distinctive core of e-LICO; it will provide a comprehensive set of data mining tools for multimedia data (structured records, text, images, signals). Standard tools will be complemented with preprocessing or learning algorithms developed specifically to address problems of data-intensive, knowledge-rich sciences, such as extremely high dimensionality and undersampling, learning from heterogeneous data, and incorporating prior knowledge into learning. Methodologically sound use of these tools will be ensured by a knowledge-driven, planner-based data mining assistant, which will rely on a data mining ontology to plan the data mining process and propose ranked workflows for a given application problem. Extensive e-lab monitoring facilities will support comparison and analysis of experiments by a meta-miner, which will combine probabilistic reasoning with kernel-based learning to incrementally improve the assistant's workflow recommendations.
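The idea of a planner that proposes ranked workflows can be illustrated with a toy sketch. This is not e-LICO's assistant or its ontology: the operator names, the data properties, the numeric costs, and the plan_workflows function are all hypothetical, and a fixed cost per operator stands in for the ranking that the meta-miner would learn from past experiments. The sketch only shows the general pattern of chaining operators whose preconditions are satisfied and then sorting the resulting workflows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A hypothetical data mining operator described by preconditions and effects."""
    name: str
    requires: frozenset              # properties the data must have before the step
    adds: frozenset                  # properties that hold after the step
    removes: frozenset = frozenset() # properties that no longer hold after the step
    cost: float = 1.0                # assumed stand-in for a learned ranking score


def plan_workflows(operators, initial_state, goal, max_steps=4):
    """Enumerate operator sequences that reach the goal, cheapest first."""
    results = []

    def search(state, path, cost):
        if goal <= state:                      # every goal property is satisfied
            results.append((cost, [op.name for op in path]))
            return
        if len(path) >= max_steps:
            return
        for op in operators:
            if op in path:                     # use each operator at most once
                continue
            if op.requires <= state:           # preconditions hold
                new_state = (state - op.removes) | op.adds
                search(new_state, path + [op], cost + op.cost)

    search(frozenset(initial_state), [], 0.0)
    return sorted(results)


# Hypothetical operator descriptions; a real system would derive these from an ontology.
OPERATORS = [
    Operator("impute_missing", frozenset({"table"}),
             adds=frozenset({"no_missing"}), removes=frozenset({"has_missing"}), cost=1.0),
    Operator("select_features", frozenset({"table", "no_missing"}),
             adds=frozenset({"low_dim"}), removes=frozenset({"high_dim"}), cost=2.0),
    Operator("train_svm", frozenset({"table", "no_missing"}),
             adds=frozenset({"model"}), cost=3.0),
    Operator("train_tree", frozenset({"table", "no_missing", "low_dim"}),
             adds=frozenset({"model"}), cost=1.5),
]

ranked = plan_workflows(
    OPERATORS,
    initial_state={"table", "has_missing", "high_dim"},
    goal=frozenset({"model"}),
)

for cost, steps in ranked:
    print(f"{cost:4.1f}  {' -> '.join(steps)}")
```

In this toy setting every valid workflow begins with imputation, because no other operator's preconditions hold on the raw data. Replacing the fixed costs with scores estimated from recorded experiments would be one way to approximate the feedback loop between monitoring, meta-mining and recommendation that the paragraph above describes.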

The application layer is always domain-specific. In the generic e-lab, the application layer is an empty shell. It is built by the domain user who will use the tools available in the e-science and DM layers to access available services and resources (e.g. knowledge bases, ontologies) or develop new ones; design, run and analyse data mining workflows; and semantically annotate experimental data as well as mined models in domain-specific terms.

The data mining e-lab will be showcased on a systems biology task: biomarker discovery and pathway modelling for diseases affecting the kidney and urinary pathways (KUP). Domain-specific knowledge sources, such as a specialized ontology and a database on KUP-related diseases, will be collaboratively authored by European specialists in the area. Multi-omic (e.g. genomic, transcriptomic, proteomic, metabolomic) data provided by biologists and clinicians gathered in COST Action BM0702 (EuroKUP) will be mined and the resulting diagnostic/prognostic models made available in a repository of data mining experiments.

The final deliverable of the project will be a free, experimental prototype open to continuous collaborative expansion and refinement by the research community.

The goal of the e-LICO project is to build a virtual laboratory for interdisciplinary collaborative research in data mining and data-intensive sciences. The proposed e-lab will comprise three layers: the e-science and data mining layers will form a generic research environment that can be adapted to different scientific domains by customizing the application layer. The e-science layer, built on an open-source e-science infrastructure developed by one of the partners, will support content creation through collaboration at multiple scales and degrees of commitment, ranging from small, contract-bound teams to voluntary, constraint-free participation in dynamic virtual communities. The data mining layer will be the distinctive core of e-LICO; it will provide comprehensive data mining tools for multimedia data (structured records, text, images, signals). Standard tools will be augmented with preprocessing or learning algorithms developed specifically to meet challenges of data-intensive, knowledge-rich sciences, such as ultra-high dimensionality or undersampled data. Methodologically sound use of these tools will be ensured by a knowledge-driven data mining assistant, which will rely on a data mining ontology and knowledge base to plan the mining process and propose ranked workflows for a given application problem. Extensive e-lab monitoring facilities will automate the accumulation of experimental meta-data to support replication and comparison of data mining experiments. These meta-data will be used by a meta-miner, which will combine probabilistic reasoning with kernel-based learning from complex structures to incrementally improve the assistant's workflow recommendations. e-LICO will be showcased in a systems biology task: biomarker discovery and molecular pathway modelling for diseases affecting the kidney and urinary pathways.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

FP7-ICT - Specific Programme "Cooperation": Information and communication technologies

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

ICT-2007.4.4 - Intelligent content and semantics (ICT-2007.4.4)

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

FP7-ICT-2007-3

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

CP - Collaborative project (generic)

Coordinator

UNIVERSITE DE GENEVE

EU contribution

€ 674 532,00

Address

RUE DU GENERAL DUFOUR 24
1211 Geneve
Switzerland

Region

Schweiz/Suisse/Svizzera Région lémanique Genève

Activity type

Higher or Secondary Education Establishments

Total cost

No data

Participants (9)

UNIVERSITAT ZURICH

EU contribution

€ 495 773,00

RAPIDMINER GMBH

Germany

EU contribution

€ 395 252,00

MEDICEL OY

Finland

EU contribution

€ 418 620,00

INSTITUT NATIONAL DE LA SANTE ET DE LA RECHERCHE MEDICALE

France

EU contribution

€ 179 877,00

ETHNIKO IDRYMA EREVNON

Greece

EU contribution

€ 280 481,00

RUDER BOSKOVIC INSTITUTE

Croatia

EU contribution

€ 144 840,00

POLITECHNIKA POZNANSKA

Poland

EU contribution

€ 126 200,00

INSTITUT JOZEF STEFAN

Slovenia

EU contribution

€ 206 363,00

THE UNIVERSITY OF MANCHESTER

United Kingdom

EU contribution

€ 495 465,00
