Objective
Data exchange and data publishing are inherent components of our interconnected world. Industrial companies outsource datasets to marketing and mining firms to support business intelligence; medical institutions exchange data collected in clinical experiments; academic institutions create repositories and share datasets to promote research collaboration. A common denominator in any data exchange is the 'transformation' of the original data, which usually results in 'distortion' of the data. While accurate and useful information can potentially be distilled from the original data, operations such as anonymization, rights protection and compression result in modified datasets that very seldom retain the mining capacity of their original source. This proposal seeks to address questions such as the following:
- How can we apply lossy compression to datasets and still guarantee that mining operations are not distorted?
- Is it possible to rights-protect datasets and provide assurances that doing so will not impair our ability to distill useful knowledge?
- To what extent can we resolve data anonymization issues and yet retain the mining capacity of the original dataset?
We will examine a fundamental and hard problem in the area of knowledge discovery: the delicate balance between data transformation and data utility under mining operations. The problem lies at the confluence of many areas, such as machine and statistical learning, information theory, data representation and optimization. We will focus on studying data transformation methods (compression, anonymization, rights protection) that guarantee the preservation of the salient dataset characteristics, such that the outcomes of data mining operations on the original dataset are retained as well as possible on the transformed dataset. We will investigate how graph-centric approaches, clustering, classification and visualization algorithms can be ported to work under the proposed mining-preservation paradigm. Additional research challenges i
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: https://op.europa.eu/en/web/eu-vocabularies/euroscivoc.
- natural sciences > computer and information sciences > data science > data mining
- natural sciences > computer and information sciences > data science > business intelligence
- natural sciences > computer and information sciences > data science > data exchange
Programme(s)
Multi-annual funding programmes that define the EU’s priorities for research and innovation.
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
ERC-2010-StG_20091028
Funding Scheme
Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.
Host institution
8803 RUESCHLIKON
Switzerland