CORDIS - EU research results
Provenance for Data-Intensive Systems

Project description

New tools to assess data reliability

Imagine a world where computation results are accounted for and explained: systems would be transparent and controllable, and the results credible and reusable. Data provenance, or provenance tracking, is the ability to trace information back to its original source and to assess the reliability of that information or source. The EU-funded ProDIS project will develop models, algorithms and tools that facilitate provenance tracking for a wide range of data-intensive systems. It will support provenance for data exploration, data science and other modern data analytics frameworks, and it will address the computational overhead incurred by provenance tracking. ProDIS also aims to develop user-friendly provenance-based analysis tools and to validate its approach experimentally through prototype tools and benchmarks.

Objective

In the context of data-intensive systems, data provenance captures the way in which data is used, combined
and manipulated by the system. Provenance information can for instance be used to reveal whether
data was illegitimately used, to reason about hypothetical data modifications, to assess the trustworthiness
of a computation result, or to explain the rationale underlying the computation.
As data-intensive systems constantly grow in use, in complexity and in the size of data they manipulate,
provenance tracking becomes of paramount importance. In its absence, it is next to impossible to follow the
flow of data through the system. This in turn severely harms the quality of results, the enforcement of
policies, and public trust in the systems.
Despite important advances in research on data provenance, and its potentially revolutionary impact,
practical data-intensive systems unfortunately rarely support provenance tracking. The
goal of the proposed research is to develop models, algorithms and tools that facilitate provenance
tracking for a wide range of data-intensive systems and that can be applied to large-scale data analytics,
making it possible to explain and reason about the computation that took place.
Towards this goal, we will address the following main objectives: (1) supporting provenance for modern
data analytics frameworks such as data exploration and data science, (2) overcoming the computational
overhead incurred by provenance tracking, (3) developing user-friendly, provenance-based analysis
tools, and (4) validating the results experimentally through the development of prototype tools and benchmarks.
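
To make concrete what it means for provenance to capture how data is "used, combined and manipulated", the following is a minimal Python sketch of lineage tracking through a selection and a join. It is an illustration only, not the ProDIS approach: the names `Annotated`, `source`, `select` and `join`, the sample tables, and the `table:index` row identifiers are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotated:
    """A row of values plus the ids of the source rows it was derived from."""
    values: tuple
    lineage: frozenset


def source(name, rows):
    """Tag each input row with a hypothetical id such as 'orders:0'."""
    return [
        Annotated(tuple(row), frozenset({f"{name}:{i}"}))
        for i, row in enumerate(rows)
    ]


def select(rel, pred):
    """Selection keeps or drops rows; surviving rows keep their lineage."""
    return [t for t in rel if pred(t.values)]


def join(left, right, lkey, rkey):
    """A join combines rows, so output lineage is the union of both inputs."""
    return [
        Annotated(l.values + r.values, l.lineage | r.lineage)
        for l in left
        for r in right
        if l.values[lkey] == r.values[rkey]
    ]


# Illustrative data, invented for this sketch.
customers = source("customers", [("c1", "Ada"), ("c2", "Bob")])
orders = source("orders", [("c1", 30), ("c2", 5), ("c1", 12)])

big_orders = select(orders, lambda v: v[1] >= 10)
result = join(customers, big_orders, 0, 0)

for row in result:
    print(row.values, sorted(row.lineage))
```

With lineage attached to every output row, a question such as "was the row `orders:1` used in this result?" reduces to a set-membership check, which is the kind of reasoning about data use described above. Even this toy version carries an extra set per row, which hints at the overhead named in objective (2).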

Funding Scheme

ERC-STG - Starting Grant

Call for proposal

ERC-2018-STG

Host institution

TEL AVIV UNIVERSITY

Net EU contribution

€ 1 306 250,00

Total cost

€ 1 306 250,00

Beneficiaries (1)