CONFIDENTIALITY OF DATA AGAINST DATA MINING METHODS

Objective

Securing data against intruders who attack implicit sensitive information is an open research problem. To make a publicly available system secure, we must ensure not only that private sensitive data have been trimmed out, but also that certain inference channels have been blocked. Moreover, the need to make a system as open as possible, to the degree that data sensitivity is not jeopardised, calls for techniques that provide disclosure control for sensitive data. We aim to investigate various data mining methods as a threat to data security. We plan to evaluate the initial work on data mining against data security and then investigate possible techniques to ensure data confidentiality against a wide spectrum of data mining methodologies and novel information types.

OBJECTIVES
(a) Investigation of new techniques for secure data mining that cover the main aspects of data mining (association rules, classification, clustering);
(b) Investigation of new techniques for secure distributed data mining, which receives special attention because data is often spatially distributed;
(c) Implementation and thorough testing of the existing and newly investigated techniques for secure data mining, against real data sets for their effectiveness and against synthetic data sets for their performance;
(d) Specification of an evaluation framework for comparing all the techniques on a common platform, which will be the basis for determining the appropriate technique for a given type of application;
(e) Building know-how about the threats to data security that data mining tools can pose and how they can be overcome.

DESCRIPTION OF WORK
Disclosure methodologies vary with the data mining technique in use. The information disclosure control techniques that we are going to investigate can be summarised as follows:
(a) Data hiding, which is suitable for association rule hiding and for preventing the prediction of confidential data via decision trees;
(b) Insertion of unknown values, which can be used when perturbing the data or inserting wrong values may cause serious problems, as in the case of medical data;
(c) Data perturbation, which modifies the data to preserve confidentiality while still allowing an approximately correct data mining model to be extracted;
(d) Data swapping, which shuffles the data values within the same column and can be used in cases where data removal can reveal hints about confidential data mining results;
(e) Data alteration, which changes values randomly.
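As a rough illustration of techniques (b), (c) and (d) in the list above, the following Python sketch applies each of them to a single numeric column. It is only a minimal sketch under stated assumptions, not the project's implementation: the function names (insert_unknowns, perturb, swap), the use of None as the unknown marker, the Gaussian noise model and the sample ages are all hypothetical choices made for this example.

```python
import random

def insert_unknowns(column, sensitive_indices, unknown=None):
    """Replace the selected cells with an explicit 'unknown' marker
    instead of a wrong value (assumed marker: None)."""
    return [unknown if i in sensitive_indices else v
            for i, v in enumerate(column)]

def perturb(column, scale, rng):
    """Add zero-mean Gaussian noise to every value (assumed noise model).
    Individual values change, while aggregate statistics stay roughly the same."""
    return [v + rng.gauss(0.0, scale) for v in column]

def swap(column, rng):
    """Shuffle values within one column. The set of values is unchanged,
    but the link between a value and its original row is broken."""
    shuffled = list(column)
    rng.shuffle(shuffled)
    return shuffled

rng = random.Random(0)            # fixed seed so the sketch is repeatable
ages = [34, 51, 29, 62, 45, 38]   # hypothetical column

masked = insert_unknowns(ages, {1, 3})
noisy = perturb(ages, 2.0, rng)
swapped = swap(ages, rng)

print(masked)
print(sorted(swapped) == sorted(ages))
```

Inserting unknowns never introduces a false value, which is why the list above singles it out for medical data; perturbation and swapping keep the column usable for aggregate analysis at the cost of per-row accuracy.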
The distributed nature of most of the data encountered in practice has motivated research on distributed data mining.
Therefore, considering a distributed data scenario, we propose methodologies for securing this data against data mining techniques. Data warehousing technology will be investigated as well in this context. The sanitisation process should maximise the degree of security of sensitive information while keeping data quality as high as possible. The joint optimisation of data quality and degree of sanitisation will be investigated in this project. The application areas to be considered include the regular disclosure of data, secure outsourcing of sensitive data, secure data trade among companies, combined data analysis prior to company mergers, and secure disclosure of protein sequences and DNA data.
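The trade-off described above, between hiding sensitive information and keeping data quality high, can be sketched for the association rule case. The Python fragment below is a minimal sketch and not the project's algorithm: it lowers the support of one sensitive itemset below a mining threshold by deleting a chosen item from supporting transactions, and counts the deleted items as a crude quality cost. The names support and hide_itemset, the toy transactions and the threshold of 0.3 are assumptions made for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def hide_itemset(transactions, sensitive, victim, min_support):
    """Delete `victim` from supporting transactions until the support of
    `sensitive` drops below `min_support`.
    Returns a sanitised copy and the number of deleted items."""
    sensitive = set(sensitive)
    sanitised = [set(t) for t in transactions]   # leave the input untouched
    removed = 0
    for t in sanitised:
        if support(sanitised, sensitive) < min_support:
            break                                # sensitive itemset is hidden
        if sensitive <= t:
            t.discard(victim)
            removed += 1
    return sanitised, removed

# Hypothetical transaction database
db = [
    {"bread", "milk"},
    {"bread", "milk", "beer"},
    {"bread", "milk"},
    {"beer"},
    {"bread"},
    {"milk", "beer"},
]

sanitised, removed = hide_itemset(db, {"bread", "milk"}, "milk", min_support=0.3)

print(support(db, {"bread", "milk"}), support(sanitised, {"bread", "milk"}))
print(removed, "items removed;",
      "support of milk alone:", support(db, {"milk"}), "->", support(sanitised, {"milk"}))
```

In this toy run, hiding the sensitive itemset also lowers the support of the non-sensitive item "milk" on its own, which is the kind of side effect an optimisation of data quality against degree of sanitisation would try to minimise.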
Work on securing data against intruders who attack the implicit sensitive information in the data has just started and is yet to cover the broad spectrum of data mining techniques. To make a publicly available system secure, we must ensure not only that private sensitive data have been trimmed out, but also that certain inference channels have been blocked. In other words, it is not only the data but also the knowledge hidden in this data that should be made secure.
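To make the notion of an inference channel concrete, the following Python sketch shows how trimming a sensitive column may not be enough when a correlated attribute is still released. It is an illustrative assumption only: the records, the attribute names (prescription, diagnosis) and the helper learn_rule are hypothetical, and the one-attribute majority rule stands in for a very small decision tree.

```python
from collections import Counter, defaultdict

def learn_rule(records, attribute, target):
    """Map each value of `attribute` to the target value most often seen
    with it (effectively a one-level decision tree)."""
    counts = defaultdict(Counter)
    for r in records:
        counts[r[attribute]][r[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

# Records the attacker is assumed to know in full (hypothetical)
known = [
    {"prescription": "insulin", "diagnosis": "diabetes"},
    {"prescription": "insulin", "diagnosis": "diabetes"},
    {"prescription": "statin", "diagnosis": "high cholesterol"},
    {"prescription": "statin", "diagnosis": "high cholesterol"},
    {"prescription": "insulin", "diagnosis": "diabetes"},
]

# Released records: the sensitive "diagnosis" column has been trimmed out
released = [{"prescription": "insulin"}, {"prescription": "statin"}]

rule = learn_rule(known, "prescription", "diagnosis")
inferred = [rule.get(r["prescription"]) for r in released]

print(inferred)
```

The released table contains no diagnosis at all, yet the learned rule recovers one for every row; blocking such a channel requires sanitising the correlated attribute as well, not only the sensitive one.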

Moreover, the need to make our system as open as possible, to the degree that data sensitivity is not jeopardised, calls for various techniques that provide disclosure control for sensitive data. We have considered some aspects of data, such as dimensionality and distribution, as well as some data mining methods, as a threat to data security. The plan set out in this project was to evaluate the initial work on data mining against data security and then to investigate possible techniques to ensure data confidentiality against a wide spectrum of disclosure methodologies and novel information types. We have reached a stage where a selected set of privacy-preserving data mining algorithms has been developed in the prototype system, but further resources are needed to exploit this new research area fully. We feel that continuing this project from an evaluation phase to a regular phase will give us the best possible resources to investigate this research area further and to place ourselves among the pioneers at an international level. In this way we hope to be in a position to contribute to the field with the highest potential and to form a network of excellence in the field of privacy and security of data and information.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on natural language processing techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

FP5-IST - Programme for research, technological development and demonstration on a "User-friendly information society, 1998-2002"

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which proposals can be submitted. The description of a topic comprises its specific scope and the expected impact of the funded project.

1.1.2.-6.1.1 - FET O: Open domain

Call for proposal

Procedure for inviting the submission of project proposals with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "type of action") within a programme with common features. It specifies the scope of what is funded, the reimbursement rate, specific evaluation criteria for funding and the use of simplified forms of costs such as lump sums.

ACM - Preparatory, accompanying and support measures

Coordinator

RESEARCH ACADEMIC COMPUTER TECHNOLOGY INSTITUTE

EU contribution

No data

Address

61, RIGA FERAIOU STREET
26221 PATRAS
Greece

Total cost

No data

Participants (2)

SABANCI UNIVERSITY

Turkey

EU contribution

No data

Address

SABANCI UNIVERSITY
81474 ORHANLI, TUZLA, ISTANBUL

Total cost

No data

UNIVERSITA DEGLI STUDI DI MILANO

Italy

EU contribution

No data

Address

VIA FESTA DEL PERDONO 7
20122 MILANO

Total cost

No data
