CORDIS - EU research results
Content archived on 2024-05-27

CONFIDENTIALITY OF DATA AGAINST DATA MINING METHODS

Objective

Securing data against intruders who attack implicit sensitive information is an open research problem. To make a publicly available system secure, we must ensure not only that private sensitive data have been removed, but also that certain inference channels have been blocked. Moreover, the need to make a system as open as possible, to the degree that data sensitivity is not jeopardised, calls for techniques that control the disclosure of sensitive data. We aim to investigate various data mining methods as a threat to data security. We plan to evaluate the initial work on data mining against data security and then investigate possible techniques to ensure data confidentiality against a wide spectrum of data mining methodologies and novel information types.

OBJECTIVES
(a) Investigation of new techniques for secure data mining that cover the main aspects of data mining (association rules, classification, clustering);
(b) Investigation of new techniques for secure distributed data mining, which receive special attention because data is often spatially distributed;
(c) Implementation and thorough testing of the existing and newly investigated techniques for secure data mining, against real data sets for their effectiveness and against synthetic data sets for their performance;
(d) Specification of an evaluation framework for comparing all the techniques on a common platform, which will be the basis for determining the appropriate technique for a given type of application;
(e) Building know-how about the threats to data security that data mining tools can pose and how they can be overcome.

DESCRIPTION OF WORK
The disclosure methodology can vary with the data mining technique in use. The information disclosure control techniques that we are going to investigate can be summarised as follows:
(a) A data hiding technique that is suitable for association rule hiding and for preventing the prediction of confidential data via decision trees;
(b) Inserting unknown values, which can be used when perturbing the data or inserting wrong values may cause serious problems, as in the case of medical data;
(c) Data perturbation techniques, which modify the data to preserve confidentiality while still allowing an approximately correct data mining model to be extracted;
(d) Data swapping techniques, which shuffle the data values within the same column and can be used in cases where data removal could reveal hints about confidential data mining results;
(e) Data alteration, which changes values randomly.
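
Techniques (b) to (e) above can be illustrated on a single data column. The following is a minimal sketch, not the project's implementation: the function names, the UNKNOWN marker, the Gaussian noise model and the sample values are all assumptions made for illustration.

```python
import random

UNKNOWN = None  # hypothetical marker for a value replaced by "unknown"

def insert_unknowns(column, fraction, rng):
    """(b) Replace a random fraction of the values with UNKNOWN."""
    k = round(len(column) * fraction)
    hidden = set(rng.sample(range(len(column)), k))
    return [UNKNOWN if i in hidden else v for i, v in enumerate(column)]

def perturb(column, scale, rng):
    """(c) Add zero-mean Gaussian noise to every numeric value (assumed noise model)."""
    return [v + rng.gauss(0.0, scale) for v in column]

def swap_column(column, rng):
    """(d) Shuffle values within one column; the set of values is unchanged."""
    shuffled = list(column)
    rng.shuffle(shuffled)
    return shuffled

def alter(column, domain, fraction, rng):
    """(e) Overwrite a random fraction of the values with random draws from the domain."""
    k = round(len(column) * fraction)
    chosen = set(rng.sample(range(len(column)), k))
    return [rng.choice(domain) if i in chosen else v for i, v in enumerate(column)]

rng = random.Random(0)            # fixed seed so the sketch is repeatable
ages = [34, 51, 27, 45, 62, 38]   # made-up sample column

masked = insert_unknowns(ages, 0.5, rng)
noisy = perturb(ages, 2.0, rng)
swapped = swap_column(ages, rng)
altered = alter(ages, list(range(20, 70)), 0.5, rng)
```

Swapping keeps the column's value distribution exactly, while perturbation and alteration change individual values; inserting unknowns never introduces a wrong value, which is why it suits cases such as medical data.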
The distributed nature of most of the data encountered in practice motivated the research on distributed data mining.
Therefore, considering a distributed data scenario, we propose methodologies for securing this data against data mining techniques. Data warehousing technology will also be investigated in this context. The sanitisation process should maximise the degree of security for sensitive information while keeping data quality as high as possible. The trade-off between data quality and degree of sanitisation will be investigated in this project. Application areas to be considered include the regular disclosure of data, secure outsourcing of sensitive data, secure data trade among companies, combined data analysis prior to company mergers, and secure disclosure of protein sequences and DNA data.
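
The tension between security and data quality can be made concrete with a small association rule hiding example. The sketch below is an assumption-laden illustration rather than the project's algorithm: it hides a sensitive itemset by greedily deleting one chosen item (the hypothetical `victim` parameter) from supporting transactions until the itemset's support falls below a threshold, and it counts the deletions as a crude measure of lost data quality. The transaction data is made up.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def hide_itemset(transactions, sensitive, victim, min_support):
    """Delete `victim` from supporting transactions until the support of
    `sensitive` drops below `min_support`.

    Returns the sanitised copy and the number of deleted item occurrences.
    The greedy, first-come order is an illustrative assumption.
    """
    sensitive = set(sensitive)
    sanitised = [set(t) for t in transactions]  # leave the input untouched
    removed = 0
    for t in sanitised:
        if support(sanitised, sensitive) < min_support:
            break  # itemset is no longer frequent, stop sanitising
        if sensitive <= t:
            t.discard(victim)
            removed += 1
    return sanitised, removed

db = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
    {"bread"},
]

sanitised, removed = hide_itemset(db, {"bread", "milk"}, "milk", 0.5)
before = support(db, {"bread", "milk"})
after = support(sanitised, {"bread", "milk"})
side_effect = support(db, {"milk"}) - support(sanitised, {"milk"})
```

Here `removed` and `side_effect` stand in for the data quality cost: every deletion that lowers the sensitive itemset's support also lowers the support of the non-sensitive item `milk`, which is the kind of trade-off an optimisation would have to balance.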
Work on securing data against intruders who attack the implicit sensitive information in it has only just started and has yet to cover the broad spectrum of data mining techniques. To make a publicly available system secure, we must ensure not only that private sensitive data have been removed, but also that certain inference channels have been blocked. In other words, it is not only the data but also the hidden knowledge in the data that should be made secure.

Moreover, the need to make our system as open as possible, to the degree that data sensitivity is not jeopardised, calls for various techniques that control the disclosure of sensitive data. We have considered some aspects of data, such as dimensionality and distribution, as well as some data mining methods as a threat to data security. The plan set out in this project was to evaluate the initial work on data mining against data security and then to investigate possible techniques to ensure data confidentiality against a wide spectrum of disclosure methodologies and novel information types. We have reached a stage where a selected set of privacy-preserving data mining algorithms has been developed in the prototype system, but further resources are needed to exploit this new research area fully. We feel that continuing this project from an evaluation phase to a regular phase will give us the best possible resources to investigate this research area further and to place ourselves among the pioneers at an international level. In this way we hope to be in a position to contribute to the field with the highest potential and to form a network of excellence in the field of privacy and security of data and information.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on natural language processing techniques. See: The European Science Vocabulary.


Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which proposals can be submitted. The description of a topic comprises its specific scope and the expected impact of the funded project.

Call for proposals

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "type of action") within a programme with common features. It specifies the scope of what is funded, the reimbursement rate, specific evaluation criteria for funding, and the use of simplified forms of costs such as lump sums.

ACM - Preparatory, accompanying and support measures

Coordinator

RESEARCH ACADEMIC COMPUTER TECHNOLOGY INSTITUTE
EU contribution
No data
Address
61, RIGA FERAIOU STREET
26221 PATRAS
Greece


Total cost

The total costs incurred by this organisation through its participation in the project, including direct and indirect costs. This amount is part of the overall project budget.

No data

Participants (2)
