Enabling End-User Data warehouse Mining

Objective

The project aims to develop new techniques that give decision-makers direct access to information stored in databases, data warehouses, and knowledge bases. The main goal is the integration of data and knowledge management. Discovery techniques produce knowledge from very large sets of distributed data. They exploit domain knowledge in order to deliver more concise and relevant insights. The main obstacle to achieving this goal is the problem of finding the proper representation for a discovery task.

The project will develop new techniques that support user-guided representation adjustment as well as techniques that automatically select or change representations. Successful uses of particular representations for specific discovery tasks are stored as cases, which provide users with an adaptive interface to information. An advanced data mining system, a case-base of typical discovery tasks, and new operators for pre-processing will be the project's results.

Objectives:
An environment for knowledge discovery from databases (KDDE) will be developed that provides decision-makers with advanced knowledge extraction from large distributed data sets. New techniques for selecting and constructing features on the basis of given data will be developed. For instance, ways of handling time (time series, relations of time intervals, validity of discovered rules), discovering hidden variables, and detecting interdependencies among features will be investigated. These techniques ease knowledge discovery, where most time is currently spent in pre-processing. Data mining will exploit domain knowledge, which will enhance the quality of its results. A case-base of discovery tasks together with the required pre-processing techniques will offer an adaptive interface to the KDDE. This will speed up similar applications of knowledge discovery and make the KDDE self-improving.
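
As an illustration of what constructing features from a time series can look like, the following minimal Python sketch summarises each sliding window of a series by a few derived values. It is not the project's method: the function name `window_features`, the chosen statistics, and the sample data `calls_per_day` are all assumptions made for this example.

```python
from statistics import mean


def window_features(series, width):
    """Build one row of constructed features per sliding window of `series`.

    Hypothetical helper for illustration only; the summary statistics
    chosen here are assumptions, not operators defined by the project.
    """
    rows = []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        rows.append({
            "start": start,                    # index of the first value in the window
            "mean": mean(window),              # average level within the window
            "min": min(window),
            "max": max(window),
            "trend": window[-1] - window[0],   # last value minus first value
        })
    return rows


# Made-up sample data: six daily measurements.
calls_per_day = [12, 15, 11, 20, 22, 19]
features = window_features(calls_per_day, width=3)
```

Each resulting row can then be treated as an ordinary record by a learning algorithm that has no notion of time itself.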

Work description:
The scientific research for enabling end-users to gain knowledge from databases and data warehouses is organised in two themes: a meta-data model and multi-strategy learning. The meta-data offer constraints for pre-processing and for pairing business tasks with algorithms (WP1, WP8, WP10, WP18). A deep analysis of feature selection, sampling, transformation and mining operators is carried out. Multi-strategy learning systematically explores the combinations and (automatic) parameter settings of diverse learning operators for pre-processing, particularly for feature selection and construction (WP4, WP13, WP14). Handling of multi-relational data (WP15), time phenomena (WP3) and the inclusion of domain knowledge (WP5) enhance discovery. The technological achievement is centred around an advanced KDD supporting environment (WP1, WP2, WP7, WP12, WP16). Scientific and technological efforts yield a case base of best-practice discovery (WP10) that can be used by users of the environment and is published on the Internet for an international "representation race".
Applications guarantee that research and technology focus on the most challenging and most in-demand issues. The data warehouse provided by one partner and a set of data mining applications from the data mining consultancy work of two of the partners evaluate the transferability of results.
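
The idea of systematically exploring operator combinations and parameter settings, and of keeping the best result as a reusable case, can be sketched in a few lines of Python. This is a toy under stated assumptions, not the project's algorithm: it uses plain exhaustive search, and the names `explore`, `toy_score`, the feature names, the `bins` parameter, and the task label "churn prediction" are all hypothetical.

```python
from itertools import combinations, product


def explore(features, param_grid, score):
    """Try every non-empty feature subset with every parameter setting.

    Returns the best-scoring combination as a dict. Exhaustive search is
    used only because it is easy to read; it does not scale to many features.
    """
    best = None
    names = sorted(param_grid)
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            for values in product(*(param_grid[n] for n in names)):
                params = dict(zip(names, values))
                s = score(subset, params)
                if best is None or s > best["score"]:
                    best = {"features": subset, "params": params, "score": s}
    return best


def toy_score(subset, params):
    """Stand-in for a real evaluation such as cross-validated accuracy."""
    useful = {"tenure", "usage"}  # assumed informative features (made up)
    return len(useful & set(subset)) - 0.1 * len(subset) - 0.01 * params["bins"]


# A case base maps a discovery task to the best representation found so far.
case_base = {}
case_base["churn prediction"] = explore(
    ["age", "tenure", "usage"], {"bins": [5, 10]}, toy_score
)
```

A later task that resembles a stored one could start from the stored feature set and parameters instead of repeating the search, which is the sense in which a case base can shorten similar applications.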

Milestones:
- Milestone 1 delivers a first prototype of a KDD support environment, with application areas set up and their requirements specified.
- Milestone 2 delivers multi-relational data handling and the learning of settings for learning parameters. In addition, an environment for pre-processing will implement meta-data for user-driven data transformations and known learning operators.
- Milestone 3 delivers new methods for automatic pre-processing and a case-base that is used by the KDDSE.

Funding Scheme

CSC - Cost-sharing contracts

Coordinator

UNIVERSITAET DORTMUND
Address
August-Schmidt-Strasse 4
44227 Dortmund
Germany

Participants (7)

FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Germany
Address
Hansastrasse 27C
80686 Muenchen

NATIONAL INSTITUTE OF TELECOMMUNICATIONS
Poland
Address
Ul. Szachowa 1
04-894 Warszawa

PEROT SYSTEMS NEDERLAND B.V.
Netherlands
Address
Hoefseweg 1
3800 GG Amersfoort

SCHWEIZERISCHE LEBENSVERSICHERUNGS- UND RENTENANSTALT
Switzerland
Address
General Guisan-quai 40
8002 Zuerich

TELECOM ITALIA LAB S.P.A.
Italy
Address
Via G. Reiss Romoli 274
10148 Torino

UNIVERSITA DEGLI STUDI DEL PIEMONTE ORIENTALE AMEDEO AVOGADRO
Italy
Address
Via Duomo 6
13100 Vercelli

VYSOKA SKOLA EKONOMICKA V PRAZE
Czechia
Address
Nam Winston Churchilla 4
130 67 Praha