Inductive Queries for Mining Patterns and Models

Objective

Data mining currently lacks a generally accepted framework, and the quest for one is a major research priority. The most promising approach is taken by inductive databases (IDBs), which contain not only data but also patterns. Patterns can be either local patterns, such as frequent itemsets, which are descriptive in nature, or global models, such as decision trees, which are predictive in nature. In an IDB, inductive queries can be used to generate (mine), manipulate, and apply patterns.

The IDB framework is appealing as a theory for data mining because it employs declarative queries instead of ad hoc procedural constructs. Declarative queries are often formulated using constraints, so inductive querying is closely related to constraint-based data mining. The framework is also appealing for data mining applications, as it supports the process of knowledge discovery in databases (KDD): the results of one inductive query can be used as input for another, so nontrivial multistep KDD scenarios can be supported rather than just single data mining operations.

The state of the art in IDBs comprises various effective approaches to constraint-based mining (inductive querying) of local patterns, such as frequent itemsets and sequences, most of which work in isolation. The proposed project aims to significantly advance the state of the art by developing the theory of, and practical approaches to, inductive querying (constraint-based mining) of global models, as well as approaches to answering complex inductive queries that involve both local patterns and global models. Based on these, showcase applications (IDBs) in the area of bioinformatics will be developed, in which users will be able to query data about drug activity, gene expression, gene function, and protein sequences, as well as frequent patterns (e.g., subsequences in proteins) and predictive models (e.g., for drug activity or gene function).
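To illustrate the idea that the result of one inductive query can feed another, the following is a minimal sketch in Python. It is not the project's system or query language: the toy transactions, the function names (`mine_frequent_itemsets`, `select_patterns`) and the brute-force enumeration are all assumptions made for illustration. The first query mines frequent itemsets under a minimum-support constraint; the second query applies a further constraint to the stored patterns rather than to the raw data.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is a set of items.
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine_frequent_itemsets(transactions, min_support, max_size=3):
    """Query 1: return {itemset: support} for itemsets meeting min_support.

    Brute-force enumeration for clarity; a real miner would prune the
    search space (for example, Apriori-style) instead of testing everything.
    """
    items = sorted(set().union(*transactions))
    patterns = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            count = support(itemset, transactions)
            if count >= min_support:
                patterns[itemset] = count
    return patterns


def select_patterns(patterns, constraint):
    """Query 2: filter previously mined patterns with a constraint."""
    return {p: s for p, s in patterns.items() if constraint(p, s)}


# The output of the first query is the input of the second.
frequent = mine_frequent_itemsets(TRANSACTIONS, min_support=3)
with_a = select_patterns(frequent, lambda p, s: "a" in p and len(p) >= 2)
```

The constraints here (minimum support, item membership, minimum size) are passed as plain values and predicates, which mirrors the declarative flavour of inductive queries only loosely; an actual IDB would express them in a query language and optimise their evaluation.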

Funding Scheme

STREP - Specific Targeted Research Project

Coordinator

JOZEF STEFAN INSTITUTE
Address
Jamova 39
1001 Ljubljana
Slovenia

Participants (5)

ALBERT-LUDWIGS-UNIVERSITAET FREIBURG
Germany
Address
Georges-Köhler-Allee, Building 079
79110 Freiburg
HELSINGIN YLIOPISTO
Finland
Address
Gustaf Hällströmin Katu 2B
00014 University of Helsinki
INSTITUT NATIONAL DES SCIENCES APPLIQUEES DE LYON
France
UNIVERSITEIT ANTWERPEN
Belgium
Address
Prinsstraat 13
2000 Antwerpen
UNIVERSITY OF WALES, ABERYSTWYTH
United Kingdom
Address
Old College, King Street
SY23 3DB Aberystwyth