
Development of Representation for Machine Learning from Imperfect Information

Objective

The ECOLES Action studied the question of how to acquire knowledge and then maintain it within computer systems. Particularly in control tasks, the intuitive nature of human know-how presents a seemingly insuperable barrier to knowledge acquisition. By constructing systems able to learn from various data sources, machine learning aims to overcome the knowledge acquisition problems encountered in current software technology. In particular, the automatic construction of new representations is a main objective of this Action, as state-of-the-art methods in machine learning are critically limited by the representation of given problems.
Fundamental problems in machine learning are being studied with the aim of developing logic-based machine learning techniques that will enable the repair of knowledge bases that are incomplete, incorrect or ineffective.

Inductive logic programming has been fostered. The field has a strong theoretical foundation inherited from logic programming together with an experimental orientation from machine learning. The GOLEM inductive system has been tested on hard problems, including multi-agent environments, qualitative modelling problems, a protein folding application and noisy data.

Research has been carried out in the following areas:
architecture for multi-agent learning (a new scheme for knowledge integration and a new architecture based on Candidate Theory);
incremental learning (rule revision by an incremental learning scheme);
knowledge modules (a method based on correctness and coverage allowing better knowledge integration);
learning problem solver heuristics (the aim being to generate heuristics for a problem solver defined by a terminal condition, a state space and a set of operators);
conceptual clustering techniques (conceptual graphs improve learning in CHARADE; background knowledge and saturation methods are used);
metaclauses (compilation of control expressed as metaclauses in YAM, a meta-interpreter system on SOARLOG);
learning qualitative models of dynamic systems;
relational descriptions (inductive learning of relational descriptions from noisy examples);
learning rules for early diagnosis of rheumatic diseases (results were obtained using the inductive learning system LINUS, incorporating ASSISTANT, with background knowledge provided by medical specialists).
APPROACH AND METHODS
The approach taken in this Action has been to develop logic-based machine learning techniques that will enable the repair of knowledge-bases and databases that are either:
-ineffective, such that certain questions can in theory be answered correctly, but the derivation of these answers is intractable within the given computational resources
-incomplete, in that there exist questions whose answers are not derivable given the present state of knowledge
-incorrect, in that certain questions are answered incorrectly given the present state of knowledge.
Solving these problems will lead to the identification and evaluation of principles, methods and topologies that can be used in applications.
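The three failure modes listed above can be made concrete with a small sketch. It is an illustration only, not the Action's method: it assumes propositional rules (the Action works with logic programs), uses a round limit as a stand-in for "the given computational resources", and treats anything that cannot be derived as false. The names `forward_chain`, `diagnose` and `Status` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    INEFFECTIVE = "ineffective"
    INCOMPLETE = "incomplete"
    INCORRECT = "incorrect"


def forward_chain(facts, rules, max_rounds=None):
    """Propositional forward chaining.

    `rules` is a list of (body, head) pairs, where body is a frozenset of
    atoms. `max_rounds` is an assumed proxy for a resource budget.
    """
    known = set(facts)
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        new = {head for body, head in rules
               if body <= known and head not in known}
        if not new:
            break
        known |= new
        rounds += 1
    return known


def diagnose(facts, rules, query, expected, budget):
    """Classify one query against its expected truth value."""
    bounded = forward_chain(facts, rules, max_rounds=budget)
    full = forward_chain(facts, rules)
    if expected:
        if query in bounded:
            return Status.OK
        # Derivable in principle, but not within the budget.
        if query in full:
            return Status.INEFFECTIVE
        # Not derivable at all from the present knowledge.
        return Status.INCOMPLETE
    # Expected false: deriving it means the knowledge base is wrong.
    return Status.INCORRECT if query in full else Status.OK


facts = {"a"}
rules = [
    (frozenset({"a"}), "b"),
    (frozenset({"b"}), "c"),
    (frozenset({"c"}), "d"),
    (frozenset({"a"}), "x"),  # a faulty rule: "x" should not hold
]
for query, expected in [("c", True), ("d", True), ("e", True), ("x", False)]:
    print(query, diagnose(facts, rules, query, expected, budget=2).value)
```

A repair technique would then add missing rules for the incomplete case, remove or specialise rules for the incorrect case, and restructure the derivation for the ineffective case.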
PROGRESS AND RESULTS
-Inductive logic programming has been fostered. The field has a strong theoretical foundation inherited from logic programming together with an experimental orientation from machine learning. The GOLEM inductive system has been tested on hard problems, including multi-agent environments, qualitative modelling problems, a protein folding application, and noisy data.
-Architecture for multi-agent learning. A new scheme for knowledge integration and a new architecture based on "Candidate Theory" are being investigated.
-Incremental learning. Rule revision by an incremental learning scheme has been developed and tested. Rules that are returned to an agent (when new data reveal that they are sufficiently relevant) are accompanied by a set of relevant cases that help the agent in the process of revision.
-Knowledge modules. A method based on correctness and coverage is being investigated to allow better knowledge integration.
-Learning problem-solver heuristics. The aim is to generate heuristics for a problem-solver defined by a terminal condition, a state space and a set of operators. The results obtained are encouraging and tests are being carried out on the one-player games slide-and-jump and the Tower of Hanoi.
-Conceptual clustering techniques. Conceptual graphs are used to improve learning in CHARADE. Background knowledge and saturation methods are used. The construction of the ordering structure has been shown to be fully incremental, and the cost of introducing a new example is linear in the number of descriptors it contains.
-Meta-clauses. Compilation of control expressed as meta-clauses in the YAM (a meta-interpreter) system on SOARLOG is complete. Efficiency of the interpreter has been achieved by introducing control predicates such as freeze and cut. YAM uses transformation techniques that avoid the need for preference clauses.
-Learning qualitative models of dynamic systems. The QSIM formalism is used as a representation for learned qualitative models. The problem of learning QSIM-type models is formulated in logic, and the GOLEM learning program used for induction.
-Relational descriptions. Inductive learning of relational descriptions from noisy examples has been investigated. LINUS and FOIL have been extended to learn restricted Horn clauses with negation. Their performance has been measured on a problem involving illegal positions in a chess endgame. Both systems performed well.
-Learning rules for early diagnosis of rheumatic diseases. Results obtained using the inductive learning system LINUS, incorporating ASSISTANT, with background knowledge provided by medical specialists, have shown that the approach is very effective in handling noise.
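The problem-solver formulation mentioned under learning problem-solver heuristics (a terminal condition, a state space and a set of operators) can be sketched for the Tower of Hanoi as follows. This shows only the problem definition and an uninformed breadth-first search; it does not reproduce the Action's heuristic-learning method, which is not described here. The names `Problem`, `hanoi` and `solve`, and the state encoding, are assumptions made for the illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Tuple


@dataclass(frozen=True)
class Problem:
    initial: Hashable                                   # start of the state space
    is_terminal: Callable[[Hashable], bool]             # terminal condition
    operators: Callable[[Hashable], Iterable[Tuple[str, Hashable]]]


def hanoi(n_disks):
    """Tower of Hanoi; a state is three pegs, each listed bottom to top."""
    initial = (tuple(range(n_disks, 0, -1)), (), ())
    goal = ((), (), tuple(range(n_disks, 0, -1)))

    def operators(state):
        for src in range(3):
            if not state[src]:
                continue
            disk = state[src][-1]
            for dst in range(3):
                # A disk may only go onto an empty peg or a larger disk.
                if dst != src and (not state[dst] or state[dst][-1] > disk):
                    pegs = list(state)
                    pegs[src] = state[src][:-1]
                    pegs[dst] = state[dst] + (disk,)
                    yield f"{src}->{dst}", tuple(pegs)

    return Problem(initial, lambda s: s == goal, operators)


def solve(problem):
    """Breadth-first search; returns a shortest list of moves, or None."""
    frontier = deque([(problem.initial, [])])
    seen = {problem.initial}
    while frontier:
        state, path = frontier.popleft()
        if problem.is_terminal(state):
            return path
        for name, nxt in problem.operators(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None


moves = solve(hanoi(3))
print(len(moves), moves)
```

An uninformed search of this kind is the sort of baseline that generated heuristics would be expected to improve on, by ordering or pruning the operators tried in each state.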
POTENTIAL
The algorithms, programmes and methods developed will accelerate the rate at which the latest advances in machine learning are incorporated into systems with direct industrial applications. Examples are knowledge acquisition in expert systems, dynamic process control applications, and health care applications.

Coordinator

University of Bradford
Address
Richmond Road
BD7 1DP Bradford
United Kingdom

Participants (3)

Turing Institute Ltd
United Kingdom
Address
George House 36 North Hanover Street
G1 2AD Glasgow
UNIVERSIDAD DO PORTO
Portugal
Address
Rua Dr. Roberto Frias
4200 Porto
Université de Paris XI (Université Paris-Sud)
France
Address
Avenue Georges Clémenceau
91405 Orsay