Vision as Process

Objective

The VAP Action aimed to demonstrate that the paradigm of "vision as process" is basic to the functioning of a high-level vision system. Such a hypothesis can only be demonstrated within the context of a complete vision system, in which the potential benefits of continuous control of perception and of associated temporal context are evident. The goal of this Action was to adapt and refine existing vision techniques and to integrate them as a first step towards a general-purpose vision system.
The Action studied how to integrate basic techniques into a continuously operating vision system capable of interpreting a dynamically changing environment, concentrating on the control of perception via goal-directed focus of attention. The approach used spatial and temporal context, multiple resolutions and controlled motion of the sensor system.

APPROACH AND METHODS
Techniques to interpret a dynamically changing, quasi-structured environment were developed. These techniques used goal-directed focus-of-attention methods involving controlled sensor motion. Processing was directed by goals which change in response to the demands of the perceptual task, as well as in reaction to events in the scene. This approach is directed towards limiting the computational complexity of the perception process by restricting the size of the internal models employed. These models must be continuously updated to describe the environment in terms of a number of qualitatively different phenomena, such as image phenomena themselves, three-dimensional scene geometry, and symbolic interpretation of objects and events.
The necessary techniques were to be developed in the context of an integrated vision system, which would serve as a means of testing the fundamental hypothesis.
The research issues addressed include:
-the role of contexts and goals in the control of perception
-the use of multiple resolution representation of two- and three-dimensional shapes
-the description at multiple levels of abstraction.
PROGRESS AND RESULTS
After the third year of the Action, the construction phase of an initial prototype vision system has been completed:
-A standard system architecture, which facilitates centralised and distributed control, has been suggested and implemented in a skeleton software system. The system includes a standard module architecture which can be replicated at all levels of the system. The software system provides extensive support for this module architecture, so that it permits easy integration of visual modules contributed by the various partners of the Action.
-The integrated skeleton system includes new developments of a controllable stereo camera head (mounted on a robot arm), generation of multiresolution pyramids for image representation, line extraction, linking and tracking. These operations are performed by newly developed special VME hardware for operation at 10 Hz.
-Visual modules for detailed image description and grouping, three-dimensional description, geometric three-dimensional modelling of the scene, scene interpretation and control of perception have been developed in a "skeleton-version" and integrated into the system.
-A final version of the system, capable of continuous goal-directed active vision and symbolic description of the content of a dynamic scene, has been demonstrated. Its functionality is still limited to a small application domain, but generic methods have been applied to a large extent, so extensions can be incorporated easily.
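To illustrate the multiresolution pyramids mentioned in the list above, here is a minimal software sketch in Python. It is not the Action's implementation, which ran on dedicated VME hardware. The 2x2 block-averaging filter and the names `downsample` and `build_pyramid` are assumptions chosen for illustration only; the source does not state which filter or data layout the system used.

```python
from typing import List

Image = List[List[float]]  # assumed layout: non-empty rectangular grid of grey values


def downsample(image: Image) -> Image:
    """Halve each dimension by averaging non-overlapping 2x2 blocks.

    Block averaging is an illustrative choice; an odd trailing row or
    column is simply dropped.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image: Image, min_size: int = 1) -> List[Image]:
    """Return [full resolution, half, quarter, ...].

    Levels are added until halving again would make either side
    smaller than min_size.
    """
    levels = [image]
    while (len(levels[-1]) // 2 >= min_size
           and len(levels[-1][0]) // 2 >= min_size):
        levels.append(downsample(levels[-1]))
    return levels


if __name__ == "__main__":
    # A 4x4 ramp image with values 0..15.
    ramp = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    for level in build_pyramid(ramp):
        print(len(level), "x", len(level[0]))
```

A coarse level holds a quarter of the pixels of the level below it, so a system can search the coarse levels first and examine fine levels only inside a selected region. This is one way that multiple resolutions can support a focus of attention.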
POTENTIAL
This Action will contribute to closing the gap between current vision approaches and techniques, and to the development of a true general-purpose vision system. In particular, it should greatly enlarge the potential applications of machine vision, opening up new opportunities for pre-competitive research and industrial applications.
The research has already had industrial spin-offs: it has led to one industrial hardware board, and several others are in the pipeline. Two controllable camera heads are presently in the process of being commercialised.

Coordinator

AALBORG UNIVERSITET
Address
Fredrik Bajers Vej, 7
9220 Aalborg East
Denmark

Participants (4)

Institut National Polytechnique de Grenoble
France
Address
46 Avenue Félix Viallet
38031 Grenoble
Institut National Polytechnique de Grenoble
France
Address
46 Avenue Félix Viallet
38031 Grenoble
LINKOEPING UNIVERSITY
Sweden
Address
Valla
83 Linkoeping
ROYAL INSTITUTE OF TECHNOLOGY
Sweden
Address

100 44 Stockholm