ACTIPRET - Documents, results & resources
Project presentation
- Project description ( 55KB)
- Short Project presentation by Markus Vincze ( 244KB)
- Project partner details ( 55KB)
Demonstrations and presentations
A great overview of what ACTIPRET results could do for you is given in the IST Results news article "Flattening the learning curve for new technologies".
Demonstration of colour and hand tracking
Skin-coloured objects are detected with a Bayesian classifier which is bootstrapped with a small set of training data. On-line adaptation of skin-colour probabilities is used to enable the classifier to cope with illumination changes. Tracking over time is realized through a novel technique which can handle multiple skin-coloured objects. A prototype implementation of the developed system operates on 320x240 live video in real time (30Hz) on a conventional Pentium 4 processor. The proposed 2D tracker was extended to be able to report the 3D position of all skin-coloured regions in the field of view of a potentially moving stereoscopic camera system. The prototype implementation of the 3D version of the tracker also operates at 30 fps. On top of this functionality, the tracker is capable of delivering 3D contours of all skin-coloured regions; this is performed at a rate of 22 fps.
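The classification and adaptation steps described above can be illustrated with a minimal sketch. It assumes that colours are quantised into coarse histogram bins, that the posterior P(skin | colour) is obtained from Bayes' rule using counts, and that adaptation blends the bootstrapped posterior with one estimated from recent frames. The bin size, the blending weight `gamma` and all names are illustrative assumptions, not details of the actual prototype.

```python
from collections import Counter

BIN = 32  # assumed quantisation step per colour channel (illustrative)


def quantise(pixel):
    """Map an (r, g, b) pixel to a coarse histogram bin."""
    return tuple(channel // BIN for channel in pixel)


class SkinHistogram:
    """Histogram-based Bayesian skin-colour model (hypothetical helper)."""

    def __init__(self):
        self.skin = Counter()   # per-bin counts of pixels labelled as skin
        self.total = Counter()  # per-bin counts of all pixels seen

    def train(self, pixels, labels):
        """Accumulate counts from labelled pixels (label True means skin)."""
        for pixel, is_skin in zip(pixels, labels):
            colour = quantise(pixel)
            self.total[colour] += 1
            if is_skin:
                self.skin[colour] += 1

    def posterior(self, pixel):
        """Return P(skin | colour).

        By Bayes' rule, P(c|s) P(s) / P(c) reduces to skin[c] / total[c]
        when all three terms are estimated from the same counts.
        """
        colour = quantise(pixel)
        if self.total[colour] == 0:
            return 0.0  # colour never observed: treat as non-skin
        return self.skin[colour] / self.total[colour]


def adapted_posterior(offline, recent, pixel, gamma=0.8):
    """Blend the bootstrapped model with one built from recent frames.

    gamma is an assumed weight; a lower value follows illumination
    changes faster but trusts the training data less.
    """
    return gamma * offline.posterior(pixel) + (1.0 - gamma) * recent.posterior(pixel)
```

In use, `offline` would be trained once on the small bootstrap set, while `recent` would be rebuilt from the pixels classified in the last few frames; a pixel is then labelled as skin when `adapted_posterior` exceeds a chosen threshold.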
Applications demonstrated are human-computer interaction, e.g., to control the computer mouse from a web-cam on top of the screen based on finger detection in skin-coloured regions corresponding to human hands, and tracking colour blobs in vision-based robot navigation experiments. The techniques are exploited in the MultiSens project.
Web page: http://www.ics.forth.gr/~argyros/research/colortracking.htm
Demonstration of finger detection
We have developed a method for detecting the fingers of human hands. Hands are found by detecting and tracking skin-coloured regions. Fingers are then detected by multi-scale processing of blob contours. Combined with the 3D tracking capabilities of the skin-colour detector and tracker, finger detection can give very useful information on the 3D position of fingertips. This type of information is particularly useful in the context of cognitive vision systems such as the ActIPret demonstrator, whose goal is the interpretation of the activities of people handling tools. Moreover, finger detection can provide rich perceptual input to gesture recognition systems. We have already developed a system that permits a human to control the mouse of a computer. The demonstrator has been employed successfully in real-world situations where a human controls the computer during MS PowerPoint presentations.
Web page: http://www.ics.forth.gr/~argyros/research/fingerdetection.htm
Demonstration of wide baseline matching and object recognition
The problem of wide-baseline matching arises when a set of images taken by uncalibrated cameras from significantly different positions and/or at different times has to be registered (in the case of a 2D scene), or when the epipolar geometry is sought (for 3D scenes). A software package solving the problem has been developed. The following service is available: a set of images of the same scene is sent to the Centre of Machine Perception, and the registration information, in terms of homographies (in 2D) or epipolar geometries (in 3D), is sent back. A login/password can be obtained by sending an e-mail.
Web Page: http://nash.felk.cvut.cz/~wbsdemo/
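To illustrate what the registration information returned by the wide-baseline matching service means, the sketch below applies a 3x3 homography to a point (the 2D case) and evaluates the epipolar constraint for a pair of points (the 3D case). The matrices and function names are made up for illustration; the actual output format of the service is not described here.

```python
def apply_homography(H, point):
    """Map (x, y) from the first image into the second using a 3x3 matrix H."""
    x, y = point
    # Lift to homogeneous coordinates (x, y, 1) and multiply by H.
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the last coordinate to return to image coordinates.
    return (xh / w, yh / w)


def epipolar_residual(F, p, q):
    """Return q^T F p for points p (first image) and q (second image).

    For a correct correspondence and fundamental matrix F the value is 0.
    """
    px, py = p
    qx, qy = q
    # Epipolar line l = F p in the second image.
    line = [F[i][0] * px + F[i][1] * py + F[i][2] for i in range(3)]
    return qx * line[0] + qy * line[1] + line[2]


# Hypothetical homography: scale by 2, then translate by (10, 5).
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0, 5.0],
     [0.0, 0.0, 1.0]]

# Hypothetical fundamental matrix of a purely horizontal camera shift:
# corresponding points must lie on the same image row.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
```

With these example matrices, `apply_homography(H, (3.0, 4.0))` maps the point into the second image, and `epipolar_residual(F, (3.0, 4.0), (7.0, 4.0))` vanishes because both points share the same row.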
The system recognises objects in still images. The objects can be of arbitrary shape. The method is highly robust to occlusion, background clutter, change of scale and orientation of the object. Object models are acquired automatically by machine learning methods. For flat objects, only a single training image is needed. The application potential is high especially for recognition of man-made objects, text in video, license plates, building recognition etc.
The BMVC 2002 paper that received the "Best Science Paper" prize:
Matas, Chum, Urban, Pajdla: Robust Wide baseline Stereo from Maximally Stable Extremal Regions, British Machine Vision Conference, Bristol, 2002 (the paper is available on-line at http://waltz.felk.cvut.cz/~matas/papers/matas-bmvc02.pdf).
For the work on object recognition, Jiří Matas was awarded the First Class Award for a Scientific Result by the Chancellor (rector) of the Czech Technical University in 2003.
The core code is available on-line; see also "Software and tools" below.
Web page: http://www.robots.ox.ac.uk/~vgg/research/affine/detectors.html
Gesture recognition
Partner COGS has developed hand-gesture learning and recognition techniques to be used in advanced vision applications, such as the ActIPret system, for understanding the activities of expert operators for education and training. Radial Basis Function (RBF) networks have been developed for reactive vision tasks and work well, exhibiting fast learning and classification. This is exploitable in intelligent spaces or surveillance scenarios. It is presently exploited in a proposal including two industrial partners.
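The fast learning attributed to RBF networks above can be illustrated with a minimal sketch: with Gaussian hidden units centred on the training samples, training reduces to solving one linear system for the output weights. The feature vectors, labels, kernel width and all names are illustrative assumptions; the features and network layout actually used by COGS are not described here.

```python
import math


def gaussian(x, centre, width):
    """Gaussian radial basis function of the distance between x and centre."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, centre))
    return math.exp(-d2 / (2.0 * width ** 2))


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


class RBFNetwork:
    """Single-output RBF network with one hidden unit per training sample."""

    def __init__(self, width=1.0):
        self.width = width  # assumed kernel width
        self.centres = []
        self.weights = []

    def fit(self, samples, targets):
        # Centres are the (distinct) training samples themselves, so the
        # output weights follow from a single linear solve.
        self.centres = [list(s) for s in samples]
        phi = [[gaussian(x, c, self.width) for c in self.centres] for x in samples]
        self.weights = solve(phi, [float(t) for t in targets])

    def predict(self, x):
        """Weighted sum of hidden-unit activations; threshold at 0.5 to classify."""
        return sum(w * gaussian(x, c, self.width)
                   for w, c in zip(self.weights, self.centres))


# Made-up 2D feature vectors for two gesture classes (0 and 1).
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
targets = [0, 0, 1, 1]
net = RBFNetwork(width=1.0)
net.fit(samples, targets)
```

A practical system would typically use fewer centres than samples and one output per gesture class; the exact-interpolation variant is used here only because it keeps the example short.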
Active Control of Cognitive Multi-Robot Multi-Sensor System
Concept and implementation of a novel attentive/investigative control for an active cognitive multi-sensor/multi-robot system. Control is based on a dynamic "shared responsibility" concept that relies on the (sub-)optimal integration of local decisions to produce "intelligent" global behaviour. This decouples the components and allows efficient use and extension, because each component needs only locally relevant information. A simple version has been exploited in the FibreScope project (see http://www.profactor.at/232.0.html) and, with several companies, in the MultiSens project (http://www.multisens.org/).
Software and tools
Open source Framework for Cognitive Vision Systems
During ActIPret a Framework for Cognitive Vision Systems was developed. The underlying software concept is founded on a component-based approach that has proved successful in software engineering projects. It is faster than classical agent systems, which helps to guarantee reactivity, and it is easier to distribute than conventional object-oriented approaches, whose scalability ends at the boundary of a single computer. It implements a rapid "Yellow Pages" dynamic look-up directory that makes it possible to select on-line the best available service according to a formalised performance characterisation. The resulting architecture therefore adapts dynamically to suit the currently required tasks and the changing environment. The framework software is available from the authors. It has already been exploited in the Austrian Cognitive Vision Project and is presently used by more than 10 institutions.
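The "Yellow Pages" idea can be sketched as a small service directory: components register the services they offer together with a performance score, and a client asks for the best currently available provider. The class names, fields and the single numeric `quality` score are hypothetical simplifications and do not reflect the framework's real interfaces.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str               # component offering the service
    kind: str               # what it provides, e.g. "hand-tracker"
    quality: float          # assumed performance score, higher is better
    available: bool = True  # components may come and go at run time


class YellowPages:
    """Dynamic look-up directory (illustrative, not the framework API)."""

    def __init__(self):
        self._services = []

    def register(self, service):
        self._services.append(service)

    def lookup(self, kind):
        """Return the best available service of the given kind, or None."""
        candidates = [s for s in self._services
                      if s.kind == kind and s.available]
        if not candidates:
            return None
        return max(candidates, key=lambda s: s.quality)


directory = YellowPages()
fast = Service("tracker-a", "hand-tracker", quality=0.6)
accurate = Service("tracker-b", "hand-tracker", quality=0.9)
directory.register(fast)
directory.register(accurate)
```

Because the selection happens at look-up time, marking `accurate` as unavailable makes the next `directory.lookup("hand-tracker")` fall back to `fast` without any change to the client.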
Also see an upcoming publication by W. Ponweiser, M. Vincze, M. Zillich: A Software Framework to Integrate Vision and Reasoning Components for Cognitive Vision Systems; Robotics and Autonomous Systems.
Automatic Camera Calibration using Automatic Ellipse Detection
The ellipse detection algorithm makes use of an efficient grouping of arc segments based on tangent intersections. The exponentially large number of possible groupings of arcs is reduced to linear complexity, which makes the search for consistent groupings extremely fast. Consequently, ellipses in realistic scenes can be detected very reliably within a fraction of a second on conventional PCs, as tests on a large number of real-world images have demonstrated. As a first exploitation, the ellipse detector has been used in a tool for automatic camera calibration, where the circular calibration markers can now be found without manual interaction. This is the calibration template.
Also see publications by M. Zillich, J. Matas: "Ellipse Detection using Efficient Grouping of Arc Segments", Workshop of the Austrian Association of Pattern Recognition ÖAGM/AAPR, pp.143-148, 2003 and Michael Zillich, E. Al-Ani: Camcalb: A user friendly camera calibration software; Workshop of the Austrian Association of Pattern Recognition ÖAGM/AAPR, pp.111-116, Hagenberg, 2004.
Tool for Model-based Object Tracking under Real World Conditions
V4R (Vision for Robotics) is a tool that enables the visual tracking of objects (e.g., lamps, books, rooms, structures) using geometrical features (lines, corners, ellipses, arcs). It uses a model-based approach and has been designed to operate in realistic conditions (e.g., in a shipyard, in indoor environments with windows and changing lighting, or partially outdoors). To achieve the necessary robustness, several model and image cues are integrated, evaluated and then fitted. It has been extensively tested in several EU project applications (RobVision, ActIPret) and in different environments (a shipyard, several different offices). It can be exploited in Augmented Reality applications to track objects, in robotics, automation and manufacturing, and in service applications such as surveillance or air traffic control at airports.
Web page: http://www.acin.tuwien.ac.at/groups/robtec/V4r/index.htm
Maximally Stable Extremal Regions on-line
The essential component of the object recognition method developed in the ActIPret project, the detector of affinely covariant Maximally Stable Extremal Regions (MSERs), is available on-line.
Web page: http://www.robots.ox.ac.uk/~vgg/research/affine/detectors.html
Publications
- Task and Behaviour Learning for ActIPret Project ( 114KB) - Hilary Buxton
- Task-based (Cognitive) Control for ActIPret Project ( 87KB) - Hilary Buxton
- Generative models for learning and understanding dynamic scene activity ( 690KB) - Hilary Buxton
- Multi-class Support Vector Machine ( 73KB) - Vojtěch Franc & Václav Hlaváč
- Developing Task-Specific RBF Hand Gesture Recognition ( 671KB) - A. Jonathan Howell, Kingsley Sage & Hilary Buxton
- Developing Context Sensitive HMM Gesture Recognition ( 999KB) - Kingsley Sage, A. Jonathan Howell & Hilary Buxton
- Active Vision Techniques for Visually Mediated Interaction ( 48KB) - A. Jonathan Howell & Hilary Buxton
- Integration Frameworks for Large Scale Cognitive Vision Systems - an Evaluative study ( 92KB) - S. Wrede, C. Bauckhage, G. Sagerer, W. Ponweiser & M. Vincze
- Joint spatial and temporal structure learning for task based control ( 70KB)
- Edge Projected Integration of Image and Model Cues for Robust Model-Based Object Tracking ( 713KB) - M. Vincze, M. Ayromlou, W. Ponweiser & M. Zillich
- Learning and understanding dynamic scene activity: a review ( 326KB) - Hilary Buxton
- Robust Real-time Tracking of Ellipse Arcs ( 364KB) - J. Biber, M. Vincze, M. Ayromlou & W. Ponweiser
- Ellipse detection using efficient grouping of arc segments ( 477KB) - Michael Zillich & Jiří Matas
- Camcalb: A User-friendly Camera Calibration Software ( 765KB) - Michael Zillich & Elaf Al-Ani
- The Role of Task Control and Context in Learning to Recognise Gesture ( 1.61MB) - Hilary Buxton, A. Jonathan Howell & Kingsley Sage
This page is maintained by: Björn Juretzki
