Video Browsing Exploration and Structuring

This is an algorithm to retrieve objects or scenes in a movie with the ease, speed and accuracy with which Google retrieves web pages containing particular words. The query is specified by outlining an object of interest in an image (e.g. a frame of the video) and the system returns a ranked list of shots from the entire movie containing the object. Objects are retrieved despite viewpoint or scale variations and some amount of lighting changes. A demonstration is available at: http://www.robots.ox.ac.uk/~vgg/research/vgoogle/

Methods have been developed for 3D reconstruction and animation of human motion from monocular video sequences. These methods are based a) on 3D modelling of human body parts whose 3D motion is estimated from the video input and b) on the use of action specific key frames and postures that are recognized automatically. The methods have applications in the field of visualization of sports in order to enhance the experience of the viewer. They also potentially will allow for developing tools used for education and coaching as well as by the video game industry.

Methods have been developed that takes as input video sequences depicting a human action with a specific limited repertoire. Based on spatial and spatiotemporal data the sequence in segmented into frames or blocks of frames containing similar actions. An example of this is the segmentation of a sequence of tennis playing into elementary strokes. The methods will allow for automatic editing of long video sequences and extraction of user specified information. This and other applications are being pursued by the partners.

Understanding the content of images is a problem with many applications. Whether it is a robotic system that needs to identify the objects it is instructed to manipulate, a medical system that needs to localize organs or tissues, an image retrieval system designed to return images whose content match a query, or a surveillance system that is expected to identify intruders or watch when objects are moved - all those systems can benefit from a high-quality segmentation algorithm. Image segmentation methods divide the image into regions of coherent properties in an attempt to identify objects and their parts. Once an image is segmented, the tasks of recognition, compression, and information retrieval can be vastly simplified. In spite of many attempts, existing methods for segmentation fail to achieve satisfactory results when tested on a large variety of natural images. This is due to the vast complexity of images. Regions of interest may differ from surrounding regions by any of a variety of properties, and these differences can be observed in some, but often not in all scales. A further complication is that coarse measurements, for detecting these properties, cannot be obtained by simple geometric averaging, because they would often average over properties of neighbouring segments, making it difficult to identify the segments and to reliably detect their boundaries. Thus this processing, which is easily carried out by a human observer, is extremely sophisticated and non-trivial to achieve with computer algorithms.

Deliverables

Share this page

Download