Towards Total Scene Understanding using Structured Models

Objective

"This project is at the interface between computer vision and linguistics: the aim is to have an algorithm generate relevant sentences that describe a scene given one or more images.

Scene understanding has been one of the central goals of computer vision for many decades. It involves various individual tasks, such as object recognition, action understanding and 3D scene recovery. One simple definition is that scene understanding is equivalent to being able to generate meaningful natural language descriptions of a scene, itself an important problem in computational linguistics. Whilst even a child can do this with ease, a solution to this fundamental problem has remained elusive. This is because much research in computer vision has been very deep but not broad, leading to an in-depth understanding of edge and feature detectors, tracking, camera calibration, projective geometry, segmentation, denoising, stereo methods, object detection and so on. However, there has been only a limited amount of research on a framework for integrating these functional elements into a method for scene understanding.

Within this proposal I advocate a complete view of computer vision, in which the scene is dealt with as a whole and problems that most researchers normally consider distinct are unified into a common cost function or energy. I will discuss the form the energy should take and efficient algorithms for learning and inference. Our preliminary experiments indicate that such a unified treatment will lead to a paradigm shift in computer vision with a quantum leap in performance. We intend to build embodied demonstrators, including a prosthetic vision aid for the visually impaired. The World Health Organization gives a figure of over 300 million such people worldwide, which means that in addition to being transformative in the areas of linguistics, HCI, robotics and computer vision, this work will have a massive impact worldwide."

Call for proposal

ERC-2012-ADG_20120216

Host institution

THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
EU contribution
€ 2 493 495,00
Address
WELLINGTON SQUARE UNIVERSITY OFFICES
OX1 2JD Oxford
United Kingdom

Region
South East (England) > Berkshire, Buckinghamshire and Oxfordshire > Oxfordshire
Activity type
Higher or Secondary Education Establishments
Administrative Contact
Gill Wells (Ms.)
Principal investigator
Philip Hilaire Sean Torr (Prof.)
Total cost
No data

Beneficiaries (1)