Skip to main content
Go to the home page of the European Commission (opens in new window)
English English
CORDIS - EU research results
CORDIS
Content archived on 2024-06-18

Towards Total Scene Understanding using Structured Models

Objective

"This project is at the interface between computer vision and linguistics: the aim is to have an algorithm generate relevant sentences that describe a scene given one or more images.

Scene understanding has been one of the central goals in computer vision for many decades. It involves various individual tasks, such as object recognition, action understanding and 3D scene recovery. One simple definition of this task is to say scene understanding is equivalent to being able to generate meaningful natural language descriptions of a scene, an important problem in computational linguistics. Whilst even a child can do this with ease, the solution of this fundamental problem has remained elusive. This is because there has been a large amount of research in computer vision that is very deep, but not broad, leading to an in depth understanding of edge and feature detectors, tracking, camera calibration, projective geometry, segmentation, denoising, stereo methods, object detection etc. However, there has been only a limited amount of research on a framework for integrating these functional elements into a method for scene understanding.

Within this proposal I advocate a complete view of computer vision, in which the scene is dealt with as a whole, in which problems which are normally considered distinct by most researchers are unified into a common cost function or energy. I will discuss the form the energy should take and efficient algorithms for learning and inference. Our preliminary experiments indicate that such a unified treatment will lead to a paradigm shift in computer vision with a quantum leap in performance. We intend to build embodied demonstrators including a prosthetic vision aid to the visually impaired. The World Health Organization gives a figure of over 300 million such people world wide, which means that in addition to being transformative in the areas of linguistics, HCI, robotics, and computer vision, this work will have a massive impact world wide"

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: https://op.europa.eu/en/web/eu-vocabularies/euroscivoc.

You need to log in or register to use this function

Call for proposal

ERC-2012-ADG_20120216
See other projects for this call

Host institution

THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
EU contribution
€ 2 493 495,00
Address
WELLINGTON SQUARE UNIVERSITY OFFICES
OX1 2JD Oxford
United Kingdom

See on map

Region
South East (England) Berkshire, Buckinghamshire and Oxfordshire Oxfordshire
Activity type
Higher or Secondary Education Establishments
Links
Total cost
No data

Beneficiaries (1)

My booklet 0 0