Objective
"This project is at the interface between computer vision and linguistics: the aim is to have an algorithm generate relevant sentences that describe a scene given one or more images.
Scene understanding has been one of the central goals of computer vision for many decades. It involves various individual tasks, such as object recognition, action understanding and 3D scene recovery. One simple definition of the task is that scene understanding is equivalent to being able to generate meaningful natural language descriptions of a scene, an important problem in computational linguistics. Whilst even a child can do this with ease, a solution to this fundamental problem has remained elusive. This is because a large amount of research in computer vision has been very deep but not broad, leading to an in-depth understanding of edge and feature detectors, tracking, camera calibration, projective geometry, segmentation, denoising, stereo methods, object detection, etc. However, there has been only a limited amount of research on a framework for integrating these functional elements into a method for scene understanding.
Within this proposal I advocate a complete view of computer vision in which the scene is dealt with as a whole, and in which problems that most researchers normally consider distinct are unified into a common cost function or energy. I will discuss the form the energy should take and efficient algorithms for learning and inference. Our preliminary experiments indicate that such a unified treatment will lead to a paradigm shift in computer vision with a quantum leap in performance. We intend to build embodied demonstrators, including a prosthetic vision aid for the visually impaired. The World Health Organization gives a figure of over 300 million such people worldwide, which means that in addition to being transformative in the areas of linguistics, HCI, robotics and computer vision, this work will have a massive impact worldwide."
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: https://op.europa.eu/en/web/eu-vocabularies/euroscivoc.
- humanities > languages and literature > linguistics
- engineering and technology > electrical engineering, electronic engineering, information engineering > electronic engineering > sensors > optical sensors
- natural sciences > computer and information sciences > artificial intelligence > computer vision
- natural sciences > mathematics > pure mathematics > geometry
- engineering and technology > electrical engineering, electronic engineering, information engineering > electronic engineering > robotics
Call for proposal
ERC-2012-ADG_20120216
Funding Scheme
ERC-AG - ERC Advanced Grant
Host institution
OX1 2JD Oxford
United Kingdom