High-level Prior Models for Computer Vision

Project description

A seismic shift in computer vision

For years, computer vision has strived to match the abilities of the human visual system, yet it has fallen short. The EU-funded HOMOVIS project aims to bridge this gap between artificial vision and human-like perception. It builds on a three-layer architecture that mirrors the human visual system: a low-level layer that identifies critical image features, a mid-level layer that enables disocclusion and boundary completion, and a high-level layer responsible for object recognition. By integrating high-level priors into low-level variational models, HOMOVIS will introduce a unified mathematical framework intended to move the field beyond conventional variational models.

Objective

For more than 50 years, computer vision has been a very active research field, but it is still far from matching the abilities of the human visual system. The stunning performance of the human visual system can be mainly attributed to a highly efficient three-layer architecture: a low-level layer that sparsifies the visual information by detecting important image features such as image gradients, a mid-level layer that implements disocclusion and boundary completion processes, and finally a high-level layer that is concerned with the recognition of objects.
Variational methods are certainly among the most successful methods for low-level vision. However, it is very unlikely that these methods can be further improved without the integration of high-level prior models. Therefore, we propose a unified mathematical framework that allows for a natural integration of high-level priors into low-level variational models. In particular, we propose to represent images in a higher-dimensional space inspired by the architecture of the visual cortex. This representation decomposes the image gradients into magnitude and direction and hence lifts the 2D image to a 3D space. This has several advantages. Firstly, the higher-dimensional embedding makes it possible to implement mid-level tasks such as boundary completion and disocclusion processes in a very natural way. Secondly, the lifted space gives explicit access to the orientation and the magnitude of image gradients. In turn, distributions of gradient orientations, which are known to be highly effective for object detection, can be used as high-level priors. This inverts the bottom-up nature of object detectors and hence adds an efficient top-down process to low-level variational models.
The mathematical approaches developed in the project will go significantly beyond traditional variational models for computer vision and hence will define a new state of the art in the field.
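
The lifting described above can be illustrated with a minimal sketch. It is not the project's method: it assumes a discrete grayscale image stored as a list of rows, central-difference gradients, and a fixed number of orientation bins, whereas the objective describes a lifted space inside a variational framework. The names `image_gradients`, `lift`, `orientation_histogram` and the parameter `n_bins` are hypothetical and chosen only for this example. The sketch shows how a 2D image becomes a 3D (row, column, orientation) volume, and how a distribution of gradient orientations, the kind of quantity that could act as a high-level prior, can be read off that volume.

```python
import math


def image_gradients(img):
    """Central-difference gradients of a 2D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour indices at the border (replicate padding).
            xl, xr = max(x - 1, 0), min(x + 1, w - 1)
            yu, yd = max(y - 1, 0), min(y + 1, h - 1)
            gx[y][x] = (img[y][xr] - img[y][xl]) / 2.0
            gy[y][x] = (img[yd][x] - img[yu][x]) / 2.0
    return gx, gy


def lift(img, n_bins=8):
    """Lift a 2D image to a 3D (row, column, orientation) volume.

    Assumption for illustration: each pixel's gradient magnitude is
    stored in the single orientation bin that contains its direction.
    """
    gx, gy = image_gradients(img)
    h, w = len(img), len(img[0])
    vol = [[[0.0] * n_bins for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            mag = math.hypot(gx[y][x], gy[y][x])
            if mag == 0.0:
                continue  # flat region: no orientation to record
            theta = math.atan2(gy[y][x], gx[y][x]) % (2.0 * math.pi)
            k = int(theta / (2.0 * math.pi) * n_bins) % n_bins
            vol[y][x][k] = mag
    return vol


def orientation_histogram(vol):
    """Sum magnitudes per orientation bin and normalise to sum to 1."""
    n_bins = len(vol[0][0])
    hist = [0.0] * n_bins
    for row in vol:
        for cell in row:
            for k, m in enumerate(cell):
                hist[k] += m
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


if __name__ == "__main__":
    # A vertical step edge: dark on the left, bright on the right.
    step = [[0, 0, 1, 1] for _ in range(4)]
    print(orientation_histogram(lift(step)))
```

For the vertical step edge in the usage example, every nonzero gradient points in the positive x direction, so all of the histogram mass falls into the first orientation bin. A top-down prior in the sense of the objective would compare such a distribution with one expected for a given object class. How that comparison enters the variational energy is not specified in this text and is left out of the sketch.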

Funding Scheme

ERC-STG - Starting Grant

Host institution

TECHNISCHE UNIVERSITAET GRAZ

Net EU contribution

€ 1 473 525,00

Address

RECHBAUERSTRASSE 12
8010 Graz
Austria

Region

Südösterreich Steiermark Graz

Activity type

Higher or Secondary Education Establishments

Total cost

€ 1 473 525,00

Beneficiaries (1)

TECHNISCHE UNIVERSITAET GRAZ

Austria

Net EU contribution

€ 1 473 525,00
