CORDIS - EU research results

Spatial 3D Semantic Understanding for Perception in the Wild

Project description

Revolutionising the realm of visual 3D perception

In today's digital age, understanding the 3D spatial semantics of our world is a paramount challenge. Real-world environments, rich in complexity, demand comprehension in their true three-dimensional context, even when observed through 2D images. Yet robust 3D semantic reasoning from visual data, such as RGB or RGB-D observations, remains in its infancy, hindered by limited real-world 3D data and the intricate, high-dimensional nature of the problem. With this in mind, the ERC-funded SpatialSem project aims to harness the power of 3D perception and lay the foundation for groundbreaking advancements in machine perception, immersive communication, mixed reality, and architectural and industrial modelling. Specifically, the project aims to shift the focus from image-based reasoning to a spatially consistent 3D representation.


Understanding the 3D spatial semantics of the world around us is core to visual perception and digitization: real-world environments are spatially three-dimensional and must be understood in their 3D context, even from 2D image observations. This will lead to spatially grounded reasoning and higher-level perception of the world around us. Such 3D perception will provide the foundation for transformative, next-generation technology across machine perception, immersive communications, mixed reality, architectural or industrial modeling, and more. It will enable a new paradigm in semantic understanding that derives primarily from a spatially consistent 3D representation rather than relying on image-based reasoning that captures only projections of the world. However, 3D semantic reasoning from visual data such as RGB or RGB-D observations remains in its infancy, owing both to the challenge of learning from limited amounts of real-world 3D data and to the complex, high-dimensional nature of the problem. In this proposal, we will develop new algorithmic approaches to effectively learn robust visual 3D perception, with new learning paradigms for features, representations, and operators, to encompass 3D semantic understanding.


Host institution

Net EU contribution
€ 1 500 000,00
Arcisstrasse 21
80333 Muenchen

Bayern Oberbayern München, Kreisfreie Stadt
Activity type
Higher or Secondary Education Establishments
Total cost
€ 1 500 000,00

Beneficiaries (1)