CORDIS - EU research results

Scan2CAD: Learning to Digitize the Real World

Periodic Reporting for period 3 - Scan2CAD (Scan2CAD: Learning to Digitize the Real World)

Reporting period: 2022-01-01 to 2023-06-30

We address the digitization of real-world content, i.e. learning to automatically create CAD-like content with AI. We propose several novel methods for this problem, ranging from retrieving 3D CAD models from a shape database to producing high-quality geometric scans from the color and depth data of a commodity camera sensor. We also develop methods for scene understanding (e.g. object detection using text as the input query), geometry creation, and novel view synthesis of a target environment. Our current focus is on indoor environments, but ultimately we want our solutions to generalize to any type of environment.

The use of technology has increased rapidly over the past several decades, with more and more of our daily activities relying on digital content. Digitizing real-world objects would change the way we interact with the world. It would let us get a realistic look at a new apartment, see what new furniture looks like in our homes before buying it, make digital conferences feel closer to real ones, and significantly speed up the creation of 3D content for movies or video games.

The overall objectives of our project are:
- learn how humans model 3D content
- create a photo-realistic replica of the real world
- allow computers to perceive the real world the same way humans do (e.g. object detection or classification)
- create holograms as the step beyond virtual/augmented reality
So far, we have developed the following algorithms within the Scan2CAD project:
- Automatic retrieval and alignment of CAD models to real-world objects using a shape database
- Neural 3D scene reconstruction with a database
- 3D Semantic Instance Segmentation of RGB-D Scans
- 3D Mesh creation from unstructured range scans
- Context-aware captioning of RGB-D scans
We set a new state of the art for CAD alignment on top of 3D scans. We also significantly improved the underlying neural networks that automatically make these predictions, introduced new scene representations, and applied reinforcement learning to 3D modeling with agents trained in a self-supervised manner. Our publications show improvements over previous work in accuracy, robustness, and runtime, among other measures. For the remainder of the project, we are working on other challenging problems in 3D scene reconstruction and 3D scene understanding.
Teaser image: reinforcement learning for modeling