
Closing the 4D Real World Reconstruction Loop

Periodic Reporting for period 4 - 4DRepLy (Closing the 4D Real World Reconstruction Loop)

Reporting period: 2023-03-01 to 2023-08-31

The 4DRepLy project develops entirely new approaches to capturing models of the real world in motion, i.e., highly detailed models of geometry, motion, materials, and illumination, from a single camera view. To this end, new methodologies are developed that combine learning-based and generative model-based reconstruction in previously unseen ways, such that both components can be jointly trained and refined in a weakly supervised or unsupervised way on a continuous inflow of real-world data.
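
To make the core idea concrete, the following is a minimal, purely illustrative sketch (PyTorch; the class names, sizes, and the linear toy renderer are our own assumptions, not the project's actual models): a neural encoder regresses the parameters of an explicit generative scene model, a differentiable forward model renders them back to an image, and a photometric loss provides supervision from unlabeled images alone.

    import torch
    import torch.nn as nn

    IMG = 64  # toy image resolution

    class Encoder(nn.Module):
        """CNN that regresses explicit model parameters from a single image."""
        def __init__(self, n_params=16):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.Flatten(),
                nn.Linear(64 * (IMG // 4) ** 2, n_params),
            )
        def forward(self, img):
            return self.net(img)

    class DifferentiableForwardModel(nn.Module):
        """Stand-in for a differentiable renderer of an explicit scene model
        (here a fixed linear, blendshape-style basis); a real system would
        use a physically based differentiable renderer instead."""
        def __init__(self, n_params=16):
            super().__init__()
            self.register_buffer("basis", torch.randn(n_params, 3 * IMG * IMG) * 0.1)
        def forward(self, params):
            return (params @ self.basis).view(-1, 3, IMG, IMG)

    encoder, renderer = Encoder(), DifferentiableForwardModel()
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

    # one self-supervised training step on an unlabeled batch of real images
    real = torch.rand(8, 3, IMG, IMG)             # stand-in for in-the-wild frames
    params = encoder(real)                        # image -> explicit parameters
    rendered = renderer(params)                   # parameters -> image (differentiable)
    loss = nn.functional.mse_loss(rendered, real) # photometric reconstruction loss
    opt.zero_grad(); loss.backward(); opt.step()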

The foundational problems investigated in 4DRepLy open up new possibilities for visual computing technology that bring together computer graphics, computer vision, and machine learning techniques in the real world. Society will benefit from these possibilities in many ways. They enable new means of cultural and creative expression, since computer graphics content can be created more efficiently and at higher quality. The new techniques also change how we communicate with each other and how we naturally interact with the intelligent computing systems and assistants of the future. They will empower greatly improved virtual and augmented reality scenarios, build foundations for new photo-real immersive telepresence systems, and enable new types of human-machine interaction. Further, the insights gained in 4DRepLy lay the algorithmic foundations for advanced approaches to visual scene reconstruction and visual scene understanding, which are essential preconditions for future intelligent and autonomous systems that need to perceive and understand the human world in order to assist humans and to act and interact with it safely. We also believe that the advanced capture approaches developed in 4DRepLy, notably the human reconstruction methods, will benefit other domains of research such as biomechanics, medicine, and cognitive science.
4DRepLy rethought fundamental concepts in visual computing and machine learning and developed new ways of uniting them in the real world. The result is profoundly advanced ways to capture, represent, and synthesize the real world in motion at significantly improved efficiency, robustness, and accuracy.

The project took unconventional methodical paths by investigating fundamentally new ways to combine machine learning-based and explicit model-based or expert-designed representations and algorithms. Here, we made important advances on several fronts, each an important building block of the overall research program:

1) extending classical explicit representations such that they can be automatically adapted, combined, and trained end-to-end with deep learning-based approaches;

2) advancing neural network-based approaches such that they can be combined with explicit models and are geared to learn more semantically plausible representations of scenes, as well as algorithms that use these representations for reconstruction and synthesis;

3) developing foundational concepts for different degrees of integration of learning-based and explicit approaches for reconstruction and synthesis, ranging from approaches that exercise weak integration of the two up to approaches that enable full end-to-end integration and training of both explicit and learning-based components (see the sketch after this list);

4) devising new strategies to train and refine such integrated methods on a continuous inflow of unlabeled or weakly labeled real-world observations.
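
The following toy sketch (PyTorch; all names and the linear stand-in model are hypothetical, not the project's code) contrasts two of the integration degrees named in point 3 above: in the "weak" variant the expert-designed forward model stays fixed, while in the "full" variant its components are trained end-to-end together with the learned regressor.

    import torch
    import torch.nn as nn

    class ExplicitModel(nn.Module):
        """Toy expert-designed model: observation = mean + params @ basis
        (a blendshape-style linear model standing in for a real scene model)."""
        def __init__(self, n_params=8, dim=256, trainable=False):
            super().__init__()
            basis, mean = torch.randn(n_params, dim) * 0.01, torch.zeros(dim)
            if trainable:  # "full" integration: expert model is refined end-to-end too
                self.basis, self.mean = nn.Parameter(basis), nn.Parameter(mean)
            else:          # "weak" integration: expert model stays fixed
                self.register_buffer("basis", basis)
                self.register_buffer("mean", mean)
        def forward(self, params):
            return self.mean + params @ self.basis

    regressor = nn.Linear(256, 8)              # learned component: observation -> parameters
    model = ExplicitModel(trainable=True)      # set False for the weakly integrated variant
    opt = torch.optim.Adam(list(regressor.parameters()) + list(model.parameters()), lr=1e-3)

    obs = torch.randn(16, 256)                 # stand-in for unlabeled observations
    recon = model(regressor(obs))              # fully differentiable reconstruction pipeline
    loss = nn.functional.mse_loss(recon, obs)  # self-supervised reconstruction loss
    opt.zero_grad(); loss.backward(); opt.step()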

Our fundamental rethinking of concepts in graphics, vision, and machine learning benefited from our unique strategy of deeply combining advanced forward models from graphics with concepts from vision and machine learning in the real world in entirely new end-to-end ways. Individual methodical aspects of our overarching goal were investigated in individual sub-projects published within 4DRepLy. The following are examples.

We presented groundbreaking new methods for dynamic face reconstruction, alongside learning of full parametric scene models, from weakly labeled or unlabeled in-the-wild data (CVPR 2019, two papers at CVPR’21). We also presented entirely new means to capture and reconstruct human motion, dense deforming human surface geometry, and appearance from a single camera at state-of-the-art fidelity (ACM TOG’19; CVPR’20 Best Student Paper Honorable Mention). Further innovations were first-of-their-kind methods for real-time monocular multi-person motion capture (SIGGRAPH’20, EG’23), capture of humans in scene contexts (SIGGRAPH’20, ECCV’22), and capture of the shape and motion of two hands in close interaction from a single color camera (SIGGRAPH Asia’20). The project further introduced pioneering new methods to reconstruct and photo-realistically render general static (NeurIPS’20, ECCV’22) and dynamic scenes (CVPR’21), as well as humans and human faces (SIGGRAPH’19, SIGGRAPH Asia’19, SIGGRAPH’20), even under new motion, directly from single- or multi-view video (SIGGRAPH’21, SIGGRAPH Asia’21, SCA’23, NeurIPS’23); the basis is new ways to integrate explicit and neural implicit models. Further, the project presented pioneering generative models for 2D and 3D data with greatly enhanced disentanglement (CVPR’22) or real-time geometric controllability (SIGGRAPH’23).
As the examples above show, the insights gained in 4DRepLy enabled us to make pioneering contributions to a new field in visual computing termed neural rendering. Neural rendering enables highly realistic synthesis of images and videos in a data-driven way, without having to resort to the established complex and time-consuming scene-modeling and light-transport-based rendering approaches. Our new ways of combining explicit model-based and learning-based approaches for image synthesis were instrumental here.
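
As one illustration of the neural-rendering idea, here is a heavily simplified NeRF-style sketch (PyTorch; the sizes, names, and sampling scheme are our own assumptions, not the project's methods): a coordinate MLP maps a positionally encoded 3D point to density and color, and pixel colors are obtained by alpha-compositing samples along each camera ray.

    import torch
    import torch.nn as nn

    def positional_encoding(x, n_freqs=6):
        """Map coordinates to sines/cosines of increasing frequency."""
        feats = [x]
        for i in range(n_freqs):
            feats += [torch.sin(2.0 ** i * x), torch.cos(2.0 ** i * x)]
        return torch.cat(feats, dim=-1)

    class RadianceField(nn.Module):
        def __init__(self, n_freqs=6):
            super().__init__()
            in_dim = 3 * (2 * n_freqs + 1)
            self.mlp = nn.Sequential(
                nn.Linear(in_dim, 128), nn.ReLU(),
                nn.Linear(128, 128), nn.ReLU(),
                nn.Linear(128, 4),                    # -> (density, r, g, b)
            )
        def forward(self, pts):
            out = self.mlp(positional_encoding(pts))
            sigma = torch.relu(out[..., :1])          # non-negative density
            rgb = torch.sigmoid(out[..., 1:])         # colors in [0, 1]
            return sigma, rgb

    def render_rays(field, origins, dirs, near=0.0, far=1.0, n_samples=32):
        """Alpha-composite sampled colors along each ray (volume rendering)."""
        t = torch.linspace(near, far, n_samples)                          # (S,)
        pts = origins[:, None, :] + t[None, :, None] * dirs[:, None, :]   # (R, S, 3)
        sigma, rgb = field(pts)                                           # (R, S, 1/3)
        delta = (far - near) / n_samples
        alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * delta)               # (R, S)
        trans = torch.cumprod(torch.cat(
            [torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1),
            dim=-1)[:, :-1]                                               # transmittance
        weights = alpha * trans
        return (weights[..., None] * rgb).sum(dim=1)                      # (R, 3) pixels

    field = RadianceField()
    origins = torch.zeros(4, 3)
    dirs = torch.tensor([[0.0, 0.0, 1.0]]).expand(4, 3)
    pixels = render_rays(field, origins, dirs)   # differentiable w.r.t. field weights

Because the rendered pixels are differentiable with respect to the field's weights, such a representation can be fit directly to posed photographs with a simple photometric loss, which is what makes the data-driven synthesis described above possible.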

The many methodical insights gained in the project were disseminated in more than 100 publications (in the best conferences and journals in computer graphics, computer vision, and machine learning) and technical reports. Further, many results from the project, e.g. on neural rendering or on interactive control of generative image models, were widely reported in general media outlets worldwide. The project also contributed widely used research code bases and datasets.
Overall, the scientific outcome of the project towards the overarching objectives was excellent. The previous section explained how the far-reaching insights and new methodologies laid the foundations for groundbreaking methodical advances in visual reconstruction, modeling, and synthesis. Our results showed important improvements over the state of the art in terms of new methodologies, as well as in terms of performance, quality, and robustness. The results of the project also built the methodical foundations for the long-term overarching goal: a new paradigm for jointly creating advanced new methods to represent, reconstruct, and synthesize models of complex real-world scenes. In several sub-projects of 4DRepLy we have shown important steps towards, and domain-specific first realizations of, this loop. In the long term, the 4D Real World Reconstruction loop will allow us to automatically and continuously train and refine new integrated model-based and learning-based approaches for reconstruction and synthesis on a continuously growing corpus of unlabeled or weakly labeled real-world image and video observations.
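
In its simplest form, such a continuously running reconstruction loop can be pictured as below (PyTorch; a deliberately minimal, hypothetical sketch with a toy autoencoder standing in for an integrated reconstruction-and-synthesis model): the model is refined step by step as new unlabeled observations stream in, with reconstruction itself providing the training signal.

    import torch
    import torch.nn as nn

    # toy stand-in for an integrated model-based / learning-based reconstructor
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 128))
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    def observation_stream(n_batches=100):
        """Stand-in for a continuously growing corpus of unlabeled real-world data."""
        for _ in range(n_batches):
            yield torch.randn(32, 128)

    for batch in observation_stream():
        recon = model(batch)                             # reconstruct each observation
        loss = nn.functional.mse_loss(recon, batch)      # self-supervised objective
        opt.zero_grad(); loss.backward(); opt.step()     # refine on the new data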