A New Foundation for Computer Graphics with Inherent Uncertainty

Periodic Reporting for period 2 - FUNGRAPH (A New Foundation for Computer Graphics with Inherent Uncertainty)

Reporting period: 2020-04-01 to 2021-09-30

Three-dimensional (3D) Computer Graphics (CG) images are omnipresent; everyone is familiar with CG imagery in computer games or in film special effects. In recent years, CG images have become so realistic that it is hard to distinguish them from reality, and even experts cannot tell whether a scene in a film is live action or computer generated. CG is now used in many other domains, such as architecture, urban planning, product design, advertising and training, and is the main technological component of Virtual and Augmented Reality.


However, creating digital assets for CG is extremely time-consuming, requiring literally hundreds of artists who painstakingly create the characters and the digital sets. The actual image generation using these assets, known technically as rendering, involves complex and expensive computation. Recently, several techniques have been developed to capture 3D assets using simple photos and videos, or specialized sensors. However, traditional rendering techniques cannot handle this captured data because it is inaccurate or, in more technical terms, suffers from uncertainty. It is also very hard to manipulate, since lighting and appearance are “frozen” to those at the time the photos or video were captured.


The overall objective of FUNGRAPH is to address both the difficulty of creating assets and the complexity of rendering, by explicitly handling uncertainty in the data and in the rendering process for the first time. If we achieve our goals, creating, manipulating and rendering 3D assets will become much more accessible, with far-reaching implications in all the application domains mentioned above. Simplifying the creation of such 3D assets and developing a unified solution for rendering captured and artist-created content will make the use of 3D data in such applications much easier. We hope that our results will play a central role in making 3D graphics more accessible, vastly broadening the usage of immersive 3D technologies in society. This will contribute towards the larger goal of making 3D as accessible as photos and video have become in the last few decades.


Achieving our objective requires us to provide a new foundation of CG rendering, with uncertainty playing a central role. We are developing new methodologies that identify uncertainty in the data and either correct it, or propagate the information to the rendering algorithms. This will allow users to more easily manipulate content and navigate in scenes created from captured data, and – in the medium/longer term – a mixture of captured and artist-created assets. We build heavily on modern machine learning techniques and contribute new methodologies based on training neural networks using artist-created data, and developing methods allowing trained neural networks to be used with real-world captured data. The new methodologies we develop will constitute a new Foundation for Computer Graphics rendering, based on a principled treatment of uncertainty, building largely on machine learning techniques.


The first 30 months of FUNGRAPH have revealed the utility of CG rendering for machine learning as well as the utility of machine learning for CG rendering. This synergy has already proven extremely beneficial in the algorithms developed so far, and we are confident that it will have widespread impact in the future.

In the first half of the project, we have advanced significantly towards our goals of processing and rendering uncertain data, as well as on rendering algorithms themselves. We first investigated traditional rendering algorithms that take artist-generated data as input. Most CG images in film are nowadays rendered using an algorithm called Monte Carlo path tracing. Intuitively, this algorithm simulates the propagation of light from the light sources to the eye along a set of paths, bouncing off surfaces with different materials. In technical terms, glossy reflections refer to light reflected from materials that are shiny, but rougher than mirrors. Path tracing in this case is very expensive. Instead, we developed a method that pre-computes images of glossy reflections stored in regularly placed structures called probes; at runtime, we query the probes to find the correct reflected color at a given pixel in the image [Rodriguez20a]. This query has a high level of uncertainty; our solution allows us to identify the correct information, and provides a filtering step that greatly improves image quality.
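
To give a feel for why Monte Carlo rendering is expensive, the following minimal Python sketch estimates the light reflected by a single matte surface point under a uniformly bright sky by averaging random light directions. It is an illustration only, not the probe-based method of [Rodriguez20a] and not a full path tracer: the function names, the uniform-sky setup and the matte (Lambertian) material are all assumptions made for this example.

```python
import math
import random


def sample_hemisphere(rng):
    """Uniformly sample a direction on the unit hemisphere around +z."""
    z = rng.random()  # cos(theta) is uniform in [0, 1) for uniform sampling
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)


def estimate_reflected_radiance(albedo, sky_radiance, num_samples, seed=0):
    """Monte Carlo estimate of light reflected by a matte surface point.

    Hypothetical setup: the surface faces +z and is lit by a sky of
    constant radiance, so the exact answer is albedo * sky_radiance.
    """
    rng = random.Random(seed)
    pdf = 1.0 / (2.0 * math.pi)   # density of uniform hemisphere sampling
    brdf = albedo / math.pi       # Lambertian (matte) reflectance
    total = 0.0
    for _ in range(num_samples):
        direction = sample_hemisphere(rng)
        cos_theta = direction[2]  # cosine between direction and the normal
        total += brdf * sky_radiance * cos_theta / pdf
    return total / num_samples


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, estimate_reflected_radiance(0.5, 2.0, n))
```

The estimate only approaches the exact value (here 0.5 × 2.0 = 1.0) as the number of samples grows, and the error shrinks roughly with the square root of the sample count. Glossy materials and multiple bounces make this convergence slower still, which is the cost that pre-computed probes aim to avoid.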

We have also studied the problem of estimating the appearance, or material properties, of real objects so that they can subsequently be used in CG imagery, focusing on methods that use a small number of photos as input. This provides a very simple way to create interesting materials for CG assets, which otherwise requires significant effort from trained artists using complex professional software. We built on work that allows a neural network to estimate material properties from a single photograph of a small patch of a material. This requires the separation of a photograph into layers corresponding to different aspects of appearance: a matte “base texture”, and separate layers explaining shiny appearance. To train a neural network we use artist-created assets that provide ground truth layers. Importantly, we use a rendering loss, i.e. we evaluate the accuracy of the estimation during training by rendering. The main novelty of our method is to combine multiple copies of the network, allowing the use of several photos to improve the estimate [Deschaintre19]. In follow-up work we also give artists control over captured materials at scale by allowing the combination of one large captured photo of an object with a set of specific material layers, either predefined or captured [Deschaintre20] (please see image 1).
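
The idea of a rendering loss can be sketched in a few lines of Python: instead of comparing estimated and ground-truth material layers directly, both are rendered under the same lighting and viewing conditions and the resulting images are compared. The sketch below is a simplified illustration for a single flat pixel; the shading model (diffuse plus Blinn-Phong highlight), the material parameter names and the list of conditions are assumptions for this example and are not taken from [Deschaintre19].

```python
import math


def _normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def render_pixel(material, light_dir, view_dir):
    """Shade one flat pixel (normal = +z) with a diffuse + Blinn-Phong model.

    `material` is a hypothetical dict with "diffuse", "specular" and
    "roughness" entries standing in for the estimated material layers.
    """
    normal = (0.0, 0.0, 1.0)
    l = _normalize(light_dir)
    v = _normalize(view_dir)
    h = _normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    n_dot_l = max(0.0, _dot(normal, l))
    n_dot_h = max(0.0, _dot(normal, h))
    # Lower roughness gives a sharper highlight (illustrative mapping).
    shininess = 2.0 / max(material["roughness"], 1e-3) ** 2
    diffuse = material["diffuse"] * n_dot_l
    specular = material["specular"] * (n_dot_h ** shininess) * n_dot_l
    return diffuse + specular


def rendering_loss(predicted, ground_truth, conditions):
    """Mean absolute difference between renderings of two materials."""
    errors = [
        abs(render_pixel(predicted, l, v) - render_pixel(ground_truth, l, v))
        for l, v in conditions
    ]
    return sum(errors) / len(errors)


# A few (light direction, view direction) pairs used for the comparison.
CONDITIONS = [
    ((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)),
    ((0.0, 1.0, 1.0), (0.0, 0.0, 1.0)),
    ((1.0, 0.0, 2.0), (-1.0, 1.0, 1.0)),
]

if __name__ == "__main__":
    truth = {"diffuse": 0.6, "specular": 0.3, "roughness": 0.2}
    guess = {"diffuse": 0.6, "specular": 0.3, "roughness": 0.5}
    print(rendering_loss(guess, truth, CONDITIONS))
```

Comparing renderings rather than raw layers means that an error is penalized according to how much it changes the final appearance, which is what ultimately matters when the material is used in a CG image.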

Once materials are estimated, if accurate geometry is available, users can change the lighting using traditional graphics rendering methods. An alternative to traditional rendering is Image-Based Rendering (IBR), which uses geometry reconstructed with Computer Vision techniques to re-project the input photos into the novel view requested by a user. In this case changing the lighting is very hard. We developed a new approach that allows relighting of outdoor scenes while allowing free-viewpoint navigation [Philip19]. One technical innovation is the use of a dual representation of geometry to train a neural network. We first render highly realistic images of artist-created scenes. We then run computer vision algorithms to perform 3D reconstruction using these images “as if they were photos”, giving two representations of the scene: one exact “ground truth” created by artists, and one that is equivalent to what is obtained from real-world photos, with high levels of uncertainty. We use this dual representation of scenes to train a neural network to correct errors in the geometric reconstruction. With this representation and a novel approach using color to enhance shadow information, our neural relighting method allows users to change the time of day of a scene captured in multiple photos, for example using a drone (see image 2). Again, rendering is used for the loss in training to estimate the quality of shadows. We are extending this work to indoor scenes, in the context of a novel neural renderer.
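
The re-projection step at the core of Image-Based Rendering, mentioned above, can be illustrated with a small Python sketch: a pixel of an input photo is lifted to a 3D point using the reconstructed depth, then projected into the novel view. The camera model here is a deliberate simplification and an assumption of this example (pinhole cameras that all look along +z, with no rotation); the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PinholeCamera:
    """Hypothetical axis-aligned pinhole camera looking along +z."""
    focal: float      # focal length in pixels
    cx: float         # principal point, x
    cy: float         # principal point, y
    position: tuple   # camera centre (x, y, z) in world space


def unproject(cam, u, v, depth):
    """Lift pixel (u, v) with a given depth to a 3D world-space point."""
    x = (u - cam.cx) * depth / cam.focal
    y = (v - cam.cy) * depth / cam.focal
    px, py, pz = cam.position
    return (x + px, y + py, depth + pz)


def project(cam, point):
    """Project a 3D world-space point into the camera; None if behind it."""
    px, py, pz = cam.position
    x, y, z = point[0] - px, point[1] - py, point[2] - pz
    if z <= 0.0:
        return None
    return (cam.focal * x / z + cam.cx, cam.focal * y / z + cam.cy)


def reproject(src, dst, u, v, depth):
    """Find where pixel (u, v) of the input photo lands in the novel view."""
    return project(dst, unproject(src, u, v, depth))


if __name__ == "__main__":
    photo = PinholeCamera(500.0, 320.0, 240.0, (0.0, 0.0, 0.0))
    novel = PinholeCamera(500.0, 320.0, 240.0, (0.1, 0.0, 0.0))
    print(reproject(photo, novel, 320.0, 240.0, 2.0))  # reconstructed depth
    print(reproject(photo, novel, 320.0, 240.0, 2.2))  # slightly wrong depth
```

Running the example with two slightly different depth values shows that an error in the reconstructed depth moves the re-projected pixel to a different location in the novel view. This is how uncertainty in the geometry turns into visible artifacts in the rendered image.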

A final set of results involves novel solutions to IBR algorithms. Capturing and rendering cars with IBR is notoriously hard, since their shiny nature and their windows make them very hard to reconstruct for computer vision algorithms. We developed a novel solution that processes the uncertainty in this geometric reconstruction, corrects the car body geometry, estimates the window surface and separates the reflections from the transmitted light in the windows [Rodriguez20b]. This allows us to define a novel rendering algorithm allowing high-quality free-viewpoint navigation in scenes with cars. Finally, we have developed a novel approach that estimates the uncertainty in the geometry and finds a good speed/quality tradeoff for IBR [Prakash21].

References

[Rodriguez20a] Simon Rodriguez, Thomas Leimkühler, Siddhant Prakash, Chris Wyman, Peter Shirley, George Drettakis, Glossy Probe Reprojection for Interactive Global Illumination, ACM Transactions on Graphics (SIGGRAPH Asia Conference Proceedings), Volume 39, Number 6, December 2020

[Deschaintre19] Valentin Deschaintre, Miika Aittala, Frédo Durand, George Drettakis, Adrien Bousseau, Flexible SVBRDF Capture with a Multi-Image Deep Network, Computer Graphics Forum (Proceedings of the Eurographics Symposium on Rendering), Volume 38, Number 4, July 2019

[Deschaintre20] Valentin Deschaintre, George Drettakis, Adrien Bousseau, Guided Fine-Tuning for Large-Scale Material Transfer, Computer Graphics Forum (Proceedings of the Eurographics Symposium on Rendering), Volume 39, Number 4, 2020

[Philip19] Julien Philip, Michaël Gharbi, Tinghui Zhou, Alexei Efros, George Drettakis, Multi-view Relighting Using a Geometry-Aware Network, ACM Transactions on Graphics (SIGGRAPH Conference Proceedings), Volume 38, Number 4, July 2019

[Nicolet20] Baptiste Nicolet, Julien Philip, George Drettakis, Repurposing a Relighting Network for Realistic Compositions of Captured Scenes, Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, May 2020

[Rodriguez20b] Simon Rodriguez, Siddhant Prakash, Peter Hedman, George Drettakis, Image-Based Rendering of Cars using Semantic Labels and Approximate Reflection Flow, Proceedings of the ACM on Computer Graphics and Interactive Techniques, Volume 3, Number 1, May 2020

[Prakash21] Siddhant Prakash, Thomas Leimkühler, Simon Rodriguez, George Drettakis, Hybrid Image-based Rendering for Free-view Synthesis, Proceedings of the ACM on Computer Graphics and Interactive Techniques, Volume 4, Number 1, May 2021

In the first half of FUNGRAPH, we have made significant progress in several domains: traditional rendering of artist-created assets, material estimation and relighting, and Image-Based Rendering.

Our glossy reflection rendering algorithm, by handling the uncertainty of reprojections, allows interactive viewing of environments with complex materials and interactive navigation with full global illumination effects at unprecedented quality. Possible applications of such a technique include architectural and real-estate visualization, or product presentation, where rendering realistic materials is very important.

Our innovative methodologies that use synthetic training data and rendering-based loss functions have demonstrated their power. Our material estimation algorithms, which use only a few photos and build on synthetic data for training, have advanced the field significantly, inspiring follow-up work. Our outdoor relighting solution is currently the only method capable of relighting multi-view datasets, such as those used in photogrammetry; this could have multiple applications, expanding the utility of such data for 3D asset creation.

Similarly, our Image-Based Rendering algorithms allow realistic rendering of notoriously difficult cases such as reflections on curved windows, enabling free-viewpoint navigation in scenes with cars for the first time. This has significant repercussions for interactive 3D visualization of any cityscape, and can potentially be used as a way to generate training data for learning algorithms (e.g. for autonomous driving).

Work currently submitted also presents significant advances in the field of neural rendering and relighting, and in the use of Generative Adversarial Network renderings in 3D environments.

In the next 30 months, we expect to advance significantly in the domain of neural rendering and relighting, which will allow us to bring together artist-created and captured input. These domains have seen an immense increase in interest in the last 18 months, and the field is moving forward rapidly. Our dual expertise in realistic rendering as well as image-based methods puts us in a favorable position to capitalize on these advances, helping us achieve the goals of FUNGRAPH. By the end of the project, we hope to have developed rendering methods that can handle heterogeneous data seamlessly, allowing casual users to easily capture a scene using photos or videos, and subsequently interactively navigate and edit these 3D scenes in a transparent manner, e.g. adding artist-created assets as required. The methodologies we have already developed and are currently advancing make us hopeful that we will make several significant breakthroughs in our field by the end of the project.
Image illustrating method of [Deschaintre20]; please see text "Work Performed"
Image illustrating method of [Philip21]; please see text "Work Performed"