Community Research and Development Information Service - CORDIS


CapReal Report Summary

Project ID: 335545
Funded under: FP7-IDEAS-ERC
Country: Germany

Mid-Term Report Summary - CAPREAL (Performance Capture of the Real World in Motion)

CapReal develops the algorithmic foundations of the next generation of performance capture methods. The long-term goal is to enable dynamic shape, motion and appearance reconstruction at previously unseen detail, in general scenes (also outdoors), and with only a few cameras. In the project, we research foundational algorithmic questions on the boundary between computer vision and computer graphics. We made important progress in all core subtasks (work packages) researched in the project. The research team has steadily grown, and research on all sub-projects has successfully progressed. In total, 36 peer-reviewed papers were published in the reporting period, including several papers at the top computer graphics conferences (ACM SIGGRAPH (3), ACM SIGGRAPH Asia (6), EUROGRAPHICS (2); published in special journal issues of ACM TOG and CGF), the top vision and HCI conferences (CVPR (5), ICCV (4), ACM SIGCHI (1)), several papers in the top graphics and vision journals (6), and an edited book on Digital Representations of the Real World with CRC Press.

In the following, we highlight a few milestone results, in particular outcomes of cross-disciplinary relevance and results benefiting from unconventional research approaches.
We fused and extended ideas from computer graphics and computer vision in a new way to enable new inverse rendering methods. These methods estimate much more detailed models of shape, illumination and reflectance from sparse imagery recorded in less controlled environments than was previously possible. This enables us, in turn, to perform shading-based refinement in general scenes at much higher detail than previously feasible, to estimate much more detailed appearance and illumination models in uncalibrated environments, and to use these extracted models to improve correspondence finding and 4D reconstruction in general scenes. In conjunction with new high-performance non-linear solvers we developed, even dense real-time reconstruction and inverse rendering from stereo or single camera views are, for the first time, feasible for certain types of scenes.
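To give a flavour of the shading-based reasoning involved, the sketch below (our illustration, not project code) estimates low-order spherical harmonics lighting from surface normals and observed pixel intensities under a Lambertian, constant-albedo assumption. Inverting this kind of per-pixel shading model is a typical building block of shading-based refinement; the function names are ours.

```python
import numpy as np

def sh_basis(normals):
    # First-order (4-term) spherical harmonics basis evaluated at unit normals.
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([np.ones_like(x), x, y, z], axis=1)

def estimate_lighting(normals, intensities):
    # Least-squares fit of SH lighting coefficients to observed intensities,
    # assuming Lambertian reflectance with constant albedo.
    B = sh_basis(normals)
    coeffs, *_ = np.linalg.lstsq(B, intensities, rcond=None)
    return coeffs

def shade(normals, coeffs):
    # Forward model: re-render intensities under the estimated lighting.
    return sh_basis(normals) @ coeffs
```

Once lighting is estimated this way, the forward model can be differentiated with respect to the normals, which is what drives per-pixel geometry refinement.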

The project led to new scene representation and 4D reconstruction algorithms that scale to dense scenes (many scene elements, difficult deformations, occlusions, apparent topology changes, etc.). These enabled, for instance, one of the first methods for performance capture of closely interacting subjects. We also proposed a new implicit formulation for analysis-by-synthesis reconstruction in less controlled scenes, featuring a new visibility formulation that is analytically differentiable everywhere.
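The analysis-by-synthesis principle can be illustrated with a toy 1-D example (our simplification, not the project's formulation): synthesize a signal from the current parameters, measure the photometric error against the observation, and descend on its analytic gradient. Differentiability of the forward model everywhere is what makes this gradient available.

```python
import numpy as np

def synthesize(center, xs, width=0.5):
    # Toy "renderer": a 1-D Gaussian blob at the given center.
    return np.exp(-((xs - center) ** 2) / (2.0 * width ** 2))

def fit_center(observed, xs, init=0.0, width=0.5, lr=0.5, steps=300):
    # Analysis-by-synthesis: repeatedly synthesize, compare to the
    # observation, and descend on the photometric (sum-of-squares) error.
    c = init
    for _ in range(steps):
        rendered = synthesize(c, xs, width)
        residual = rendered - observed
        # Analytic gradient of the sum-of-squares error w.r.t. the center.
        grad = np.sum(2.0 * residual * rendered * (xs - c) / width ** 2)
        c -= lr * grad / len(xs)
    return c
```

In real performance capture the parameters are pose and shape rather than a single scalar, and the renderer models occlusion and visibility, but the fit-render-compare loop has the same structure.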

The team further investigated new methods to learn and exploit scene priors (data-driven or physics-based) for improved 4D reconstruction, as well as user-guided intuitive interpretation and modification of captured scenes. As an example, we proposed new methods to estimate semantically meaningful deformation subspaces, as well as new approaches to design and learn lower-dimensional motion subspaces of arbitrary deforming shapes; both of them enable improved 4D reconstruction in less controlled scenes, and improved animation editing.
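As a minimal sketch of what a lower-dimensional motion subspace looks like (our simplification using plain PCA; the project's learned and semantically meaningful subspaces are more sophisticated), one can factor a matrix of per-frame vertex positions into a mean shape plus a few basis deformations:

```python
import numpy as np

def learn_subspace(frames, k):
    # frames: (T, 3N) matrix of flattened vertex positions over T frames.
    # Returns the mean shape and k PCA basis vectors spanning the subspace.
    mean = frames.mean(axis=0)
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:k]

def project(frame, mean, basis):
    # Low-dimensional coordinates of one frame in the learned subspace.
    return basis @ (frame - mean)

def reconstruct(coords, mean, basis):
    # Map subspace coordinates back to a full shape.
    return mean + coords @ basis
```

Optimizing over the few subspace coordinates instead of all vertex positions is what regularizes reconstruction in less controlled scenes and makes animation edits tractable.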
We further showed new ways of combining the aforementioned new generative reconstruction concepts with machine learning-based detection and classification methods for improved 4D reconstruction in less controlled environments.

Another important outcome of the project is the GVVPerfcapEva repository of data sets. We make available a wide range of shape and performance capture data sets created in the CapReal project and in collaborations with partner research groups at MPI for Informatics and other institutes. These data sets provide an opportunity to develop and evaluate new algorithms targeting different sub-fields of performance capture, such as general deformable shape capture, full-body performance capture, facial performance capture, or performance capture of hand and finger motion.
