Periodic Reporting for period 1 - HoviTron (Holographic Vision for Immersive Tele-Robotic OperatioN)
Reporting period: 2020-06-01 to 2021-05-31
The system therefore exhibits two levels of parallax: one that creates stereoscopic views by advanced depth-based interpolation (aka virtual view synthesis) between camera views (at a large inter-camera distance/baseline), and another that creates the 32 micro-parallax views within each eye. These two phases correspond respectively to the RaViS and STALF modules in Figure 2, which are further explained in WP3 below. In practice, however, the two modules are intertwined with one another, delivering one large bulk of light-field images to the HMD in real time.
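To make the depth-based interpolation concrete, the sketch below forward-warps one rectified RGBD view towards a virtual viewpoint placed at a fraction of the baseline between two cameras, resolving occlusions with a z-buffer. This is a minimal illustration under pinhole/rectified assumptions; the function name and parameters are hypothetical, not the actual RaViS/STALF implementation.

```python
import numpy as np

def synthesize_view(rgb, depth, focal_px, baseline_m, alpha):
    """Forward-warp a rectified RGBD view to a virtual camera at fraction
    `alpha` of the baseline towards the neighbouring camera.
    Each pixel shifts horizontally by alpha * disparity = alpha * f * B / Z."""
    h, w, _ = rgb.shape
    out = np.zeros_like(rgb)                    # disoccluded pixels stay empty
    zbuf = np.full((h, w), np.inf)
    disparity = focal_px * baseline_m / depth   # in pixels
    xs = np.arange(w)
    for y in range(h):
        xt = np.round(xs - alpha * disparity[y]).astype(int)
        valid = (xt >= 0) & (xt < w)
        for x in xs[valid]:
            t = xt[x]
            if depth[y, x] < zbuf[y, t]:        # z-buffer: nearest surface wins
                zbuf[y, t] = depth[y, x]
                out[y, t] = rgb[y, x]
    return out
```

Holes left by disocclusions (zeros in `out`) would be inpainted in a real pipeline; misaligned depth edges directly translate into warped-colour artefacts here, which is why the RGB-inline Depth constraint discussed next matters so much.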
The RaViS/STALF view synthesis process requires a set of RGBD images, where the depth D must comply with several constraints to ensure high-quality results. In particular (but not exclusively), the depth image must be perfectly aligned with the RGB colour image, i.e. the depth discontinuities must perfectly follow the RGB object borders. We refer to this as “RGB-inline Depth” in Figure 2; it is an absolute prerequisite for RaViS/STALF to work properly. Calibration in WP1 partly addresses this challenge.
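The RGB-inline Depth condition can be checked numerically: take the depth discontinuities and measure what fraction of them coincide with an RGB intensity edge. The sketch below does this along the horizontal direction only; the function name, thresholds and tolerance are illustrative assumptions, not a HoviTron metric.

```python
import numpy as np

def depth_rgb_alignment(rgb_gray, depth, rgb_thresh=10.0, depth_thresh=0.1, tol=1):
    """Fraction of horizontal depth discontinuities lying within `tol` pixels
    of an RGB intensity edge (1.0 = perfectly RGB-inline). Thresholds are
    illustrative and would need tuning per sensor."""
    depth_edges = np.abs(np.diff(depth, axis=1)) > depth_thresh
    rgb_edges = np.abs(np.diff(rgb_gray, axis=1)) > rgb_thresh
    # dilate the RGB edge map horizontally by `tol` pixels
    dil = rgb_edges.copy()
    for s in range(1, tol + 1):
        dil[:, s:] |= rgb_edges[:, :-s]
        dil[:, :-s] |= rgb_edges[:, s:]
    n = depth_edges.sum()
    return 1.0 if n == 0 else (depth_edges & dil).sum() / n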
Unfortunately, most (not to say all) depth sensing devices (Lidar, Kinect, etc) do not comply to this “RGB-inline Depth” constraint; strong countermeasures are then needed to overcome any virtual view synthesis artefact. Depth estimation techniques, on the contrary, where depth is calculated by matching images without active laser light projection can perfectly provide RGB-inline Depth images, but they typically require more input images and/or exhibit serious challenges in reaching real-time operating conditions. Both these cases represent the extreme endpoints of a large spectrum of depth sensing/estimation approaches that are far from straightforward. A plethora of candidate depth sensing/estimation devices considered in HoviTron are presented at the left side of Figure 2. We will briefly discuss them in the WP2 section below.
Noteworthy, the far-top and far-bottom devices in the left column of Figure 2, i.e. DERS and RayTrix, were chosen as a starting point in HoviTron, since they were both considered in the MPEG Immersive Video (MIV) standardization activities HoviTron is inspired from. DERS is to be applied on conventional cameras, while RayTrix is a representative of so-called plenoptic cameras that were anticipated to enter the MIV activities before project submission and did so halfway the first year of the HoviTron project.
RayTrix being real-time by design, it is logical to consider it as a potential candidate for HoviTron (despite its high cost resulting from RayTrix’ monopoly), while DERS developed within MIV has to be accelerated to reach real-time performances, cf. the more detailed discussion in the WP2 section below. In this tedious study with unexpected hurdles, many depth sensing/estimation alternatives popped up, some being abandoned, others probably still being viable solutions (though the component shortage in the semi-conductor industry resulting from the covid-19 pandemic may jeopardize this, cf. WP2 section). To mitigate the risks, two of them will be integrated in the Proof-of-Concept (PoC) of WP4, to be demonstrated in a robotic environment (cf. the suffix “Tron” in HoviTron).
Though we are confident that the RayTrix solution will reach top-quality results, we are nevertheless targeting a consumer/prosumer solution with RGB-inline Depth cameras of 500-1000€ each, which is more than an order of magnitude cheaper than RayTrix. We have currently reached satisfactory results with a Lidar & DERS acceleration (aka GoRG), showing promising quality-runtime-cost trade-offs. We nevertheless remain open to alternatives that might have the higher framerates commonly used in VR applications, i.e. 120 fps (or at least not less than 60 fps, though the Grant Agreement specifies only 15-30 fps). Such studies are part of the integration activities of WP4 that will get most of our attention in the second half of the project.