Holographic Vision for Immersive Tele-Robotic OperatioN

Periodic Reporting for period 1 - HoviTron (Holographic Vision for Immersive Tele-Robotic OperatioN)

Reporting period: 2020-06-01 to 2021-05-31

HoviTron aims to provide Holographic Vision (= Hovi) under Tele-Robotic OperatioN (= Tron) conditions, as shown in Figure 1. A typical application is the remote control by a tele-operator of a robot arm that manipulates objects in hazardous working environments. To this end, a couple of static cameras are set up around the scene of interest, while virtual viewpoints of the scene are synthesized to feed the Head Mounted Display (HMD) worn by the tele-operator. For each head position, the corresponding images are synthesized (electronic pan-tilt) and sent to the HMD, which provides holographic vision of the scene. This means that the tele-operator’s eyes focus and accommodate correctly on the object of interest he/she is staring at, ensuring visual comfort and reduced fatigue in already harsh working conditions.
Technically, the holographic HMD provided by Creal consists of a light field display projecting 32 images with micro-parallax to each eye, i.e. each image shows a slightly different perspective compared with its neighbours, together providing holographic vision (cf. the prefix “Hovi” in HoviTron).
The system therefore exhibits two levels of parallax: one creating stereoscopic views by advanced depth-based interpolation (aka virtual view synthesis) between camera views at a large inter-camera distance/baseline, and another creating the 32 micro-parallax views within each eye. These two phases correspond respectively to the RaViS and STALF modules in Figure 2, further explained in the WP3 section below. In practice, however, the two modules are intertwined with one another, delivering one large bulk of light field images to the HMD in real time.
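At its core, depth-based view synthesis forward-warps each reference RGBD pixel to the virtual viewpoint according to its depth. The following minimal sketch illustrates only this principle (it is not the RaViS/STALF implementation; the function name and the rectified-camera assumption, where disparity = baseline × focal / depth, are ours):

```python
import numpy as np

def forward_warp(rgb, depth, baseline, focal):
    """Warp a reference RGBD view to a virtual camera shifted
    horizontally by `baseline` (rectified setup assumed, hypothetical
    units). A z-buffer ensures that nearer pixels win when several
    source pixels map to the same target column."""
    h, w, _ = rgb.shape
    out = np.zeros_like(rgb)
    zbuf = np.full((h, w), np.inf)
    disparity = np.round(baseline * focal / depth).astype(int)
    for y in range(h):
        for x in range(w):
            xt = x - disparity[y, x]
            if 0 <= xt < w and depth[y, x] < zbuf[y, xt]:
                zbuf[y, xt] = depth[y, x]
                out[y, xt] = rgb[y, x]
    return out
```

Holes left black by the warp (disocclusions) are exactly where depth artefacts become visible, which is why depth quality dominates the final rendering quality.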
The RaViS/STALF view synthesis process requires a couple of RGBD images, where the depth D must comply with several constraints to ensure high-quality results. In particular (but not exclusively), the depth image must be perfectly aligned with the RGB colour image, i.e. the depth discontinuities should follow the RGB object borders exactly. We refer to this as “RGB-inline Depth” in Figure 2; it is an absolute condition for RaViS/STALF to work properly. The calibration work in WP1 partly addresses this challenge.
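The RGB-inline depth property can be probed by checking whether depth discontinuities coincide with colour edges. The sketch below is a deliberately simple illustration of that idea, not a tool used in the project; the function name and thresholds are hypothetical and would need tuning per sensor:

```python
import numpy as np

def depth_edge_alignment(gray, depth, rgb_thr=20.0, d_thr=0.1):
    """Rough measure of the 'RGB-inline Depth' property: the fraction
    of depth discontinuities (horizontal finite differences above
    d_thr) that coincide with an RGB intensity edge (above rgb_thr).
    Returns 1.0 for a perfectly aligned pair."""
    gx = np.abs(np.diff(gray.astype(float), axis=1))
    dx = np.abs(np.diff(depth.astype(float), axis=1))
    depth_edges = dx > d_thr
    if not depth_edges.any():
        return 1.0
    aligned = depth_edges & (gx > rgb_thr)
    return aligned.sum() / depth_edges.sum()
```

A score well below 1.0 indicates depth borders bleeding across object silhouettes, which is precisely the failure mode that produces view synthesis artefacts.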
Unfortunately, most (if not all) depth sensing devices (Lidar, Kinect, etc.) do not comply with this “RGB-inline Depth” constraint, so strong countermeasures are needed to avoid virtual view synthesis artefacts. Depth estimation techniques, by contrast, which calculate depth by matching images without active laser light projection, can perfectly provide RGB-inline Depth images, but they typically require more input images and/or face serious challenges in reaching real-time operating conditions. These two cases represent the extreme endpoints of a large spectrum of depth sensing/estimation approaches, none of which is straightforward. The candidate depth sensing/estimation devices considered in HoviTron are presented on the left side of Figure 2 and briefly discussed in the WP2 section below.
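To make the passive, matching-based end of the spectrum concrete: the textbook approach is block matching between a rectified stereo pair, sliding a window over candidate disparities and keeping the best match. The sketch below only illustrates this principle (the function name is ours; real-time pipelines such as the project's accelerated DERS are far more sophisticated):

```python
import numpy as np

def block_match_disparity(left, right, max_disp=16, win=3):
    """Naive sum-of-absolute-differences (SAD) block matching between
    a rectified stereo pair: for each left-image window, test all
    candidate disparities d and keep the one with the lowest SAD.
    Depth then follows as baseline * focal / disparity."""
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y-half:y+half+1, x-half:x+half+1].astype(float)
            best, best_d = np.inf, 0
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y-half:y+half+1,
                             x-d-half:x-d+half+1].astype(float)
                sad = np.abs(patch - cand).sum()
                if sad < best:
                    best, best_d = sad, d
            disp[y, x] = best_d
    return disp
```

The quadruple loop makes clear why naive matching struggles with real-time operation and why acceleration (e.g. on GPU) is the central engineering theme of the WP2 work.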
Notably, the far-top and far-bottom devices in the left column of Figure 2, i.e. DERS and RayTrix, were chosen as the starting point in HoviTron, since both are considered in the MPEG Immersive Video (MIV) standardization activities that HoviTron is inspired by. DERS is applied to conventional cameras, while RayTrix is representative of the so-called plenoptic cameras that were anticipated to enter the MIV activities before project submission and did so halfway through the first year of the HoviTron project.
RayTrix being real-time by design, it is a logical candidate for HoviTron (despite its high cost, a result of RayTrix’s monopoly), while DERS, developed within MIV, has to be accelerated to reach real-time performance, cf. the more detailed discussion in the WP2 section below. During this tedious study with unexpected hurdles, many depth sensing/estimation alternatives popped up; some were abandoned, others probably remain viable solutions (though the component shortage in the semiconductor industry resulting from the covid-19 pandemic may jeopardize this, cf. the WP2 section). To mitigate the risks, two of them will be integrated in the Proof-of-Concept (PoC) of WP4, to be demonstrated in a robotic environment (cf. the suffix “Tron” in HoviTron).
Though we are confident that the RayTrix solution will reach top-quality results, we are nevertheless targeting a consumer/prosumer solution with RGB-inline Depth cameras of 500-1000€ each, more than an order of magnitude cheaper than RayTrix. We have currently reached satisfactory results with a Lidar combined with an accelerated DERS (aka GoRG), showing promising quality-runtime-cost trade-offs. We nevertheless remain open to alternatives that might reach the higher framerates commonly used in VR applications, i.e. 120 fps (or at least no less than 60 fps, though the Grant Agreement specifies only 15-30 fps). Such studies are part of the integration activities of WP4, which will receive most of our attention in the second half of the project.
Finally, a cornerstone of the project is to verify that all the developed technology indeed reaches the holographic vision target (cf. the prefix “Hovi” in HoviTron). The user studies in WP5 have already shown that this goal is achieved when the depth images are perfect, i.e. obtained by raytracing from synthetic content. This validates the merits of the RaViS/STALF approach for holographic vision. The remaining challenge is to confirm these findings on the HoviTron PoC, which will use a real scene rather than a computer-synthesized one. Depth artefacts will most probably occur, but to what degree, and which countermeasures might be required, is still an open question at this stage, halfway through the project. Preliminary results give cause for optimism, but at the same time we must keep our feet on the ground and overcome all engineering challenges before conducting the conclusive user tests US-2.
[Figure: HoviTron's core technologies]
[Figure: HoviTron's Vision]