Periodic Reporting for period 1 - PRESENCE (A toolset for hyper-realistic and XR-based human-human and human-machine interactions)
Reporting period: 2024-01-01 to 2025-06-30
PRESENCE provides key technological contributions to reducing the gap between physical and virtual interaction, delivering intuitive and realistic XR user experiences (UX) and transcending the state of the art in three (3) technological pillars: 1. Holoportation, 2. Haptics and 3. Virtual Humans. The next level of human-to-human and human-machine interaction in virtual worlds will be reached, in a way never witnessed before. Thanks to hyper-realistic holograms, remotely connected people will be able to see and hear each other in an immersive way, and also to feel each other thanks to the integration of haptics, thus providing a strong illusion of (co-)presence. Intelligent virtual humans, such as realistic full-body avatars (i.e. avatars generated or powered by artificial intelligence - AI) or fully autonomous intelligent virtual agents (IVAs), will ultimately enhance the user experience.
PRESENCE explores new solutions that improve the user experience in social and professional setups, changing, in general, the way we spend time with each other in virtual environments. The toolset of technologies developed will be integrated in two (2) demonstrators: 1) Professional XR Setups and 2) Social XR Setups, specifically designed to validate the technological pillars in multiple aspects of our life. Each demonstrator will consider two use cases: Professional Collaboration and Manufacturing Training on one side, Health and Cultural Heritage on the other.
In the holoportation domain, we have delivered a first version of a fully functional Holoportation SDK, which has already been integrated into three use case demonstrators. In terms of scalability, our system can support more than six volumetric users per session in a medium-quality scenario with mid-resolution user representations. We have introduced an innovative image-based geometry compression system that leverages the efficiency and maturity of existing video codecs (e.g. H.264 and H.265) by mapping 3D geometric attributes into 2D image-like representations, achieving frame rates of up to 30 fps at significantly reduced bitrates, often below 50 Mbps, for multi-camera captured volumetric content. So far the system performs well with off-the-shelf RGBD cameras, while in the coming months our objective is to integrate a full-body reconstruction system based on light-field cameras, which will impose higher computational needs.
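To illustrate the core idea of image-based geometry compression, the following minimal sketch (hypothetical names and packing layout, not the PRESENCE SDK's actual format) quantizes a per-pixel depth map into a 16-bit grayscale frame that a standard video encoder such as H.264 or H.265 could then compress, with the inverse mapping recovering the geometry on the decoder side.

```python
# Hypothetical sketch: pack per-pixel depth (geometry) into a 16-bit image plane
# so that an off-the-shelf video codec (H.264/H.265) can compress it.
# Names, constants, and the packing layout are illustrative only.
import numpy as np

DEPTH_MIN_MM, DEPTH_MAX_MM = 300.0, 5000.0   # assumed capture range

def pack_depth_to_image(depth_mm: np.ndarray) -> np.ndarray:
    """Quantize a float depth map (millimetres) into a 16-bit grayscale frame."""
    clipped = np.clip(depth_mm, DEPTH_MIN_MM, DEPTH_MAX_MM)
    normalized = (clipped - DEPTH_MIN_MM) / (DEPTH_MAX_MM - DEPTH_MIN_MM)
    return np.round(normalized * 65535).astype(np.uint16)

def unpack_image_to_depth(frame: np.ndarray) -> np.ndarray:
    """Inverse mapping applied on the decoder side after video decompression."""
    normalized = frame.astype(np.float32) / 65535.0
    return normalized * (DEPTH_MAX_MM - DEPTH_MIN_MM) + DEPTH_MIN_MM

if __name__ == "__main__":
    depth = np.random.uniform(DEPTH_MIN_MM, DEPTH_MAX_MM, size=(480, 640))
    frame = pack_depth_to_image(depth)      # this 2D frame would feed the video encoder
    recovered = unpack_image_to_depth(frame)
    print("max quantization error (mm):", float(np.abs(recovered - depth).max()))
```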
In the haptic domain, we have developed a comprehensive content creation pipeline that enables the design, composition, and deployment of high-level haptic patterns using open APIs. This pipeline includes tools such as a haptic composer and integration support for widely used game engines, enabling designers to target specific body parts and layer multiple effects. These systems have been implemented and tested within the project and are compatible with real-time XR scenarios. Preliminary support for multiple simultaneous users has been verified. The next phase will explore tactile data capturing methods, which will allow real-time generation and recording of haptic events to further enrich the input pipeline.
In addition, there has been substantial progress in improving haptic engines and modality support. The unified haptics API enables four distinct haptic modalities: (1) simple vibrations (e.g. vibrotactile cues via gloves and vests), (2) textures (via high-frequency signal mapping), (3) stiffness (through kinaesthetic feedback), and (4) active feedback (such as contact actuation). These modalities have been successfully deployed across multiple haptic devices through the standardized MPEG .hjif format. The system is built for scalability and already supports real-time multi-user operation, with testing for ≥6 users and latency optimization planned in the next phase. This progress represents a meaningful advancement over the current state of the art and lays the foundation for synchronized, multi-modal tactile feedback in XR. Finally, the unified haptics API supports abstracted effect definition, remote triggering, and real-time playback, making it scalable across devices and scenarios.
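As a rough illustration of what such a device-agnostic haptics layer can look like, the sketch below defines the four modalities and routes abstract effects to registered devices. All class and method names are hypothetical; the project's actual open API and the MPEG .hjif payload structure are not reproduced here.

```python
# Illustrative sketch of a unified, device-agnostic haptics API.
# Names are hypothetical, not the PRESENCE project's actual interfaces.
from dataclasses import dataclass
from enum import Enum, auto

class Modality(Enum):
    VIBRATION = auto()   # vibrotactile cues (gloves, vests)
    TEXTURE = auto()     # high-frequency signal mapping
    STIFFNESS = auto()   # kinaesthetic feedback
    ACTIVE = auto()      # contact actuation

@dataclass
class HapticEffect:
    modality: Modality
    body_part: str       # e.g. "left_palm", "torso"
    intensity: float     # normalized 0..1
    duration_ms: int

class HapticDevice:
    """Abstract device driver; concrete drivers translate effects to hardware commands."""
    def play(self, effect: HapticEffect) -> None:
        raise NotImplementedError

class LoggingGlove(HapticDevice):
    """Stand-in driver that only logs the commands it would send to hardware."""
    def play(self, effect: HapticEffect) -> None:
        print(f"[glove] {effect.modality.name} on {effect.body_part} "
              f"@{effect.intensity:.2f} for {effect.duration_ms} ms")

class HapticEngine:
    """Routes abstract effect definitions to all registered devices (local or remote)."""
    def __init__(self) -> None:
        self.devices: list[HapticDevice] = []

    def register(self, device: HapticDevice) -> None:
        self.devices.append(device)

    def trigger(self, effect: HapticEffect) -> None:
        for device in self.devices:
            device.play(effect)

if __name__ == "__main__":
    engine = HapticEngine()
    engine.register(LoggingGlove())
    engine.trigger(HapticEffect(Modality.VIBRATION, "left_palm", 0.8, 120))
```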
In the virtual avatar domain, we have developed the IVH (Intelligent Virtual Human) Toolkit as a "set of virtual humans APIs". The toolkit has been successfully integrated and demonstrated in three use cases and incorporates five key features. First, a character generation service, which allows content creators and developers to generate 3D humanoid models from a single frontal photo, addresses the initial requirements for an efficient 3D humanoid model generation pipeline. Second, a visual analyser understands session scenes by performing real-time action recognition on the participating virtual avatars. Third, a parametrized avatar creation tool for embodied anthropomorphic intelligent virtual agents (IVAs) allows their behavior to be customized through expressive facial expressions and body motions. Fourth, a smart avatar system mirrors a user’s real-time expressions and movements onto a digital character, leveraging integrated facial and eye tracking. Fifth, multimodal human-IVA interaction components enable interaction through speech, gaze, gestures, facial expressions, and full-body movements, using a combination of continuous low-level control and LLM-powered high-level decision-making.
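The split between continuous low-level control and LLM-powered high-level decision-making can be sketched as a two-layer loop, as below. The observation fields, policy, and controller are hypothetical stand-ins (the LLM call is stubbed with a simple rule), not the IVH Toolkit's actual interfaces.

```python
# Illustrative two-layer control loop for an intelligent virtual agent (IVA):
# a slow, deliberative layer (an LLM in practice, stubbed here) picks a symbolic intent,
# and a fast, continuous layer turns that intent into per-frame animation targets.
# All names are hypothetical.
from dataclasses import dataclass

@dataclass
class Observation:
    speech: str            # transcribed user utterance
    gaze_target: str       # e.g. "agent", "object_3"
    gesture: str           # e.g. "wave", "point"

def high_level_policy(obs: Observation) -> str:
    """Stub for an LLM call mapping multimodal observations to a symbolic intent."""
    if "hello" in obs.speech.lower() or obs.gesture == "wave":
        return "greet_user"
    return "idle"

def low_level_controller(intent: str, dt: float) -> dict:
    """Continuous controller translating a symbolic intent into per-frame targets."""
    if intent == "greet_user":
        return {"facial_expression": "smile", "gesture": "wave", "blend_time": dt}
    return {"facial_expression": "neutral", "gesture": "none", "blend_time": dt}

if __name__ == "__main__":
    obs = Observation(speech="Hello there!", gaze_target="agent", gesture="wave")
    intent = high_level_policy(obs)                       # deliberative layer
    frame_cmd = low_level_controller(intent, dt=1 / 60)   # per-frame layer
    print(intent, frame_cmd)
```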
Key needs to ensure the further uptake and success of the project during the second phase can be summarized in three points:
- Further research: incorporate the new set of requirements into the final SDK versions and further explore the impact of the different user representation modalities on the Feeling of Presence (FoP).
- Demonstrations: increase public and joint demonstrations of project results in collaboration between SDK owners and use case (UC) partners.
- Commercialization: increase marketing and communication efforts to identify further potential collaborations that could be pursued individually or through a joint exploitation strategy.