
Language Augmentation for Humanverse

Periodic Reporting for period 1 - LUMINOUS (Language Augmentation for Humanverse)

Reporting period: 2024-01-01 to 2025-06-30

The LUMINOUS project develops the next generation of XR systems that move beyond rigid scripts to deliver adaptive, natural, and context-aware experiences. Using multimodal large language models (MLLMs) that combine vision, speech, and text, the platform enables real-time recognition, situational understanding, and personalised guidance across healthcare, safety, rehabilitation, and design.
Objectives:
1. Achieve situational awareness through multimodal zero-shot learning.
2. Provide adaptive, personalised instruction.
3. Create realistic, culturally adaptable avatars guided by language.
4. Build an ethical framework ensuring safety, inclusivity, and EU compliance.
5. Validate the system in pilots for neurorehabilitation, safety training, and BIM design reviews.

LUMINOUS will deliver scalable, personalised XR with societal benefits, while opening new market opportunities. The project aims to set global standards for adaptive XR and lead in trustworthy AI for immersive technologies.
Work Done:

Pilot 1 – Neurorehabilitation: Clinical requirements were defined and ethical approvals obtained, enabling the recording of 40+ rehabilitation sessions to build a dataset for language impairment analysis. Automated transcription and annotation pipelines were developed and optimised for Correct Information Unit (CIU) analysis. Two chatbot prototypes were created: a metacognition agent (already integrated into the MindFocus VR platform) and an aphasia agent, with iterative feedback gathered from clinicians. An evaluation methodology for the aphasia chatbot was put in place to support systematic assessment.
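For readers unfamiliar with the CIU metric (a standard measure of informativeness in aphasic speech), the scoring step after transcription can be pictured as a filtering pass over the transcript. The Python sketch below is illustrative only, assuming a naive rule-based filter; the filler list and the repetition rule are placeholders, not the project's actual annotation pipeline, which relies on clinically defined rules and expert review.

```python
# Illustrative sketch of a CIU-style scoring pass over a transcript.
# Assumption: a CIU is approximated here as any word that is neither a filler
# nor an immediate repetition; real CIU scoring follows clinical guidelines.
import re

FILLERS = {"uh", "um", "er", "erm"}  # placeholder filler list

def ciu_score(transcript: str) -> dict:
    words = re.findall(r"[a-zA-Z']+", transcript.lower())
    cius = []
    prev = None
    for w in words:
        if w in FILLERS or w == prev:   # drop fillers and immediate repetitions
            prev = w
            continue
        cius.append(w)
        prev = w
    total = len(words)
    return {
        "words": total,
        "cius": len(cius),
        "percent_ciu": 100.0 * len(cius) / total if total else 0.0,
    }

if __name__ == "__main__":
    print(ciu_score("uh the the man is um washing washing the dishes"))
```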
Pilot 2 – Health, Safety & Environment (HSE) Training: A complete fire-extinguisher VR prototype was built and validated, with full integration of the Voice Layer (speech-to-text/text-to-speech, STT/TTS) and a Unity-based context engine for LLM interaction. The system was refactored from rigid scripts into modular, event-driven dialogues supporting open-ended guidance. New UI components (transcripts, logs, error/fallback handling) improved robustness. An avatar was embedded and tested with placeholder speech, pending full lip-sync integration. An evaluation methodology and prompting strategies were also defined to assess instruction following during gameplay and response accuracy.
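The move from rigid scripts to event-driven dialogue amounts to mapping gameplay events and trainee speech onto LLM prompts rather than fixed lines. The actual context engine is Unity-based; the Python sketch below only illustrates the pattern, and the event names, prompt wording, and the llm/tts callables are hypothetical stand-ins rather than the project's implementation.

```python
# Illustrative event-driven dialogue loop (the real context engine is Unity-based).
# Event names, prompt template and the llm/tts callables are placeholders.
from typing import Callable

class DialogueEngine:
    def __init__(self, llm: Callable[[str], str], tts: Callable[[str], None]):
        self.llm = llm
        self.tts = tts
        self.context: list[str] = []          # rolling event context for the LLM

    def on_event(self, event: str, detail: str = "") -> None:
        """Record a gameplay event (e.g. 'pin_pulled', 'aimed_at_base')."""
        self.context.append(f"{event}: {detail}" if detail else event)

    def on_user_utterance(self, text: str) -> None:
        """Build a prompt from recent events plus the trainee's speech (STT output)."""
        prompt = (
            "You are a fire-safety trainer guiding a trainee step by step.\n"
            "Recent events:\n- " + "\n- ".join(self.context[-5:]) +
            f"\nTrainee said: {text}\nRespond with one short, actionable instruction."
        )
        reply = self.llm(prompt)              # open-ended guidance instead of a fixed script
        self.tts(reply)                       # spoken back through the Voice Layer

# Usage with stub callables:
engine = DialogueEngine(llm=lambda p: "Aim the nozzle at the base of the fire.",
                        tts=print)
engine.on_event("extinguisher_grabbed")
engine.on_event("pin_pulled")
engine.on_user_utterance("What do I do next?")
```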
Pilot 3 – BIM & Architectural Design Review: On-the-fly IFC loading was developed by converting geometry via IfcOpenShell into glTF assets parsed in Unreal Engine 5, with metadata extracted separately for the LLM. A TCP/IP-based Python API was introduced as an intermediate layer between the LLM and the sandbox, and external APIs (REST/JSON-RPC/WebSocket) were exposed for TextToCAD, TextToMesh, STT, TTS, and headset/microphone data. Scene enrichment was enabled via overlay files without altering BIM sources, with strong IP protection ensuring original IFC files remain local. Text-to-code generation (EHU, VICOM) enabled LLM-driven sandbox edits, while Neo4j-based intermediate graphs allowed Cypher querying of BIM files. An evaluation methodology and scenarios were created to assess the quality of these outputs.

Achievements:
· Established a clinical dataset with more than 40 rehabilitation sessions.
· CIU analysis pipeline fully operational.
· First chatbot prototypes functional, with one already integrated into VR.
· Expert validation of methodology and design initiated with clinical partners.
· Delivered an operational XR safety training demo with integrated LLM guidance.
· System validated for responsiveness and stability under varying latency conditions.
· Modular architecture now supports dynamic dialogues and help-on-demand.
· Avatar pipeline embedded, laying the groundwork for speech-driven animation.
· Achieved real-time loading and visualization of IFC files at runtime.
· Implemented graph generation of BIM metadata for the LLM knowledge base.
· Demonstrated voice-based commands for navigation and positioning.
· Enabled scene enrichment with additional props while preserving original BIM files.
Work in Progress:
· Navigation & platforms: Teleport-based VR navigation under development as an alternative to voice and Python control; Linux support is being expanded toward full feature parity.
· Content generation: Continuous improvements to Text-to-Mesh and Text-to-CAD workflows.
· Healthcare pilots: Ongoing refinement of metacognition and aphasia chatbots, with clinician feedback and evaluation methodologies integrated.
· Knowledge & reasoning: Experiments in knowledge injection (in-context learning, RAG) to improve grounding and common-sense reasoning in XR tasks; a minimal retrieval sketch follows this list.
· Avatars: Enhancements underway for dynamic gestures, improved lip-sync, and full-body motion synthesis using GAN-based approaches.
· Ethics: Iterative use of ALTAI and FRIA tools, with co-design workshops validating ethical norms.
· Datasets: Expansion of evaluation datasets across all pilots (rehabilitation, safety, BIM) to benchmark recognition, instruction following, and avatar realism.
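The knowledge-injection experiments mentioned above (in-context learning and retrieval-augmented generation) follow a standard pattern: retrieve the snippets most relevant to the current XR situation and prepend them to the LLM prompt. The sketch below uses a simple TF-IDF retriever as a stand-in; the snippet corpus, the scoring choice, and the prompt wording are illustrative assumptions, not the project's configuration.

```python
# Minimal retrieval-augmented prompting sketch with a TF-IDF retriever.
# The snippet corpus and prompt wording are placeholders for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

SNIPPETS = [
    "Class B fires involve flammable liquids; use a foam or CO2 extinguisher.",
    "Pull the pin, aim at the base of the fire, squeeze the handle, sweep side to side.",
    "Never use water on an electrical fire.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    vec = TfidfVectorizer().fit(SNIPPETS + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(SNIPPETS))[0]
    top = sims.argsort()[::-1][:k]
    return [SNIPPETS[i] for i in top]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Use this knowledge when answering:\n{context}\n\nTrainee question: {query}"

print(build_prompt("Which extinguisher should I use on burning oil?"))
```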

Future Work:
· Avatars: Full integration of dynamic avatars into pilot systems, including speech-driven lip-sync and culturally adaptive gestures.
· Healthcare pilots: Expand clinical validation with larger patient cohorts, progressing toward the 200-session target.
· Safety pilots: Scale HSE training beyond fire-extinguisher scenarios to broader industrial tasks.
· BIM pilots: Advance BIM querying and editing through extended Neo4j/Cypher interfaces and improved text-to-code control pipelines.
· Exploitation: Develop commercialisation and IPR strategies for avatar rendering, text-to-3D pipelines, and secure BIM workflows.