Periodic Reporting for period 3 - SONICOM (SONICOM - Transforming auditory-based social interaction and communication in AR/VR)
Reporting period: 2023-07-01 to 2024-12-31
SONICOM brings together some of the most important research centres across Europe to tackle these research challenges using a multidisciplinary approach. The research combines digital signal processing and artificial intelligence with psychology and perceptual modelling, exploring new forms, levels, and dimensions of interaction within auditory-based immersive environments. Furthermore, to reinforce the idea of reproducible research and to promote future development and innovation in the area of auditory-based social interaction, the SONICOM Ecosystem will be created and shared with the wider community. It will include auditory data closely linked with model implementations and immersive audio rendering components.
In the last 10 months, WP1 and WP2 were at the core of SONICOM’s research. Several studies were carried out and published within the 3 tasks of WP1, achieving significant results both in the general understanding of spatial hearing mechanisms (specifically focussed on personalisation and reverb perception) and in the methods and tools developed for virtual spatial audio rendering. These include, among others, the measurement and synthesis of individual Head Related Transfer Functions (HRTFs), their estimation using parametric pinna models and/or photogrammetry, the rendering of the directional features of sound sources, and the matching of real reverberation within the virtual domain.
Similarly, several studies were carried out in WP2, this time looking at higher-level perceptual processes that are closer to cognition. Beyond the flagship study on proxemics, which produced very interesting results on the relationship between the distance of a speaker and their perceived personality traits, other experiments have been designed and carried out. Examples include studies of interactions in complex simulated environments and in AR scenarios; in-depth explorations of the process of adaptation to non-individual HRTFs and of how far the acquired training can be generalised to other filters, tasks and cues; and studies (and models) of dynamic localisation in both 3 and 6 Degrees of Freedom.
WP3 was also an essential part of RP3. Its task was to take what was discovered and developed in WP1-2, implement it in a usable framework (the Binaural Rendering Toolkit - BRT), and deliver it both to WP4 for the evaluations and to WP5, to ensure that all the tools and models will continue to exist and, potentially, to be developed after the end of SONICOM. Several implementations and versions of the BRT were released within the SONICOM consortium, and several meetings and workshops took place to clarify the requirements of the planned WP4 evaluations and, ultimately, to train researchers in using these custom tools. The build of the self-personalising headphones also moved forward, and we now have a functional prototype, which has been successfully deployed within experimental settings.
WP4 effectively started within this reporting period, working towards the definition of the scenarios and significantly involving the SME partners in the design of the evaluations. We are now ready to start the implementation (which is already under way for some of the tasks), with the aim of beginning the actual experiments and installations in 2025.
WP5 was also very active. In addition to the release of a range of data, models, and tools through various existing channels, the design of the SONICOM ecosystem has now been completed, and we are working on its implementation and, ultimately, its release. Within RP3 we also launched and completed the first Listener Acoustic Personalisation (LAP) challenge (https://www.sonicom.eu/lap-challenge/). This was a big success for SONICOM: several EU and overseas participants were involved, and both the launch and the closing events were hosted at high-impact conference venues.
In addition, significant work has been completed on mapping, planning, and executing activities aimed at exploring and modelling how the physical characteristics of spatialised auditory stimuli can influence the observable behavioural, physiological, kinematic, and psychophysical reactions of listeners within AR/VR-based social interaction scenarios. The SONICOM 3D Speaker Personality Corpus will soon be released, and will hopefully become a standard tool within the academic community for exploring interactions between the simulated distance/position of speakers and the traits that listeners attribute to them. Furthermore, our study on proxemics has now produced significant results, soon to be published, showing how AI-based models can be trained to infer socially and psychologically relevant perceptions of a user from the widest possible array of measurable aspects in an AR/VR environment.