CORDIS - EU research results

SONICOM - Transforming auditory-based social interaction and communication in AR/VR

Periodic Reporting for period 1 - SONICOM (SONICOM - Transforming auditory-based social interaction and communication in AR/VR)

Reporting period: 2021-01-01 to 2021-12-31

Immersive audio is our everyday experience of being able to hear and interact with sounds around us. Simulating spatially located sounds in virtual or augmented reality (VR/AR) must be done in a unique way for each individual. This has been the object of extensive research in recent years; nevertheless, several major challenges are still to be tackled in this area, which require an improved understanding and modelling of the human spatial hearing mechanisms. Furthermore, the impact of immersive audio beyond perceptual metrics such as presence and localisation is still an unexplored area of research, specifically when related to social interaction within virtual environments, entering the behavioural and cognitive realms.

SONICOM brings together some of the most important research centres across Europe to tackle these research challenges using a multidisciplinary approach. Research involves a mixture of digital signal processing and artificial intelligence, together with psychology and perceptual modelling, exploring new forms, levels, and dimensions of interaction within auditory-based immersive environments. Furthermore, in order to reinforce the idea of reproducible research and to promote future development and innovation in the area of auditory-based social interaction, the SONICOM Ecosystem will be created and shared with the wider community; it will include auditory data closely linked with model implementations and immersive audio rendering components.
The SONICOM project officially started in January 2021, when most of Europe was in lockdown. The first six months of the project focussed on recruitment and organisational matters, as well as on early research work within WP1 (Immersion). The SONICOM research activities began in earnest only in July 2021, after the online kick-off meeting and the first operational WP meetings.

In the following 6 months, up to the end of the first year, all the planned research and management/organisation activities within SONICOM started and progressed smoothly. The ‘core’ immersive audio research within WP1 included:
• Morphology-based HRTF modelling and individualisation
• Early work on HRTF selection and related modelling framework, as well as initial pilots on HRTF and speech perception studies
• Pilot studies and planning of the real/virtual matching studies within AR/VR contexts, specifically looking at reverberation both from computational and perceptual perspectives
• Design and installation of an HRTF measurement setup at Imperial
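The HRTF work listed above ultimately feeds a single core operation: filtering a mono source with a listener's head-related impulse responses (HRIRs) to produce a binaural signal. A minimal sketch of that step is shown below; the HRIRs here are random placeholders standing in for measured or modelled filters, not data from the project.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right HRIR pair to
    produce a two-channel binaural signal (samples x 2)."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=-1)

# Toy example: 1 s of noise at 48 kHz through 256-tap placeholder HRIRs.
rng = np.random.default_rng(0)
signal = rng.standard_normal(48000)
hrir_l = rng.standard_normal(256) * 0.01  # placeholder filters,
hrir_r = rng.standard_normal(256) * 0.01  # not measured HRIRs
binaural = render_binaural(signal, hrir_l, hrir_r)
print(binaural.shape)  # (48255, 2)
```

Individualisation, the focus of the WP1 studies, amounts to choosing or modelling the HRIR pair per listener and per source direction; the convolution itself stays the same.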

Most of the work listed above has been presented at international conference venues and/or has been published in refereed journals. Notable publications are the paper from the Sorbonne team in Paris on the sensitivity analysis of their parametric pinna model (https://doi.org/10.1121/10.0004128) and the paper from the Acoustics Research Institute in Vienna on modelling active localisation using Bayesian inference (https://doi.org/10.1051/aacus/2021039). Another notable dissemination activity related to WP1 is the organisation of a special session titled Personalisation of Binaural Audio in Virtual and Augmented Reality, which included 10 presented studies at the I3DA 2021 International Conference (https://www.i3da.eu).

The research aims and activities of WP2 (Interaction) are relatively new to most of the academic partners within the consortium, and for this reason the work in WP2 has moved forward more slowly. The activities carried out so far include:
• Overall planning and mapping of future WP2 activities – this work resulted in the submission of the first research deliverable within SONICOM
• Design of the SONICOM 3D Speaker Personality Corpus, which will allow us to investigate the interplay between the simulated distance of a speaker from a listener and the personality traits that the listener attributes to the speaker
• Design of additional studies looking at the impact of immersive audio rendering parameters/choices on communication and 6DoF interactions in AR/VR

In line with the planned work on reaching and impacting researchers and communities beyond the SONICOM consortium, the start of WP5 (Beyond) was brought forward to the beginning of the project. This was prompted by the opportunity for SONICOM to directly contribute to the scheduled release of toolboxes (e.g. AMT - https://amtoolbox.org/) and updates of spatial audio standards (e.g. SOFA - https://www.sofaconventions.org) and tools (Mesh2HRTF - https://mesh2hrtf.sourceforge.io/).
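As an illustration of the kind of convention such standards pin down, SOFA stores source positions as spherical coordinates (azimuth and elevation in degrees, radius in metres, with azimuth measured counterclockwise from straight ahead). A renderer typically converts these to Cartesian coordinates; the sketch below assumes the common axis convention of x pointing forward, y to the listener's left, and z up.

```python
import math

def sofa_spherical_to_cartesian(azimuth_deg, elevation_deg, radius_m):
    """Convert a SOFA-style spherical position (degrees, metres) to
    Cartesian coordinates, assuming x: front, y: left, z: up."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = radius_m * math.cos(el) * math.cos(az)
    y = radius_m * math.cos(el) * math.sin(az)
    z = radius_m * math.sin(el)
    return x, y, z

# A source 1.5 m directly to the listener's left (azimuth 90°, elevation 0°):
print(sofa_spherical_to_cartesian(90.0, 0.0, 1.5))
```

This is a hand-rolled sketch for illustration; in practice dedicated SOFA readers handle these conversions together with the file's metadata.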
Finally, as part of the cross-collaboration initiative with three other research projects funded within the same call, SONICOM has engaged with the EXPERIENCE, CAROUSEL and TOUCHLESS projects and laid the groundwork for future collaborations on common research interests.
During its first year, SONICOM has already contributed substantially to the design and evaluation of novel immersive audio technologies and techniques, as well as to the overall advancement of knowledge and understanding in the spatial acoustics and immersive audio research domains.

Our work on parametric pinna models will move forward with the aim of consolidating practices across the consortium and starting perceptual validation studies. Similarly, work on perceptual- and numerical-based HRTF selection frameworks has already started mapping existing approaches across the research community and will contribute to consolidating their use within and beyond SONICOM. Exploratory work has been planned and piloted on the interactions between HRTF choice and speech intelligibility; this is a relatively unexplored research area with the potential to significantly advance our understanding of immersive audio communication and interaction within AR/VR.

Considering specifically the AR context, where the real and the virtual combine in a single domain, SONICOM's work focussed on the design and piloting of two studies: one matching the reverberation of the real environment with that of the virtual one, and one exploring which computational/rendering choices and parameters are relevant from a perceptual point of view. Results from these and other experiments will allow us to achieve a seamless blend between the real and virtual auditory domains.

Significant work has also been done in the first year to map and plan activities aimed at exploring and modelling how the physical characteristics of spatialised auditory stimuli can influence observable behavioural, physiological, kinematic, and psychophysical reactions of listeners within AR/VR-based social interaction scenarios. When released, the SONICOM 3D Speaker Personality Corpus will hopefully become a standard tool within the academic community for exploring interactions between the simulated distance/position of speakers and the traits that listeners attribute to them. Furthermore, it will be used within the project to produce datasets and train AI-based models that infer socially and psychologically relevant perceptions of a user from the widest possible array of measurable aspects in an AR/VR environment.

The impact of the work carried out so far is already evidenced by SONICOM's contribution to the release and update of tools and datasets widely used within the immersive audio research community, and plans for more substantial impact centre on the mid-project sandpit event and the Listener Acoustics Personalisation challenge.