Community Research and Development Information Service - CORDIS

The Future of 3D-Audio by CARROUSO

A key factor in the success of multimedia communication systems is their ability to reproduce virtual environments that the user perceives as being as natural as possible. To enrich the quality and naturalness of the perceived audio, true 3D audio environments must be reproduced or synthetically generated. To achieve this goal, ten European partners from industry, research institutes and universities, all active in advanced audio, multimedia and telecommunications, have joined forces in a project called CARROUSO. The name stands for "Creating, Assessing and Rendering in Real time Of high quality aUdio-viSual envirOnments in MPEG-4 context". The key objective of the project is to provide a novel technology that enables a sound field generated in one real or virtual space to be transferred to another space in combination with visual data. In addition, fully interactive control of the relevant temporal, spatial and perceptual properties of the sound field will be provided.

The project builds on the synergy of two new and powerful technologies. The first is the flexible MPEG-4 standard, which offers object-oriented coding and methods for describing scenes in 3D audio environments. The second is the revolutionary Wave Field Synthesis (WFS) rendering technique, which is able to produce a true sonic space rather than a stereophonic representation of it.

To fulfil the objectives of the project, technical sub-systems are being specified for recording, encoding, transmitting, decoding and rendering of multi-channel sound.

The CARROUSO project is organized into several work packages with defined tasks. These tasks cover the definition of the global architecture, including the system parameters for audio and video data, the functional blocks of the complete recording, transmission and rendering system, and the final experimental system.

The first goal of the recording system is to record dry sounds (that is, sounds without room reverberation) from various sound sources. Microphone arrays combined with camera systems are used to capture the signal sources and determine their positions. Integrating the blocks of the recording system requires an analysis of the interdependencies between the microphone array software and the algorithms for source tracking, de-reverberation and active echo cancellation. In addition to recording the sound sources, room parameters are extracted from the scene by recording impulse responses, which are then converted into a perceptual description. All the data obtained by the recording system can then be encoded, multiplexed and adapted for transmission in an MPEG-4 context.

In the reproduction system, the audio signals of the separated sound sources have to be composed and processed to recreate both the sound impression of the recording room and the perceived positions of the sound sources. This first requires that the MPEG-4 compressed bitstreams be de-multiplexed and decoded in an MPEG-4 terminal. The terminal allows user interaction, so that room acoustic characteristics different from those of the recording room can be applied. The generation of the 3D sound field is based on the Wave Field Synthesis (WFS) concept, which generates wave fronts using loudspeaker arrays. This technique enables an accurate representation of the original wave field, with its natural temporal and spatial properties, throughout the entire listening space.
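The idea of generating a wave front with a loudspeaker array can be illustrated with a small sketch. This is not the CARROUSO rendering algorithm: it is a simplified delay-and-gain model, assuming a straight line of loudspeakers and a virtual point source placed behind it. Each loudspeaker replays the dry source signal with a delay proportional to its distance from the virtual source, so that the individual contributions add up to a wave front that appears to originate from the source position. The function name, the coordinate layout and the 1/sqrt(r) gain rule are assumptions made for illustration, and the sketch omits the pre-filtering and geometry-dependent weighting of a full WFS driving function.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at about 20 degrees C


def wfs_delays_and_gains(source_xy, speaker_xs, speaker_y=0.0):
    """Return one (delay_seconds, gain) pair per loudspeaker.

    source_xy  -- (x, y) position of the virtual point source in metres
    speaker_xs -- x positions of the loudspeakers on the line y = speaker_y
    (Hypothetical helper: a simplified model, not a full WFS driving function.)
    """
    sx, sy = source_xy
    # Distance from the virtual source to each loudspeaker.
    distances = [math.hypot(x - sx, speaker_y - sy) for x in speaker_xs]
    nearest = min(distances)
    if nearest <= 0.0:
        raise ValueError("virtual source must not coincide with a loudspeaker")

    result = []
    for d in distances:
        # The nearest loudspeaker fires first; the others are delayed by the
        # extra travel time the wave front would need to reach them.
        delay = (d - nearest) / SPEED_OF_SOUND
        # Assumed amplitude decay of 1/sqrt(r), normalised to the nearest
        # loudspeaker so that its gain is 1.
        gain = math.sqrt(nearest / d)
        result.append((delay, gain))
    return result


# Three loudspeakers one metre apart, virtual source two metres behind the array.
for delay, gain in wfs_delays_and_gains((0.0, -2.0), [-1.0, 0.0, 1.0]):
    print(f"delay = {delay * 1000:.3f} ms, gain = {gain:.3f}")
```

In this example the centre loudspeaker is closest to the virtual source and plays without delay, while the two outer loudspeakers play slightly later and slightly quieter. Moving the virtual source changes only the delays and gains, not the source signal itself, which fits the object-oriented approach of transmitting dry sources together with their positions.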

Reported by

Fraunhofer AEMT
Langewiesenerstr. 22
98693 Ilmenau