Content archived on 2024-05-18

Creating, Assessing and Rendering in Real Time of High Quality Audio-Visual Environments in MPEG-4 Context

Picking up good vibrations

In the beginning, there was mono sound: scratchy tunes carried over the airwaves to valve radios that hummed and hissed more than anything else. Then came stereo, two-channel sound that sat far better on the ear and shed much of that frustrating background noise. From there we leapt in quick succession to Dolby Surround, quadraphonic and Sensurround, which were, until now, the epitome of crystal-clear sound quality. Now we have CARROUSO.

CARROUSO is a joint project of research institutes, universities and industry partners working in advanced audio, multimedia and telecommunications. They came together for one common purpose: to produce true 3-D interactive audio environments that can carry a sound scene faithfully from one location to another, in conjunction with visual data.

Creating, Assessing and Rendering in Real time Of high quality aUdio-viSual envirOnments (CARROUSO) rests on two main technologies. The first is the flexible MPEG-4 standard, which offers object-oriented coding as well as a scene description through which 3-D sound fields can be conveyed. The second, complementary technology is Wave Field Synthesis rendering, with its outstanding capacity to reproduce a true sound field in space rather than a stereophonic representation of it.

On the recording side, CARROUSO uses microphone arrays together with camera systems to determine the position of each sound source and to capture its dry sound, that is, the source signal without the room's reverberation. Room parameters are extracted separately and converted into perceptual descriptions. Dedicated algorithms handle de-reverberation, source tracking and acoustic echo cancellation. All the collected data is then encoded, multiplexed and adapted for MPEG-4 transmission. At the receiving end, wave field synthesis recreates true 3-D spatial sound by generating wavefronts with loudspeaker arrays. The end result is pure sound quality with natural temporal and spatial properties. In other words, sound as real, as clear and as pleasurable as it can get.
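To give a feel for how a loudspeaker array can generate the wavefront of a virtual source, here is a deliberately simplified sketch in Python. It is not the CARROUSO renderer. It assumes a straight line of loudspeakers, a virtual point source placed behind that line, a speed of sound of 343 m/s, and a basic delay-and-gain rule (delay proportional to distance, gain falling with the square root of distance). The function name `wfs_delays_and_gains` and all parameters are hypothetical and chosen only for illustration; a real wave field synthesis driving function also includes filtering and tapering that are left out here.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def wfs_delays_and_gains(source_xy, speaker_xs, speaker_y=0.0):
    """Return one (delay_seconds, gain) pair per loudspeaker.

    Simplified delay-and-gain model: each loudspeaker replays the dry
    source signal, delayed by the extra time the wavefront of the virtual
    source would need to reach it, and attenuated with distance.
    """
    sx, sy = source_xy
    distances = [math.hypot(x - sx, speaker_y - sy) for x in speaker_xs]
    nearest = min(distances)
    if nearest == 0.0:
        raise ValueError("virtual source must not sit on a loudspeaker")

    result = []
    for d in distances:
        delay = (d - nearest) / SPEED_OF_SOUND  # nearest speaker fires first
        gain = math.sqrt(nearest / d)           # normalised to 1.0 at nearest
        result.append((delay, gain))
    return result


# Example: three loudspeakers one metre apart, source two metres behind the centre.
for delay, gain in wfs_delays_and_gains((0.0, -2.0), [-1.0, 0.0, 1.0]):
    print(f"delay = {delay * 1000:.3f} ms, gain = {gain:.3f}")
```

In this example the centre loudspeaker is closest to the virtual source, so it plays first and loudest, while the outer two play slightly later and quieter. Taken together, the staggered signals approximate the curved wavefront a real source at that position would produce.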