Project description
Innovative computational methods for describing environmental sounds
Everyday sounds can provide valuable information about our environment and the events taking place within it. However, current technology struggles to identify individual sound sources in complex soundscapes where multiple sounds are present and distorted by the surrounding environment. To address this problem, the EVERYSOUND project, funded by the European Research Council, aims to develop computational methods that automatically provide high-level descriptions of environmental sounds. The project will use innovative techniques such as joint source separation and robust pattern classification algorithms to reliably recognize multiple overlapping sounds. In addition, a hierarchical multilayer taxonomy will be developed to accurately categorize everyday sounds. The project's results will provide valuable tools for geographical, social, cultural, and biological studies.
Objective
Sounds carry a large amount of information about our everyday environment and the physical events that take place in it. For example, when a car is passing by, one can perceive the approximate size and speed of the car. Sound can easily and unobtrusively be captured, for example by mobile phones, and transmitted further: tens of hours of audio are uploaded to the internet every minute, for instance in the form of YouTube videos. However, today's technology is not able to recognize individual sound sources in realistic soundscapes, where multiple sounds are present, often simultaneously, and distorted by the environment.
The ground-breaking objective of EVERYSOUND is to develop computational methods which will automatically provide high-level descriptions of environmental sounds in realistic everyday soundscapes such as streets, parks, and homes. This requires developing several novel methods, including joint source separation and robust pattern classification algorithms to reliably recognize multiple overlapping sounds, and a hierarchical multilayer taxonomy to accurately categorize everyday sounds. The methods build on the applicant's internationally recognized and awarded expertise in source separation and robust pattern recognition for speech and music processing, which will now allow tackling the new and challenging research area of everyday sound recognition.
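As a loose illustration of what frame-level recognition of overlapping sounds can look like (this is a minimal sketch, not the project's actual method), the following Python example classifies log-mel spectrogram frames with a multi-label classifier so that several sound classes can be active at once. The feature settings (N_MELS, FRAME_HOP) and the scikit-learn back-end are assumptions made for the example.

```python
# Illustrative sketch only: per-frame multi-label sound event detection.
# Feature parameters and the classifier choice are assumptions, not the
# EVERYSOUND project's methods.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

N_MELS = 40       # assumed number of mel bands
FRAME_HOP = 512   # assumed hop size in samples

def logmel_features(path, sr=16000):
    """Load an audio file and compute per-frame log-mel features."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS,
                                         hop_length=FRAME_HOP)
    return librosa.power_to_db(mel).T   # shape: (frames, N_MELS)

def train_detector(X, Y):
    """X: stacked frame features; Y: binary matrix (frames x classes),
    where several classes may be active simultaneously (overlapping sounds)."""
    clf = MultiOutputClassifier(RandomForestClassifier(n_estimators=100))
    clf.fit(X, Y)
    return clf

def detect(clf, path):
    """Return per-frame activity predictions for each sound class."""
    return clf.predict(logmel_features(path))
```

In practice, labels in such a setup could be organized along a hierarchical taxonomy (e.g. "vehicle" above "car" and "bus"), so that predictions can be reported at different levels of detail; the class names here are purely hypothetical.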
The results of EVERYSOUND will enable searching for multimedia based on its audio content, which is not possible with today's technology. They will allow mobile devices, robots, and intelligent monitoring systems to recognize activities in their environments using acoustic information. Automatically producing descriptions of vast quantities of audio will provide new tools for geographical, social, cultural, and biological studies to analyze sounds related to human, animal, and natural activity in urban and rural areas, as well as multimedia in social networks.
Scientific field
- natural sciences > computer and information sciences > computational science
- natural sciences > biological sciences > ecology > ecosystems
- natural sciences > computer and information sciences > artificial intelligence > machine learning > deep learning
- natural sciences > computer and information sciences > artificial intelligence > pattern recognition
- natural sciences > computer and information sciences > artificial intelligence > computational intelligence
Funding scheme
ERC-STG - Starting Grant
Host institution
33100 Tampere
Finland