Community Research and Development Information Service - CORDIS

FP5

ALERT Report Summary

Project ID: IST-1999-10354
Funded under: FP5-IST
Country: France

Multilingual audio indexation system

The ALERT project aimed to combine state-of-the-art speech recognition with audio and video segmentation and automatic topic indexing, in order to develop an automatic media monitoring demonstrator and evaluate it in real-world applications. The targeted languages were French, German and Portuguese.

The multilingual indexation system provides state-of-the-art automatic transcription for the indexation of broadcast data in four languages: American English, French, German and Portuguese. The main components of the system are the audio partitioner, the speech recogniser and the topic detector. All are based on statistical modelling techniques. Data partitioning is based on a language-independent, iterative maximum likelihood segmentation/clustering procedure using Gaussian mixture models and agglomerative clustering. The speech recogniser uses continuous density HMMs with Gaussian mixtures for acoustic modelling, and 4-gram statistics estimated on large text corpora for language modelling. Word recognition is performed in multiple passes: the hypotheses from an initial pass are used for cluster-based acoustic model adaptation, which improves word graph generation in the following passes.
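The agglomerative clustering step of the partitioner can be illustrated with a minimal sketch. It is not the ALERT implementation: it assumes one diagonal Gaussian per cluster instead of a Gaussian mixture, omits the iterative resegmentation, and stops merging on a hypothetical `penalty` threshold. The function names and the variance floor are likewise assumptions made for the example.

```python
import math

VAR_FLOOR = 1e-3  # assumed variance floor, keeps the logarithm finite


def gaussian_loglik(frames):
    """Log-likelihood of the frames under one diagonal Gaussian fitted to them."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        values = [f[d] for f in frames]
        mean = sum(values) / n
        sq = sum((v - mean) ** 2 for v in values)
        var = max(sq / n, VAR_FLOOR)
        total += -0.5 * n * math.log(2 * math.pi * var) - 0.5 * sq / var
    return total


def cluster_segments(segments, penalty):
    """Greedy bottom-up clustering of segments (lists of feature vectors).

    At each step the pair of clusters whose merge loses the least
    log-likelihood is merged; merging stops when that loss exceeds
    `penalty` (a hypothetical, hand-set threshold).
    Returns a list of clusters, each a sorted list of segment indices.
    """
    clusters = [(list(seg), [i]) for i, seg in enumerate(segments)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                fa, fb = clusters[a][0], clusters[b][0]
                loss = (gaussian_loglik(fa) + gaussian_loglik(fb)
                        - gaussian_loglik(fa + fb))
                if best is None or loss < best[0]:
                    best = (loss, a, b)
        loss, a, b = best
        if loss > penalty:
            break
        merged = (clusters[a][0] + clusters[b][0],
                  clusters[a][1] + clusters[b][1])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return [sorted(ids) for _, ids in clusters]


# Toy data: segments 0 and 1 come from one acoustic source, 2 and 3 from another.
toy_segments = [
    [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    [[0.1, 0.0], [-0.2, 0.1], [0.0, -0.1]],
    [[5.0, 5.1], [5.2, 4.9], [4.9, 5.0]],
    [[5.1, 5.0], [4.8, 5.1], [5.0, 4.9]],
]
toy_clusters = cluster_segments(toy_segments, penalty=10.0)
```

Each resulting cluster groups segments that are acoustically similar, which is what allows the later recognition passes to adapt the acoustic models per cluster rather than per segment.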

The spoken document retrieval demonstrator returns audio and/or video segments matching a typed natural language query. The extracts are selected from automatically derived transcriptions of shows. The demonstrator also displays the result of the partitioning process (speaker and acoustic condition labels), and the speech transcriptions synchronised with the audio signal.
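As an illustration of how time-stamped transcriptions can be matched against a typed query, the sketch below ranks transcript segments with a simple TF-IDF score. The summary does not state which retrieval model the demonstrator uses, so the scoring scheme, the `rank_segments` function and the segment fields (`show`, `start`, `end`, `text`) are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def rank_segments(query, segments, top_k=3):
    """Return up to top_k segments ranked by a TF-IDF match with the query.

    Each segment is a dict with hypothetical keys 'show', 'start', 'end'
    (times in seconds) and 'text' (the automatic transcription).
    Segments that share no term with the query are not returned.
    """
    docs = [Counter(tokenize(seg["text"])) for seg in segments]
    n = len(docs)
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())

    query_terms = set(tokenize(query))
    scored = []
    for seg, counts in zip(segments, docs):
        score = 0.0
        for term in query_terms:
            if counts[term]:
                idf = math.log(1 + n / doc_freq[term])
                score += counts[term] * idf
        if score > 0:
            scored.append((score, seg))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seg for _, seg in scored[:top_k]]


# Toy transcriptions with start and end times, as a partitioner might produce.
toy_transcripts = [
    {"show": "news_a", "start": 0.0, "end": 12.5,
     "text": "the parliament voted on the new budget today"},
    {"show": "news_a", "start": 12.5, "end": 30.0,
     "text": "heavy rain caused flooding in the north"},
    {"show": "news_b", "start": 4.0, "end": 20.0,
     "text": "the budget debate continues in parliament as the budget deadline nears"},
]
hits = rank_segments("budget vote in parliament", toy_transcripts, top_k=2)
```

Because each hit keeps its `start` and `end` times, a player could jump directly to the matching extract in the audio or video signal.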

More information on ALERT: http://alert.uni-duisburg.de/start.html

Related information

Contact

Jean Luc GAUVAIN (Director of research)
Tel.: +33-16-9858063
Fax: +33-16-9858088