Community Research and Development Information Service - CORDIS

Multilingual audio indexation system

The ALERT project aimed to combine state-of-the-art speech recognition with audio and video segmentation and automatic topic indexing, in order to develop an automatic media monitoring demonstrator and evaluate it in real-world applications. The targeted languages were French, German and Portuguese.

The multilingual indexation system combines state-of-the-art automatic transcription capabilities for the indexation of broadcast data in four languages: American English, French, German and Portuguese. The main components of the system are the audio partitioner, the speech recogniser and the topic detector, all of which are based on statistical modelling techniques. Data partitioning relies on a language-independent, iterative maximum-likelihood segmentation and clustering procedure that uses Gaussian mixture models and agglomerative clustering. The speech recogniser uses continuous-density HMMs with Gaussian mixtures for acoustic modelling and 4-gram statistics estimated on large text corpora for language modelling. Word recognition is performed in multiple passes: the initial hypotheses are used for cluster-based acoustic model adaptation, which improves word graph generation.
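
The agglomerative clustering step of the partitioner can be illustrated with a small sketch. It is a deliberate simplification and not the ALERT implementation: each cluster is modelled by a single one-dimensional Gaussian rather than a Gaussian mixture over acoustic feature vectors, the segment boundaries are taken as given, and the function names and the `max_loss` stopping threshold are assumptions made for this example.

```python
import math


def gaussian_loglik(values, var_floor=1e-3):
    """Log-likelihood of 1-D values under a Gaussian fitted to those same values."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    var = max(ss / n, var_floor)  # floor keeps the variance away from zero
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def cluster_segments(segments, max_loss=5.0):
    """Greedy agglomerative clustering of segments.

    At each step, merge the two clusters whose merge loses the least
    log-likelihood. Stop when even the best merge loses more than max_loss
    (a hypothetical threshold chosen for this toy example).
    """
    clusters = [{"members": [i], "values": list(seg)}
                for i, seg in enumerate(segments)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                va, vb = clusters[a]["values"], clusters[b]["values"]
                loss = (gaussian_loglik(va) + gaussian_loglik(vb)
                        - gaussian_loglik(va + vb))
                if best is None or loss < best[0]:
                    best = (loss, a, b)
        loss, a, b = best
        if loss > max_loss:
            break
        merged = {
            "members": clusters[a]["members"] + clusters[b]["members"],
            "values": clusters[a]["values"] + clusters[b]["values"],
        }
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c["members"]) for c in clusters)


# Toy data: two segments near 0.0 and two near 5.0 (made-up feature values).
toy_segments = [
    [0.0, 0.2, 0.1, -0.1],
    [0.1, -0.2, 0.0, 0.15],
    [5.0, 5.2, 4.9, 5.1],
    [5.1, 4.8, 5.0, 5.3],
]
print(cluster_segments(toy_segments))
```

On the toy data the two low-valued segments end up in one cluster and the two high-valued segments in another, which mirrors how segments from the same speaker or acoustic condition would be grouped. The procedure described above additionally alternates clustering with resegmentation, which this sketch omits.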

The spoken document retrieval demonstrator returns audio and/or video segments matching a typed natural language query. The extracts are selected from automatically derived transcriptions of shows. The demonstrator also displays the result of the partitioning process (speaker and acoustic condition labels), and the speech transcriptions synchronized with the audio signal.
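
As a rough illustration of retrieving segments from time-stamped transcriptions, the sketch below ranks transcript segments against a typed query. The source does not say how the demonstrator scores matches, so the TF-IDF-style weighting, the segment fields (`show`, `start`, `end`, `text`) and the function names are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(segments, query, top_k=3):
    """Return up to top_k transcript segments ranked by a TF-IDF-style score."""
    docs = [Counter(tokenize(seg["text"])) for seg in segments]
    n = len(docs)
    # Document frequency: number of segments in which each term occurs.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    query_terms = tokenize(query)
    scored = []
    for seg, doc in zip(segments, docs):
        score = sum(doc[t] * math.log(1 + n / df[t])
                    for t in query_terms if t in doc)
        if score > 0:
            scored.append((score, seg))
    # Sort on the score only; segments with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seg for _, seg in scored[:top_k]]


# Made-up transcript segments with start and end times in seconds.
toy_transcripts = [
    {"show": "news_a", "start": 0.0, "end": 12.5,
     "text": "the parliament voted on the new budget today"},
    {"show": "news_a", "start": 12.5, "end": 30.0,
     "text": "heavy rain caused flooding in the north"},
    {"show": "news_b", "start": 4.0, "end": 20.0,
     "text": "the budget debate continues in parliament"},
]

for hit in search(toy_transcripts, "flooding in the north"):
    print(hit["show"], hit["start"], hit["end"], hit["text"])
```

Because each hit carries its show identifier and time offsets, a player could jump straight to the matching audio or video extract, which is the behaviour the demonstrator description implies.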

Reported by

LIMSI - Laboratoire d'Informatique pour la Mecanique et les Sciences de l'Ingenieur - CNRS
LIMSI-CNRS, bat. 508
91403 Orsay