CORDIS - EU research results
Content archived on 2024-05-24

Combined Image and Word Spotting

Objective

This project aims to facilitate common procedures of archiving and retrieval of audio-visual material. The objective of the project is to develop and integrate a robust unrestricted keyword spotting algorithm and an efficient image spotting algorithm specially designed for digital audio-visual content, leading to the implementation and demonstration of a practical system for efficient retrieval in multimedia databases.

Specifically, a system will be developed to automatically retrieve images, video, and speech frames from an audio-visual database based on keywords entered by the user through keyboard or speech. Combined word and image spotting will be used and will provide an efficient mechanism enabling focused and precise searches with improved functionality and robustness. The CIMWOS system aims to become a valuable assistant in promoting the re-use of existing resources, thus cutting down the budgets of new productions.

Work description:
Today, a vast amount of information is accumulated in the form of video, pictures, and audio, which does not lend itself to automated searching. To improve the usability of these invaluable resources, indexing is required, which is currently a very expensive and time-consuming task carried out mainly by hand by experts. In view of the expansion of digital television and of video-based communications and related applications, an editor-like tool that allows the user to see/hear, select/modify and search over audio-visual databases becomes indispensable.

Although some European projects are addressing the issue of automated indexing of audio-visual material based on subtitles and speech recognition, the problem of locating important video clips based on their image contents has not been addressed. CIMWOS will use a dual audio and visual approach to locate important clips within multimedia material employing state-of-the-art algorithms for both image and speech recognition. Image processing algorithms will extract features to be used for pattern matching to recognise object classes. Continuous speech recognition algorithms will locate keywords in sound-clips and in the soundtracks of the video-clips, enabling more focused and precise searches. The search for object classes (e.g. face templates) and repeating patterns will be carried out off-line, making higher level descriptors available for the on-line search.

Similarly, automatic speech recognition will be performed off-line, and an indexing mechanism will associate text fragments with audio fragments so that the audio contents of the database can be referenced through their text transcription. Only text-based retrieval algorithms will be involved on-line. Modern text retrieval algorithms will be enhanced and applied to ensure fast, efficient, and effective information retrieval from the multimedia database. CIMWOS will create and maintain a set of indexes to the multimedia contents with initial support for three European languages, namely English, French and Greek, while the system design will ensure an open architecture so that more languages can be added in the future. Users will be able to perform speech-based, image-based, and mixed searches on multiple criteria for text-based retrieval, based on the automatically generated annotations. The results of the searches will first be transmitted in a compacted "preview" format before the actual content is downloaded, enabling users to determine which information will actually be retrieved.
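
The off-line indexing and on-line text-based retrieval described above can be illustrated with a minimal sketch. It is not the CIMWOS implementation: the `Segment` type, the function names, the sample annotations, and the assumption that the speech and image passes annotate the same segment boundaries are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical time-stamped fragment of a clip in the database."""
    clip_id: str
    start_s: float
    end_s: float


def build_index(annotations):
    """Off-line step: map each lower-cased term to the segments it annotates."""
    index = defaultdict(set)
    for segment, terms in annotations:
        for term in terms:
            index[term.lower()].add(segment)
    return index


def _ordered(segments):
    """Sort segments by clip and start time for stable output."""
    return sorted(segments, key=lambda s: (s.clip_id, s.start_s))


def search(index, terms):
    """On-line step: segments that match every query term (AND semantics)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    if not sets:
        return []
    return _ordered(set.intersection(*sets))


def mixed_search(speech_index, image_index, spoken_terms, visual_labels):
    """Segments matching both the spoken keywords and the visual labels."""
    spoken = set(search(speech_index, spoken_terms))
    visual = set(search(image_index, visual_labels))
    return _ordered(spoken & visual)


# Illustrative annotations, as if produced by the off-line recognisers.
seg_a = Segment("news_01", 0.0, 12.5)
seg_b = Segment("news_01", 12.5, 30.0)
seg_c = Segment("doc_07", 5.0, 20.0)

speech_index = build_index([
    (seg_a, ["election", "results"]),
    (seg_b, ["weather", "forecast"]),
    (seg_c, ["election", "campaign"]),
])
image_index = build_index([
    (seg_a, ["face"]),
    (seg_b, ["map"]),
    (seg_c, ["crowd"]),
])

speech_hits = search(speech_index, ["election"])
mixed_hits = mixed_search(speech_index, image_index, ["election"], ["face"])
```

In this sketch a speech-only query for "election" returns both matching segments, while the mixed query narrows the result to the one segment whose visual annotations also include "face". Because only set lookups on precomputed indexes happen at query time, the costly recognition work stays off-line, which mirrors the split described in the text.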

Milestones:
The CIMWOS system will be a powerful tool in the hands of the world of media and television, video, news broadcasting, show business, advertising, and any organisation that produces, markets and/or broadcasts video and audio programmes, facilitating common procedures of retrieving audio-visual material during research, the production of a documentary, etc.

Utilising the vast amounts of information accumulated in audio and video, the CIMWOS system will become an invaluable assistant in promoting the re-use of existing resources and cutting down the budgets for new productions.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on natural language processing techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "Type of Action") within a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.

CSC - Cost-sharing contracts

Coordinator

INSTITUTE FOR LANGUAGE AND SPEECH PROCESSING
EU contribution
No data
Address
EPIDAVROU & ARTEMIDOS 6
15125 MAROUSSI - ATHENS
Greece

Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Participants (5)
