CORDIS - EU research results
Content archived on 2024-05-24

Combined Image and Word Spotting

Objective

This project aims to facilitate common procedures of archiving and retrieval of audio-visual material. The objective of the project is to develop and integrate a robust unrestricted keyword spotting algorithm and an efficient image spotting algorithm specially designed for digital audio-visual content, leading to the implementation and demonstration of a practical system for efficient retrieval in multimedia databases.

Specifically, a system will be developed to automatically retrieve images, video, and speech frames from an audio-visual database based on keywords entered by the user through keyboard or speech. Combined word and image spotting will provide an efficient mechanism for focused and precise searches with improved functionality and robustness. The CIMWOS system aims to become a valuable assistant in promoting the re-use of existing resources, thus cutting down the budgets of new productions.

Work description:
Today, a vast amount of information is accumulated in the form of video, pictures, and audio, which does not lend itself to automated searching. Improving the usability of these invaluable resources requires indexing, which is currently an expensive and time-consuming task carried out mainly by hand by experts. With the expansion of digital television, video-based communications, and related applications, an editor-like tool that lets the user see/hear, select/modify, and search over audio-visual databases becomes indispensable.

Although some European projects are addressing the issue of automated indexing of audio-visual material based on subtitles and speech recognition, the problem of locating important video clips based on their image contents has not been addressed. CIMWOS will use a dual audio and visual approach to locate important clips within multimedia material employing state-of-the-art algorithms for both image and speech recognition. Image processing algorithms will extract features to be used for pattern matching to recognise object classes. Continuous speech recognition algorithms will locate keywords in sound-clips and in the soundtracks of the video-clips, enabling more focused and precise searches. The search for object classes (e.g. face templates) and repeating patterns will be carried out off-line, making higher level descriptors available for the on-line search.

Similarly, automatic speech recognition will be performed off-line, and an indexing mechanism will associate text fragments with audio fragments so that the audio contents of the database can be referenced through their text transcription. Only text-based retrieval algorithms will be involved on-line. Modern text retrieval algorithms will be enhanced and applied to ensure fast, efficient, and effective information retrieval from the multimedia database. CIMWOS will create and maintain a set of indexes to the multimedia contents with initial support for three European languages, namely English, French and Greek, while the system design will ensure an open architecture so that more languages can be added in the future. Users will be able to perform speech-based, image-based, and mixed searches on multiple criteria for text-based retrieval, based on the automatically generated annotations. Search results will first be transmitted in a compact "preview" format before the actual content is downloaded, enabling users to decide which information will actually be retrieved.
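
The split between off-line annotation and on-line text-only lookup can be illustrated with a small inverted index. This is a minimal sketch and not the CIMWOS implementation: the `Segment` record, the `AnnotationIndex` class, the `overlaps` helper, and the sample clip identifiers are all hypothetical names chosen for illustration, and the "mixed" search rule (keep a speech hit only if an image hit overlaps it in time within the same clip) is an assumption about how the two criteria might be combined.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A time span inside one clip (hypothetical record layout)."""
    clip_id: str
    start_s: float
    end_s: float


def overlaps(a, b):
    """True when two segments share time within the same clip."""
    return a.clip_id == b.clip_id and a.start_s < b.end_s and b.start_s < a.end_s


class AnnotationIndex:
    """Inverted index from annotation terms to clip segments."""

    def __init__(self):
        self._speech = defaultdict(set)  # spoken word  -> segments
        self._image = defaultdict(set)   # object class -> segments

    def add_transcript(self, segment, text):
        # Off-line step: index every word of a recognised transcript.
        for raw in text.lower().split():
            word = raw.strip(".,;:!?")
            if word:
                self._speech[word].add(segment)

    def add_object(self, segment, label):
        # Off-line step: index an object class detected in the frames.
        self._image[label.lower()].add(segment)

    def search(self, word=None, label=None):
        # On-line step: dictionary lookups only, no signal processing.
        hits = None
        if word is not None:
            hits = set(self._speech.get(word.lower(), set()))
        if label is not None:
            by_image = self._image.get(label.lower(), set())
            if hits is None:
                hits = set(by_image)
            else:
                # Mixed search (assumed rule): keep speech hits that
                # overlap an image hit in the same clip.
                hits = {s for s in hits if any(overlaps(s, i) for i in by_image)}
        return sorted(hits or [], key=lambda s: (s.clip_id, s.start_s))


# Made-up sample annotations.
index = AnnotationIndex()
index.add_transcript(Segment("news_01", 12.0, 17.5), "The minister arrived in Athens.")
index.add_object(Segment("news_01", 10.0, 15.0), "face")
index.add_object(Segment("news_02", 3.0, 8.0), "face")

results = index.search(word="athens", label="face")
for seg in results:
    print(seg.clip_id, seg.start_s, seg.end_s)
```

Because all recognition work happens when annotations are added, a query touches only the two dictionaries, which is what keeps the on-line side fast in a design of this kind. A real system would also need per-language tokenisation and ranking, which this sketch leaves out.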

Milestones:
The CIMWOS system will be a powerful tool for media and television, video, news broadcasting, show business, advertising, and any organisation that produces, markets and/or broadcasts video and audio programmes. It will facilitate common procedures for retrieving audio-visual material, for example during research or the production of a documentary.

Utilising the vast amounts of information accumulated in audio and video, the CIMWOS system will become an invaluable assistant in promoting the re-use of existing resources and cutting down the budgets for new productions.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "Type of Action") within a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.

CSC - Cost-sharing contracts

Coordinator

INSTITUTE FOR LANGUAGE AND SPEECH PROCESSING
EU contribution
No data
Address
EPIDAVROU & ARTEMIDOS 6
15125 MAROUSSI - ATHENS
Greece

Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Participants (5)
