Knowledge & Content Technologies

VIDI-VIDEO - Interactive semantic video search with a large thesaurus of machine-learned audio-visual concepts

VIDI-VIDEO will substantially enhance access to video, by developing a semantic search engine. The project will boost the performance of video search by developing a thesaurus for automatically detecting instances of semantic concepts in the audio-visual content.


The scientific impact is to achieve semantic video retrieval by learning a very large thesaurus of concepts. The technological impact is to improve the indexing and retrieval practices currently employed by broadcasting archivists. The societal impact is to increase people's ability to access information.

Main innovation

Video is vital to society and the economy. It plays a key role in the distribution of and access to information, and it will soon be the natural form of communication for the Internet and mobile phones. Current search engines, however, all rely on keyword-based access, leaving semantic access to the data as a research problem.

VIDI-Video aims to boost the performance of video search by building a thesaurus of 1,000 detectors, each of which localizes its corresponding semantic concept in the audio stream, the visual stream, or the two combined. The approach is to let the system learn many, mostly weak, semantic detectors instead of carefully modeling a few. These detectors will describe different aspects of the video content. In combination they will provide a rich basis for interactive access to the video library.
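As a rough illustration of how many weak detectors might be combined, the Python sketch below ranks video shots by averaging the scores of the concept detectors named in a query. The concept names, the scores, and the averaging rule are all hypothetical; the project description does not say how detector outputs are actually fused.

```python
from statistics import mean

# Hypothetical detector outputs: shot id -> {concept: confidence in [0, 1]}.
# In a real system these would come from learned audio-visual detectors.
SHOT_SCORES = {
    "shot_a": {"outdoor": 0.9, "crowd": 0.7, "speech": 0.2},
    "shot_b": {"outdoor": 0.4, "crowd": 0.3, "speech": 0.9},
    "shot_c": {"outdoor": 0.8, "crowd": 0.1, "speech": 0.1},
}


def rank_shots(shot_scores, query_concepts):
    """Rank shots by the mean score of the queried concept detectors.

    Averaging is an assumed, deliberately simple fusion rule: each
    detector may be weak on its own, but several together give a
    more informative ranking signal.
    """
    combined = {
        shot: mean(scores.get(concept, 0.0) for concept in query_concepts)
        for shot, scores in shot_scores.items()
    }
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)


ranking = rank_shots(SHOT_SCORES, ["outdoor", "crowd"])
print(ranking)
```

With a thesaurus of many such detectors, a query can be expressed as any subset of concepts, and each detector only needs to be somewhat better than chance to contribute to the ranking.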

Results so far

VIDI-Video starts by integrating state-of-the-art components from machine learning, audio detection, video processing, interaction and visualization into a system which competed successfully in TRECVID's interactive search competition (TRECVID is a benchmark on video search organized by the U.S. National Institute of Standards and Technology). The project aims to improve in particular on machine learning techniques, visual and audio analysis techniques, and effective interaction.

Concrete outputs will be a fully implemented audio-visual search engine, consisting of two main parts, viz. a learning system and a runtime system, where the former will feed its results into the latter after each round of training and thesaurus update. The learning system will consist of software to be developed for overall video processing; visual analysis; audio analysis; integrated feature detector; and multimedia query and user interface. All subsystems will be delivered and available both as stand-alone components and integrated into these two final, connected systems. The modularity and simultaneous stand-alone availability of each subsystem ensure developmental independence and efficient exploitation, as commercial opportunities often target components rather than entire systems.

More details
Upcoming work
  • Exploitation plan, including initial sections of State of the Art, and Market Reports on video processing
  • Video processing software, first version
  • Video enriched with shot segmentation and representation
  • Video enriched with audio descriptors
  • Audio analysis software, first version
  • Setting up user groups
  • Evaluation of legal issues concerning copyright and authors' rights
  • Publicity material & community building strategy
  • Public Website
  • Plan for the use and dissemination of knowledge
Administrative Details
  • VIDI-VIDEO (IST-045547) is a Specific Targeted Research Project of the European Union's 6th Framework Programme - call 6.
  • The project started on 1 February 2007 and finishes on 31 January 2010.
  • There are 8 partners from 6 European countries involved in the project, and the overall funding is 2.79 million euro.
List of Participants
  • Project Coordinator : Universiteit van Amsterdam, The Netherlands
  • Informatics and Telematics Institute, Greece
  • Institute for Systems and Computer Engineering, Portugal
  • University of Surrey, UK
  • Università degli Studi di Firenze, Italy
  • Universitat Autonoma de Barcelona, Spain
  • Beeld en Geluid, The Netherlands
  • Fondazione Rinascimento Digitale, Italy
Contact Persons
Project Coordinator: Arnold Smeulders
Intelligent Systems Lab Amsterdam, University of Amsterdam, The Netherlands
Events in connection with VIDI-VIDEO
  • ICIAP - The 14th International Conference on Image Analysis and Processing, as organised by VIDI-VIDEO's professor Rita Cucchiara of the Università di Modena e Reggio Emilia. VIDI-VIDEO provided quite a few speakers and organised its second meeting there. September 2007, Modena, Italy.
  • VMDL07 - The 11th DELOS Thematic Workshop on Visual and Multimedia Digital Libraries, organised by the DELOS network of excellence. Members of VIDI-VIDEO presented their work. September 2007, Modena, Italy.
  • NEM - The European Technology Platform, 5th general assembly meeting. October 2007, Brussels.
  • IBC2007 - The International Broadcasting Conference. September 2007, Amsterdam.
  • PICNIC 2007. This event was devoted to creativity and innovation in the media, technology and entertainment industries. September 2007, Amsterdam.
  • ICCV2007 - The International Conference on Computer Vision. October 2007, Rio de Janeiro.
  • TREC Video 2007. The event is sponsored by the National Institute of Standards and Technology (NIST) with additional support from other U.S. government agencies. The goal of the conference series is to encourage research in information retrieval for organizations interested in comparing their results. November 2007, Gaithersburg, USA.