Automatic feature extraction for efficient retrieval in large multimedia database


The importance attributed to the efficient organisation, management and utilisation of the information flow in electronic publishing, in particular multimedia information, makes image database systems with high-performance retrieval facilities a pivotal issue for future attempts to develop competitive markets on a Europe-wide basis. The user-friendliness and "human orientation" of these systems strongly affect user productivity, conditioning, in terms of both quality and quantity, the way users work and what they produce.

The user domain for FORMULA is Electronic Publishing, understood as those publishing activities (e.g. press, CDs, video clips) that involve electronic support for specific operations, such as the management of electronic images. Publishers have to deal daily with heterogeneous multimedia data. Composing their product requires collecting data, such as pictures or video, from large and unstructured databases, electronic or otherwise. Handling the data is never simple. Even with the support of an electronic database, insertion and retrieval of data are costly and complex, and the data management process is inefficient. Such limitations waste time and money and prevent a better optimised production process, which restricts potential business. The proposed project aims to improve the electronic publishing process for all those involved (publishers, agencies, photo reporters) by facilitating enhanced management of images and global remote access to image banks.

The result of the project will be a demonstrator for image database management, proving the viability of integrating new technologies based on automatic extraction of image annotations (during insertion) and on a pictorial content-based approach (during retrieval), as an innovative approach to image handling in electronic publishing.
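The two phases named above (automatic feature extraction at insertion, content-based matching at retrieval) can be illustrated with a minimal sketch. It is not the FORMULA design: the choice of a coarse colour histogram as the extracted feature, histogram intersection as the similarity measure, and the names `extract_features`, `similarity` and `ImageIndex` are all assumptions made purely for illustration.

```python
from collections import Counter


def extract_features(pixels, bins=4):
    """Return a normalised colour histogram for a list of (r, g, b) pixels.

    Assumption: a coarse colour histogram stands in for the automatically
    extracted image annotation; the project text does not name a feature.
    """
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}


def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(value, h2.get(bucket, 0.0)) for bucket, value in h1.items())


class ImageIndex:
    """Hypothetical in-memory image bank keyed by image identifier."""

    def __init__(self):
        self._features = {}

    def insert(self, image_id, pixels):
        # Insertion: features are computed automatically, no manual text annotation.
        self._features[image_id] = extract_features(pixels)

    def query(self, pixels, top_k=3):
        # Retrieval: rank stored images by pictorial similarity to the example image.
        wanted = extract_features(pixels)
        ranked = sorted(
            self._features.items(),
            key=lambda item: similarity(wanted, item[1]),
            reverse=True,
        )
        return [image_id for image_id, _ in ranked[:top_k]]


index = ImageIndex()
index.insert("sunset", [(250, 80, 10)] * 10)
index.insert("forest", [(20, 200, 30)] * 10)
index.insert("sea", [(10, 40, 220)] * 10)

# Query by example: a mostly orange image with a little blue in it.
best_matches = index.query([(240, 90, 20)] * 5 + [(10, 40, 220)], top_k=2)
```

In this sketch the query is itself an image (query by example), so no textual annotation is needed at either insertion or retrieval time; a real system would replace the histogram with whatever features the user requirements analysis calls for.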

It will adapt and integrate different technologies from the various related domains (Human Computer Interfaces (HCI), Pictorial Content Based Insertion and Retrieval techniques, Multimedia Database (MMDB), WWW and Internet technologies for the higher levels; ISDN (Integrated Services Digital Network) and other underlying technologies for handling the multimedia data), according to the identified user needs.

This demonstrator will result in a dramatic reduction in image manipulation costs for electronic publishers (costs for image annotation, image retrieval and administration) by:

- reducing the need for extensive textual annotation of images, so that image insertion becomes a simple scanning operation. This will save at least 75% of the time involved;
- drastically reducing the time spent on image retrieval (by not less than 50%). This will be due to the higher precision of the images retrieved for a given query, as well as to operational speed.

The project will follow a life-cycle-oriented approach, focusing first on user needs and then on exploitation, and beginning with an extensive analysis of the user requirements. The overall system will be designed, implemented and tested/evaluated at the user sites. A rapid prototyping approach will be adopted. Demonstrator development aims to define a general solution, capable of being extended from the publishing domain to others dealing with multimedia or to other industrial sectors and processes.

