This proposal concerns the processing of meetings held in a room equipped with multimodal sensors. The overall objective is to construct a demonstration system enabling the offline structuring, browsing and querying of an archive of meetings. The project will include the design, collection and annotation of a multimodal meetings database, the processing of audio/video streams, and the integration and structuring of these streams using the outputs of various recognisers and analysers. We assume the availability of textual side information (e.g., an agenda), which enables the application of some useful constraints. The expected results of the project include a demonstrator system, together with advances in models and algorithms for multimodal recognition, integration and information access.
Construction of a system to enable the structuring, browsing and querying of an archive of meetings held in a room equipped with multimodal sensors:
1. Development of a smart meeting room; multimodal data collection and annotation;
2. Analysis and processing of audio/video streams; robust conversational speech recognition; gesture/action recognition; identification of emotion and intent; person identification; source localization and tracking;
3. Integration, structuring and information access: information management framework; multi-stream integration models and algorithms; meeting summarization; multimodal information retrieval and extraction;
4. Construction of a demonstration system for browsing and accessing information from an archive of processed meetings;
5. Evaluation at the system and component technology level.
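The integration and browsing objectives above rest on one recurring operation: merging the time-stamped outputs of several recognisers (speech, gesture, speaker identity) into a single queryable timeline. As a minimal sketch of that idea, the following illustrative Python fragment is entirely hypothetical (the `Event` record, `merge_streams` and `events_between` are names invented here, not part of any M4 deliverable):

```python
from dataclasses import dataclass

# Hypothetical event record: one item emitted by a recogniser
# (speech, gesture, speaker ID, ...) with its time interval in seconds.
@dataclass(frozen=True)
class Event:
    start: float   # onset, seconds from meeting start
    end: float     # offset, seconds from meeting start
    stream: str    # which recogniser produced it, e.g. "speech" or "gesture"
    label: str     # the recogniser's output, e.g. a word or an action name

def merge_streams(streams):
    """Merge per-recogniser event lists into one timeline sorted by onset."""
    merged = [e for events in streams.values() for e in events]
    return sorted(merged, key=lambda e: (e.start, e.end))

def events_between(timeline, t0, t1):
    """Return events overlapping [t0, t1] -- a minimal browsing query."""
    return [e for e in timeline if e.start < t1 and e.end > t0]
```

A browser built over such a timeline can answer questions like "what was said and done between minutes 2 and 4", which is the kind of cross-stream access the archive is meant to support.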
The work is divided into five work packages (WPs), plus project management.
WP1 (Smart Meeting Room, Data Collection and Annotation) is concerned with the specification of the smart room environment and of data collection and annotation protocols, resulting in the M4 meeting corpus;
WP2 (Multimodal Recognition) deals with the development of multimodal recognisers that transform raw audio and video streams into higher-level streams. The work will focus on extending the partners' existing work in speech recognition and action/gesture recognition, ported to the M4 domain. It will also involve investigations of multimodal person identification, emotion and intention recognition, and source localization and tracking. The higher-level streams generated in WP2 will form the basis for the integration and information access operations of WP3.
WP3 (Multimodal Integration) focuses on the principled integration of multiple streams, and on the development of information access methods to enable retrieval, browsing and summarization over an archive of multi-stream meeting data;
WP3 is a key element of M4, since it forms a bridge between the multimodal recognition level (WP2) and the application demonstrator (WP4);
WP4 (Demonstration and Evaluation) consists of the construction of an offline demonstration system for the Multimodal Meeting Manager, along with formal and informal evaluation of the system as a whole and of its component technologies;
WP5 is concerned with Dissemination, Exploitation and Evaluation. A key aspect of this WP is the large Industrial Advisory Board set up by the project, with representatives from industrial areas that could exploit the results of M4.
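The retrieval side of the WP2-to-WP4 bridge can be pictured as indexing transcribed meeting segments so that a browser can jump to the moments where a term was mentioned. The sketch below is a deliberately simplified illustration under assumed conventions (the `index_archive`/`query` functions and the archive layout of `(start_time, transcript)` pairs are inventions of this example, not project specifications):

```python
import re
from collections import defaultdict

def index_archive(archive):
    """Build an inverted index: word -> list of (meeting_id, segment_start).

    `archive` maps meeting ids to lists of (start_time, transcript) segments,
    e.g. as produced by a speech recogniser over the meeting audio.
    """
    index = defaultdict(list)
    for meeting_id, segments in archive.items():
        for start, text in segments:
            # index each distinct word of the segment once
            for word in set(re.findall(r"[a-z']+", text.lower())):
                index[word].append((meeting_id, start))
    return index

def query(index, word):
    """Return sorted (meeting_id, start_time) hits mentioning `word`."""
    return sorted(index.get(word.lower(), []))
```

Each hit points back into the original recordings, so the demonstrator could replay the corresponding audio/video around that time stamp.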
Expected result: a demonstration system for structuring, browsing and querying an archive of meetings recorded in a room equipped with a variety of multimodal sensors. Milestones:
1) Specification and implementation of smart meeting room environment and data collection/annotation protocol;
2) Development of multimodal recognisers;
3) Development of methods for multimodal integration and information access;
4) Design, implementation and evaluation of M4 demonstrator.
Funding Scheme: CSC - Cost-sharing contracts
Partner locations: 2628 VK Delft; EH8 9YL Edinburgh; 1211 Geneve 4; 601 90 Brno Stred