List of Domains and Chains
Multimedia Domain

Multimodal Verification for Teleservices and Security Applications

Main Objective
The primary goal of the M2VTS project is to address the issue of secured access to local and centralised services in a multimedia environment. The main objective is to extend the scope of network-based services by adding novel and intelligent functionalities, enabled by automatic verification systems that combine multimodal strategies (secured access based on speech, image and other information). A further objective is to show that the limitations of individual technologies (speech recognition, speaker verification, etc.) can be overcome by relying on multimodal decisions (combination or fusion of these technologies), which can find practical and important applications in the emerging field of advanced interfaces for teleservices.
These are therefore the main goals of the project.

Technical Approach
The project will provide a first pilot demonstrator after 12 months, to be tested by end-users who have expressed interest in the project (most of them project partners). Verification will rely on several modalities, which can be categorised as either visual-based or speech-based. Among the visual information, the face is the most significant for verification and will be the basis of the image modalities addressed. Traditionally, faces are handled as 2D objects as acquired by a camera; to achieve pose independence of the head, 3D information will also be used, and several ways of recovering it will be investigated. One of the most important aspects of this project is the combination of the information from all available modalities.
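The combination step can be illustrated with a simple score-level fusion rule: each modality produces a match score in [0, 1], and a weighted sum of the scores is compared against an acceptance threshold. The weights and threshold below are illustrative assumptions for the sketch, not the project's actual parameters.

```python
# Illustrative score-level fusion of multimodal verification scores.
# Weights and threshold are assumed values, not M2VTS parameters.

def fuse_scores(scores, weights, threshold=0.6):
    """Accept a claimed identity when the weighted average of the
    per-modality match scores (each in [0, 1]) reaches the threshold."""
    if set(scores) != set(weights):
        raise ValueError("every modality needs both a score and a weight")
    total_weight = sum(weights.values())
    fused = sum(scores[m] * weights[m] for m in scores) / total_weight
    return fused >= threshold, fused

# Example: speech is weighted slightly higher than the image modalities.
scores = {"speech": 0.9, "face": 0.7, "profile": 0.4}
weights = {"speech": 0.5, "face": 0.3, "profile": 0.2}
accepted, fused = fuse_scores(scores, weights)  # fused is about 0.74, accepted
```

The attraction of such a rule is that a weak score in one modality (here the profile, at 0.4) can be compensated by strong scores in the others, which is precisely the limitation-overcoming behaviour the project aims to demonstrate.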

Key Issues
It will extend the usability of network-based services by adding secured access.
It will allow mobility by providing a service from any location.
It will be validated and tested in at least three European languages.
It will demonstrate novel technologies for user authentication based on speech and image recognition, leading to a fusion of multimodal information.
It will provide secured access over non-secured networks.
It will give solutions for access control (e.g. to tele-shopping, tele-banking or to buildings), surveillance, intrusion detection, and alarm verification.

A flexible software and hardware platform has been realised. From this platform, four demonstrators have been developed, evaluated and installed at end-user sites:
Level 1 : A network-based voice mail system with access control using robust speech recognition technology and rejection (one speech modality: speaker-dependent password recognition)
Level 2 : The same application reinforced by two additional modalities (text-dependent and text-independent voice verification)
1st Level 3 : Access control to buildings, realised using the Level 1 system complemented with a profile recogniser
2nd Level 3 : Access control to rooms, realised using the Level 1 system, the profile recogniser and a face recogniser
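The incremental structure of the four demonstrators can be sketched as a conjunction of modality checks, where each level adds modalities on top of the Level 1 password check and access is granted only if every check passes. The check names and the all-must-pass rule are stand-ins for the sketch, not the project's actual decision logic.

```python
# Sketch of the incremental demonstrator levels. Each level runs the
# Level 1 password check plus additional modality checks; access is
# granted only if every listed check passes. Check names are hypothetical.

LEVELS = {
    "level1": ["password"],
    "level2": ["password", "text_dependent_voice", "text_independent_voice"],
    "level3_buildings": ["password", "profile"],
    "level3_rooms": ["password", "profile", "face"],
}

def grant_access(level, results):
    """results maps each modality check name to True (passed) or False."""
    return all(results.get(check, False) for check in LEVELS[level])

# A caller who passes the password and profile checks but fails face
# recognition gets into the building but not the room.
results = {"password": True, "profile": True, "face": False}
```

Structuring the levels as data rather than code keeps the platform flexible: a new demonstrator is just a new entry in the table.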

Summary of trial
The goals have been reached, and the field tests are now being completed. Hardware problems on the Level 3 demonstrators have caused a one-month delay in their field tests, but first results are nevertheless already available. The demonstrators are installed in places where the systems are publicly reachable: the Level 1 and Level 2 demonstrators can be called from the PSTN, while the Level 3 demonstrators are currently undergoing field tests. The achievements of the period also include the innovative, state-of-the-art results obtained on the algorithmic side.
A synergy has been established with the VIDAS project (ACTS 057) and, within the MPEG-4 ACTS Concertation on Facial Feature Extraction and Tracking, potentially with other projects such as VANGUARD (ACTS 074) and projects from other EU programmes. The AVBPA conference (Audio- and Video-based Biometric Person Authentication) has been organised by the algorithmic consortium of M2VTS and will be attended by researchers from around the world. A special open day of AVBPA, the EFFACES Forum, will be held on 11 March '97 (one day before AVBPA) to allow EU projects working on facial feature extraction and related image-processing topics to present their main results and discuss future common issues and action plans.


Continuing studies
In parallel with the development of the pilot demonstrators, algorithmic developments have started in several fields.

The flexible platforms will be used to record a real-condition database for enhancement of the algorithms.
Dedicated hardware will be used to run the algorithms in real time in the final systems. Two final demonstrators will be realised: one stand-alone and one installed in a PC. Finally, an API layer is being specified for easy implementation of the various algorithms into applications. Application generation tools are also being developed to add flexibility to the prototyping of applications covering the wide range of end-user needs; commercial application generation tools will be investigated in parallel.
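The summary does not specify the API layer itself; as an illustration only, such a layer could expose enrolment and verification uniformly, so that applications are written against one abstraction regardless of the underlying algorithm. All class and method names below are hypothetical, and the toy implementation merely stands in for a real recogniser.

```python
# Hypothetical sketch of a common verification API: any algorithm
# (speech, face, profile) implements enroll() and verify(), and
# applications depend only on this interface. Names are assumptions.
from abc import ABC, abstractmethod

class Verifier(ABC):
    @abstractmethod
    def enroll(self, client_id, samples):
        """Build or update the model for a client from raw samples."""

    @abstractmethod
    def verify(self, client_id, sample):
        """Return a match score in [0, 1] for the claimed identity."""

class DummyPasswordVerifier(Verifier):
    """Toy stand-in: exact match against the enrolled passwords."""
    def __init__(self):
        self.models = {}

    def enroll(self, client_id, samples):
        self.models[client_id] = set(samples)

    def verify(self, client_id, sample):
        return 1.0 if sample in self.models.get(client_id, set()) else 0.0

v = DummyPasswordVerifier()
v.enroll("alice", ["open sesame"])
```

Because every algorithm returns a score through the same interface, a fusion module or an application generation tool can treat the modalities interchangeably.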

Expected impact
The results of the project will feed a broad range of applications in many sectors. In the telecommunications field in particular, the results should have a direct impact on network services, as security of information and access will become increasingly important (telephone fraud in the US has recently been estimated at several billion dollars).


					Gael Richard
Speech Processing Department
rue JP. Timbaud BP 26
78392 Bois d'arcy cedex
Tel:    +33 1 3460 7955
Fax:    +33 1 3460 8832
					(email removed)


List of participants

Matra Communication
IMT Neuchâtel
Unidad Tecnica Auxiliar de la Policia
University of Surrey
Aristotle University of Thessaloniki
Universidad Carlos III de Madrid