Community Research and Development Information Service - CORDIS

FP5

CUSTODIEV Report Summary

Project ID: IST-2001-37116
Funded under: FP5-IST
Country: United Kingdom

3D dynamic capture system

Non-contact real-time 3D capture is a technically demanding application. Currently the Glasgow system captures video from up to 24 cameras in real time and post-processes the video streams together to compute an all-round 3D model whose appearance and behaviour match those of the subject. For animation purposes it has been used to capture a face and head model, which is then marked up by a semi-automatic process: MPEG-4 reference points are placed on the model and tracked automatically for as long as possible before manual correction. The idea is to capture the behaviour of salient points on the face and head and use them to calculate MPEG-4 Facial Animation Parameters (FAPs) to drive an animated face and head model. A detailed reproduction of the surface appearance of the original face is not a requirement in this application, but a good 3D model is important for minimising the amount of manual intervention required. A significant reduction in mark-up time has been achieved over early efforts, to the point that this stage of post-processing is now becoming comparable in duration to the initial, wholly automatic stages of model construction and conformation. At the same time the set of tracked FAPs has been extended to include points such as the irises of the eyes, which are hard to obtain by any other approach to motion capture.
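The step from tracked points to FAP values can be illustrated with a minimal Python sketch. It is not the project's code. It assumes the usual MPEG-4 convention, as we understand it, that a translational FAP is a displacement from the neutral face expressed in Facial Animation Parameter Units (FAPUs), each FAPU being a neutral-face distance divided by 1024. The function names, the coordinate layout and the choice of axis and sign are hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Assumption: a FAPU is a neutral-face distance (e.g. mouth-nose
# separation) divided by 1024, following the MPEG-4 convention.
FAPU_DIVISOR = 1024.0


def fapu_from_neutral(neutral_distance: float) -> float:
    """Return one FAPU, given a distance measured on the neutral face."""
    if neutral_distance <= 0:
        raise ValueError("neutral distance must be positive")
    return neutral_distance / FAPU_DIVISOR


def displacement_to_fap(
    neutral_point: Sequence[float],
    tracked_point: Sequence[float],
    axis: int,
    unit: float,
) -> int:
    """Convert the displacement of one tracked point along one axis
    into an integer FAP value, in multiples of the given FAPU.

    The sign convention is left to the caller (hypothetical here):
    a positive result means movement in the positive axis direction.
    """
    displacement = tracked_point[axis] - neutral_point[axis]
    return round(displacement / unit)


if __name__ == "__main__":
    # Illustrative numbers only, in millimetres.
    mns_unit = fapu_from_neutral(25.6)   # mouth-nose separation
    neutral = (0.0, 0.0, 0.0)            # point on the neutral face
    tracked = (0.0, -2.5, 0.0)           # same point in a later frame
    print(displacement_to_fap(neutral, tracked, axis=1, unit=mns_unit))
```

Normalising by a distance measured on the subject's own neutral face is what lets values captured from one head drive a differently proportioned animated model.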

The Glasgow 3D (markerless) dynamic capture system was originally developed as an experimental rig three years ago, and its use in the project has focused on its evaluation as an emergent technology with potential for use in animation. The technology was developed specifically to facilitate the portrayal of historical figures in animation. Better mark-up tools have also improved the quality of the results obtained, and most MPEG-4 FAPs, notably pupil position, are measured and passed.

The capture functionality is based on the C3D stereo-vision system, using data captured by between 12 and 24 analogue cameras with real-time feeds to digitising frame-grabbers operating at 25 frames/sec. Attention has focused on head capture as the basis of evaluation. While human faces are easy to animate, it is difficult to retain a specific character, especially a known character, in the facial behaviour. In the application for which the system is ultimately intended, an actor will provide the character portrayal and the animator will have an opportunity to embellish it (although we foresee contractual issues arising here between actor and studio).
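To give a feel for the load this places on the data path, the following Python sketch works out the aggregate frame rate and raw data rate for a rig of this kind. The camera counts and the 25 frames/sec rate come from the description above. The frame resolution and pixel depth are not stated in the source, so the 768 x 576, 8-bit figures below are purely illustrative assumptions, and the function names are hypothetical.

```python
def aggregate_frame_rate(cameras: int, fps: int) -> int:
    """Total frames per second arriving from all cameras."""
    return cameras * fps


def raw_data_rate_bytes(
    cameras: int,
    fps: int,
    width: int,
    height: int,
    bytes_per_pixel: int,
) -> int:
    """Uncompressed bytes per second across the whole rig."""
    frame_bytes = width * height * bytes_per_pixel
    return aggregate_frame_rate(cameras, fps) * frame_bytes


if __name__ == "__main__":
    FPS = 25  # frame-grabber rate stated in the summary
    # Assumed frame format (NOT from the source): 768 x 576, 8-bit.
    WIDTH, HEIGHT, BYTES_PER_PIXEL = 768, 576, 1

    for cameras in (12, 24):
        frames = aggregate_frame_rate(cameras, FPS)
        rate = raw_data_rate_bytes(cameras, FPS, WIDTH, HEIGHT, BYTES_PER_PIXEL)
        print(f"{cameras} cameras: {frames} frames/s, {rate / 1e6:.1f} MB/s raw")
```

Whatever the actual frame format, the rate scales linearly with the number of cameras, so doubling the rig from 12 to 24 cameras doubles the volume to be stored and post-processed.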

Dynamic capture has its place for the specific kinds of facial animation we wanted, and can possibly lead on to new styles, but the data path will have to be streamlined beyond the limits of the present implementation. Some development will be necessary for a professional production context, but much of this has to do with the environment (e.g. proper provision for a director, not hitherto considered for a research-oriented experimental rig). The sequences we have captured have an aspect which an animator would not normally have considered and which a film actor would have been constrained to avoid.

Related information

Contact

Paul SIEBERT, (Senior Research Fellow)
Tel.: +44-141-3303124
Fax: +44-141-3304913