Synthesised talking face derived from speech for hearing disabled users of voice channels

Deliverables

A state-of-the-art phoneme recognition system with very low delay has been developed for the SYNFACE prototype. The conditions faced by the SYNFACE recogniser are adverse in several respects: it must be speaker independent, because the identity of the caller is not known in advance; task independent, because the conversation is not restricted to any particular domain; narrow band, because the conversation takes place over the telephone line; and low latency, because only a very short delay is allowed between the incoming speech and the lip movements of the avatar if the turn-taking mechanism of the telephone conversation is to be preserved.

The system is based on a hybrid of recurrent neural networks (RNNs) and hidden Markov models (HMMs). The RNNs are used as frame-by-frame estimators of the posterior probability of each speech sound given the acoustic evidence. These probabilities are then fed into the HMMs, which model the temporal evolution of speech. A Viterbi-like decoding scheme is employed to obtain the best phonetic sequence for a given speech segment. The recogniser could be used in many situations where very fast recognition is useful, for example in pronunciation training software. It currently exists in versions trained for English, Swedish and Flemish.
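The decoding step of such a hybrid can be illustrated with a brief sketch. The example below is not the SYNFACE implementation; the phoneme inventory, priors and transition probabilities are invented placeholders. It only shows the general hybrid idea: frame-wise posteriors from a network are converted to scaled likelihoods and combined with HMM transition probabilities by Viterbi decoding.

```python
# Minimal sketch of hybrid NN/HMM decoding (illustrative, not SYNFACE code).
# Frame-wise phoneme posteriors are turned into scaled likelihoods and
# combined with HMM transitions via Viterbi decoding in the log domain.
import numpy as np

PHONEMES = ["sil", "a", "b", "s"]          # hypothetical phoneme inventory
N = len(PHONEMES)

def viterbi_decode(posteriors, priors, log_trans):
    """posteriors: (T, N) frame-wise P(phoneme | acoustics) from the network."""
    # Scaled likelihoods: P(x|q) is proportional to P(q|x) / P(q).
    log_obs = np.log(posteriors + 1e-12) - np.log(priors)
    T = posteriors.shape[0]
    delta = np.full((T, N), -np.inf)       # best log score ending in each state
    back = np.zeros((T, N), dtype=int)     # backpointers
    delta[0] = log_obs[0]                  # assume uniform initial distribution
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (previous, current)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_obs[t]
    # Trace back the best phoneme-state path.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [PHONEMES[i] for i in reversed(path)]

# Toy example: strongly self-looping transitions, random stand-in posteriors.
trans = np.full((N, N), 0.05 / (N - 1))
np.fill_diagonal(trans, 0.95)
priors = np.full(N, 1.0 / N)
rng = np.random.default_rng(0)
post = rng.dirichlet(np.ones(N), size=20)  # stand-in for network outputs
print(viterbi_decode(post, priors, np.log(trans)))
```

Dividing the posteriors by the phoneme priors is the usual way to use a discriminatively trained network inside an HMM, since the HMM expects (scaled) likelihoods rather than posteriors; the self-looping transitions then smooth the frame-by-frame decisions into longer phoneme segments.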
The SYNFACE project has developed a prototype system for hard-of-hearing telephone users. It uses an artificial face to recreate the lip movements of the person at the other end of the telephone line, providing lip-reading support: the user hears the speech of the person at the other end and, synchronised with it, sees the lip movements recreated in an artificial face. SYNFACE consists of a phoneme (speech sound) recogniser and a visual speech synthesiser, that is, an artificial talking face. The phoneme recogniser identifies the speech sounds and the face synthesiser then recreates their articulation. The movements of the talking face are synchronised with the audio speech signal and shown on a screen attached to the SYNFACE user's telephone. Only the hard-of-hearing user needs to have the device installed. The SYNFACE prototypes have been evaluated by users in the UK, the Netherlands and Sweden. The results are very encouraging: a large majority of the hard-of-hearing people who have tried SYNFACE to date have found it helpful and effective. Further results will be published on the project homepage, www.speech.kth.se/synface.
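For illustration only, the sketch below shows one way such a pipeline could be wired together: timed phoneme labels from the recogniser are mapped to viseme (mouth-shape) targets for the talking face, delayed slightly so that lips and audio stay in sync. The viseme table, event format and delay value are hypothetical assumptions, not taken from the SYNFACE system.

```python
# Illustrative sketch (not the SYNFACE implementation) of the pipeline idea:
# recognised phonemes with timestamps become a delayed viseme schedule that
# a face renderer could play back in sync with the buffered audio.
from dataclasses import dataclass

PHONEME_TO_VISEME = {"sil": "rest", "a": "open", "b": "closed", "s": "narrow"}

@dataclass
class PhonemeEvent:
    phoneme: str
    start_ms: int   # onset relative to the incoming audio
    end_ms: int

def viseme_track(events, delay_ms=200):
    """Map phoneme events to a viseme schedule shifted by delay_ms.

    delay_ms models the small buffering needed so that the lip movements
    and the audio stay synchronised despite the recogniser's latency
    (the value here is purely illustrative).
    """
    return [(PHONEME_TO_VISEME.get(e.phoneme, "rest"),
             e.start_ms + delay_ms, e.end_ms + delay_ms)
            for e in events]

events = [PhonemeEvent("sil", 0, 80), PhonemeEvent("b", 80, 160),
          PhonemeEvent("a", 160, 320), PhonemeEvent("s", 320, 430)]
for viseme, start, end in viseme_track(events):
    print(f"{start:4d}-{end:4d} ms  viseme={viseme}")
```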
