Audio-visual interaction (2)

Research topics include:

- automatic lip reading,

- speech-driven face animation,

- lip tracking,

- joint audio-video coding,

- bimodal person authentication.

Automatic lip reading systems are based on audio and visual speech recognizers. The visual speech recognizers use hidden Markov models (HMMs) or time-delay neural networks (TDNNs). They employ:

- features of binary mouth images, such as height, width, and perimeter, along with their derivatives, or

- active shape models, or

- the aforementioned geometric parameters combined with the wavelet transform of the mouth images.
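As an illustration of the first option, the sketch below computes height, width, and perimeter from a binary mouth image and appends their derivatives. It is a minimal sketch under stated assumptions, not the method of any particular system: the function names `mouth_features` and `with_derivatives` are hypothetical, the perimeter is approximated by counting pixel edges that border the background, and the derivatives are assumed to be first-order differences between consecutive video frames.

```python
from typing import List, Sequence, Tuple

# Binary mouth image: 1 = mouth pixel, 0 = background (assumed convention).
Mask = Sequence[Sequence[int]]


def mouth_features(mask: Mask) -> Tuple[float, float, float]:
    """Return (height, width, perimeter) of the mouth region, in pixels."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return (0.0, 0.0, 0.0)
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    # Height and width of the bounding box around all mouth pixels.
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    # Perimeter approximation: count pixel edges that touch the background
    # (or the image border) using 4-connectivity.
    filled = set(coords)
    perimeter = sum(
        1
        for r, c in coords
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
        if (r + dr, c + dc) not in filled
    )
    return (float(height), float(width), float(perimeter))


def with_derivatives(frames: Sequence[Mask]) -> List[Tuple[float, ...]]:
    """Append frame-to-frame differences to each frame's geometric features.

    The first frame has no predecessor, so its differences are set to zero.
    """
    feats = [mouth_features(m) for m in frames]
    out: List[Tuple[float, ...]] = []
    prev = feats[0] if feats else None
    for f in feats:
        delta = tuple(a - b for a, b in zip(f, prev))
        out.append(f + delta)
        prev = f
    return out


if __name__ == "__main__":
    closed_mouth = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
    open_mouth = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
    for vector in with_derivatives([closed_mouth, open_mouth]):
        print(vector)
```

Each resulting six-element vector (three geometric values plus three differences) is the kind of per-frame observation that could be passed to an HMM or TDNN recognizer.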
