Classification of visual speech features

Word=time sequence of visemes (mouth states) => temporal evolution of visual speech features is important for recognition:
- need for dynamic classifiers, to model transitions between mouth states = model temporal evolution of data:
- HMM (Hidden Markov Models): the most frequently used:
- TDNN (Time-Delayed Neural Networks): can perform recognition based on the temporal variation of visual speech features, not only static values.

Aristotle University of Thessaloniki

Word=time sequence of visemes (mouth states) => temporal evolution of visual speech features is important for recognition: