Building bridges
The new method, proposed by Bear and Harvey in their recent paper, takes the best of both worlds. They have developed a learning algorithm that interprets visual signals in two stages. In the first stage, lip movements are interpreted as visemes. In the second stage, the visemes are 'translated' into phonemes, for instance by looking at context. The /p/, /b/, /m/ viseme, for example, is probably a /p/ when it is followed by 'rince', because together they form the word 'prince'.
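The second stage can be illustrated with a toy sketch. This is not Bear and Harvey's algorithm, which trains classifiers on video; it only shows the idea of using context to pick one phoneme out of a viseme class. The viseme label `V_pbm`, the small word list and the function `resolve_viseme` are all hypothetical, and matching on spelling rather than on phoneme sequences is a simplification.

```python
# Toy illustration of viseme-to-phoneme disambiguation by context.
# All names and data here are assumptions made for the example.

# One viseme class: sounds that look the same on the lips.
VISEME_TO_PHONEMES = {
    "V_pbm": ["p", "b", "m"],
}

# Hypothetical vocabulary used as the "context" knowledge.
LEXICON = {"prince", "rinse", "mince"}


def resolve_viseme(viseme, following_letters, lexicon=LEXICON):
    """Return the candidate sounds that complete a known word.

    Each sound in the viseme class is tried in front of the letters
    that follow; only those producing a word in the lexicon are kept.
    """
    candidates = VISEME_TO_PHONEMES[viseme]
    return [p for p in candidates if p + following_letters in lexicon]


if __name__ == "__main__":
    # 'prince' is in the lexicon, 'brince' and 'mrince' are not.
    print(resolve_viseme("V_pbm", "rince"))
```

With this word list, only 'p' survives for the ending 'rince', while the ending 'ince' would leave only 'm'. A real system would weigh many candidates probabilistically rather than rely on an exact dictionary match.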

Dr Bear said: “We are still learning the science of visual speech and what it is people need to know to create a fool-proof recognition model for lip-reading, but this classification system improves upon previous lip-reading methods by using a novel training method for the classifiers.”

Prof Harvey said: “Lip-reading is one of the most challenging problems in artificial intelligence so it’s great to make progress on one of the trickier aspects, which is how to train machines to recognise the appearance and shape of human lips.”