Typical Viewer Perception of Lip Sync Errors

[Figure: Typical viewer perception of lip sync errors. Horizontal axis: audio to video offset (ms), from -40 to +160 in 20 ms steps. Labeled regions: errors are not noticeable; errors are subconsciously disturbing; errors are annoying. The threshold of detectability is marked.]
Perception Of Lip Sync Errors
Subconscious Effects
Although individual viewers may have slightly
different thresholds for detecting lip sync errors,
the subjective acceptability of program material
has the general characteristics shown above.
The most disturbing lip sync errors are those we
do not "see". For most viewers, if the audio is
less than 40 ms early and less than 90 ms late,
the brain does not consciously register a lip sync
error. However, studies [1] have shown that errors
below this detectability threshold can have the
subconscious effect of making the program
material less believable, less honest and less
trustworthy. Therefore, when audio is early by
20-40 ms or late by 40-80 ms, the seemingly
"invisible" lip sync error can still be detrimental.
Our brains are at least two times more sensitive to
early audio than to late audio. The slow speed of
sound relative to the speed of light in our natural
world has conditioned us to expect audible events
to occur after the corresponding visible events.
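
The thresholds quoted above can be sketched as a small classifier. This is only an illustration, not part of any Pixel Instruments product. The function name, the region labels, and the sign convention (negative = audio early, positive = audio late) are assumptions. The text also leaves the 80-90 ms late range unassigned; the sketch assumes it belongs to the subconscious region, since conscious detection is said to begin at 90 ms.

```python
# Thresholds in milliseconds, taken from the figures quoted in the text.
# Assumed sign convention: negative = audio early, positive = audio late.
EARLY_DETECT_MS = -40        # audio this early (or earlier) is consciously noticed
LATE_DETECT_MS = 90          # audio this late (or later) is consciously noticed
EARLY_SUBCONSCIOUS_MS = -20  # start of the subconscious region for early audio
LATE_SUBCONSCIOUS_MS = 40    # start of the subconscious region for late audio


def classify_offset(offset_ms: float) -> str:
    """Map an audio-to-video offset to one of three perception regions."""
    if offset_ms <= EARLY_DETECT_MS or offset_ms >= LATE_DETECT_MS:
        return "annoying"
    if offset_ms <= EARLY_SUBCONSCIOUS_MS or offset_ms >= LATE_SUBCONSCIOUS_MS:
        # Assumption: 80-90 ms late is treated as subconscious as well.
        return "subconsciously disturbing"
    return "not noticeable"


if __name__ == "__main__":
    for offset in (-45, -30, 0, 30, 60, 120):
        print(f"{offset:+5d} ms: {classify_offset(offset)}")
```

Note how the asymmetry shows up in the constants: the tolerated window for early audio is roughly half as wide as the window for late audio.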
Unfortunately, even when the lip sync error is
large enough to be clearly visible (the red region),
our brains are unable to identify the size of the
error with any accuracy. Correcting the lip sync
becomes a subjective and time-consuming
trial-and-error process.
LipTracker® provides rapid objective results over
its measurement range without the limitation of a
threshold of detectability.
The Zero Tolerance Goal
Small errors in the green region may not be a
problem. But the cumulative effect of cascaded
small errors can quickly put you into the magenta
or red regions (subconscious or visibly annoying).
Therefore, audio to video offsets should be kept
as close to zero as possible at all stages of the
production and distribution paths.
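
The cumulative effect of cascaded offsets can be illustrated with a short sketch. The stage names and per-stage values below are hypothetical, chosen only to show how offsets that are individually small add up along a signal path. The sign convention (positive = audio late) and the helper names are likewise assumptions. The default limits are the 20 ms early and 40 ms late figures quoted earlier.

```python
from itertools import accumulate

# Hypothetical per-stage audio-to-video offsets in ms (illustrative values,
# not measurements). Assumed sign convention: positive = audio late.
stage_offsets_ms = [
    ("camera", 15),
    ("frame synchronizer", 20),
    ("encoder", 15),
    ("display", 20),
]


def running_offsets(stages):
    """Return (stage name, cumulative offset up to and including that stage)."""
    names = [name for name, _ in stages]
    totals = accumulate(offset for _, offset in stages)
    return list(zip(names, totals))


def first_stage_outside_green(stages, early_limit=-20, late_limit=40):
    """Return the first stage whose cumulative offset leaves the unnoticeable
    region, or None if the whole chain stays inside it."""
    for name, total in running_offsets(stages):
        if total <= early_limit or total >= late_limit:
            return name, total
    return None


if __name__ == "__main__":
    for name, total in running_offsets(stage_offsets_ms):
        print(f"after {name}: {total:+d} ms")
    print("first stage outside green region:",
          first_stage_outside_green(stage_offsets_ms))
```

In this made-up chain, no single stage contributes more than 20 ms, yet the running total reaches 50 ms at the third stage and 70 ms at the output, which is why each stage is worth measuring and correcting on its own.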
[1] Effects of Audio-Video Asynchrony on Viewer's
Memory, Evaluation of Content and Detection
Research Report Prepared for Pixel Instruments
Byron Reeves & David Voelker
Stanford University
October 1993