The McGurk Effect

Rich Shipe, longtime reader and commenter on this blog, referred to this video in a chapel message that he gave, making the point that our perceptions often fall far short of the truth.   I had to show it to all of you.  (As you watch, try shutting your eyes and notice how the “fa’s” turn back to “ba’s.”)

The point seems to be that when what we see conflicts with what we hear, our brain goes with what we see.  Can you think of applications of this principle?

UPDATE:  Some of you, as is evident from the comments, are not getting this at all.  The speaker is not making the “b” sound when his teeth touch his lips.  That would be impossible.  The same “b” sound is being played throughout.  It’s just that when we see his teeth touch his lips, our brain makes us hear the “f” sound even though the sound being played is “b.”

[Some remedial linguistics:  The "b" sound is a "voiced bilabial plosive," referring to air popping out of the two lips with the vocal cords in your neck buzzing.  The "f" sound is an "unvoiced labial dental fricative," meaning it's made by the teeth ("dental") touching one lip ("labial"), with air coming out through the closure in a form of friction.  "Unvoiced" means the vocal cords are not engaged.  When they ARE engaged and buzzing--feel your Adam's apple, that's where your vocal cords are--you get the voiced labial dental fricative; that is a "v" sound, what tODD thinks he heard, which is legitimate, but it also confirms the effect.  This physiology is complicated, and yet apparently our brains pick up these distinctions visually!]

[If this video is not showing up on your browser, click "comments" and it should.]

About Gene Veith

Professor of Literature at Patrick Henry College, the Director of the Cranach Institute at Concordia Theological Seminary, a columnist for World Magazine and TableTalk, and the author of 18 books on different facets of Christianity & Culture.

  • http://enterthevein.wordpress.com J. Dean

    If the guy in the video accentuated the “f” sound when he did the “fa” motion with his mouth, it would be a little different.

    Then again, there’s something to be said for these sounds being put within a contextually relevant setting (a sentence, for example).

  • Dan Kempin

    This is fascinating. I’m not sure what it means, but it is fascinating. I was especially amazed to see the split screen and hear something different each time I changed my focus.

  • Jonathan

    Sorry, but it’s hard to believe that it’s “exactly the same” -ba- sound when forcing it through an unnatural formation.

    I agree that it’s more likely to associate an -fa- sound in that situation based on sight alone. But I bet there is a slight difference in sound produced that also contributes as a cue to the effect.

    It would be interesting to know if they have tested blind and gotten similar results, based on sound alone.

  • SKPeterson

    When I shut my eyes and he was doing the faux “fa”, I noticed that the “b” sound was softer, more like a Spanish “b/v” and that the regular “ba” had a harder “b”. It seems like the last “fa” did have the harder “b”, but I have to agree with Jonathan that I don’t think you can shape your mouth differently and produce exactly the same sound. I tried it myself to see – and I can see being able to make a hard “b” with lots of practice, but generally if I make the “b” sound using the “f” embouchure I get the soft “b”. There is a difference in the sound that the ear hears. But, if it is muted or not precise, the eye checks the context and says “fa” and not “ba”. I’m sure the McGurk effect is real, but I think it is truly only operative in those instances in which the context is fuzzy and multiple sensory inputs are required to make cognitive sense of a phenomenon.

  • steve

    On the one hand, this is a defense mechanism. If we had to consciously interpret every stimulus we’d have died out long ago. But our brains assume much of what’s going on around us–based on the context of our surroundings, our previous experiences, our preoccupations, and probably some innate programming–so that we can focus on what we really need. On the other hand, the fact that we assume much more about what we see and hear than what is actually being shown or said is a fact that’s been exploited by advertisers for decades. Probably for as long as the trade of advertising itself has been in existence.

  • http://www.facebook.com/mesamike Mike Westfall

    So… when someone is talking to you, you should close your eyes and rely on context to determine whether the person is saying “father” or “bother.”

    After all, the person you’re talking to might be trying to trick you into believing that he said something totally incomprehensible…

  • http://www.toddstadler.com/ tODD

    Jonathan (@3), I’m not sure you’ve understood what they did when you say:

    Sorry, but it’s hard to believe that it’s “exactly the same” -ba- sound when forcing it through an unnatural formation.

    The “ba” sound wasn’t being “forced through an unnatural formation”; they overdubbed a recording of a man saying “ba” onto a video of a man saying (or at least mouthing) “fa”. The audio was always literally a man saying “ba”.

  • http://www.toddstadler.com/ tODD

    Of course, I have reason to want to prove this assertion wrong, but I do feel like, if I focused really hard when I watched the man saying “fa”, I could hear him saying “va”, or even a Spanish “ba” — as opposed to the “fa” I initially believed him to be saying.

    I’m merely speculating, but I wonder if the issue with this particular effect (“ba”/“fa”) is that it’s easy to miss the fairly quiet sibilance with which an F sound usually begins. All the more so when you believe you’re listening to the audio of a man at a noisy shore carnival, where high-frequency noises are in abundance.

    The idea being that your mind sees a man saying “fa”, doesn’t quite hear that, but assumes that it must have simply misheard part, and so fills in the “missing” data for you. That’s what our minds do so very often — mostly to our benefit, but occasionally to our befuddlement. After all, if our brains didn’t fill in missing data for us, the movies (or, heck, flip-books) wouldn’t be nearly as fun.

  • Austin

    Here’s another example I saw a while back, from ‘NOVA scienceNOW’:

  • DonS

    My understanding, as well, was that the “ba” sound was dubbed, and you are literally hearing the exact same sound each time.

  • WisdomLover

    I don’t think you are literally hearing the same sound each time. You are manifestly hearing a different sound each time. The compression waveform that passes through the air to your ear is, of course, the same. What this shows is that sound is not a compression waveform that passes through the air. Instead, sound is a mental event that regularly occurs along with the waves. Yet another example of the perils of induction.

  • mendicus

    Wild. It really worked, even after I knew about it. I stared at the guy’s mouth and willed myself to hear “Bah,” but couldn’t. Then I closed my eyes and it was clear.

  • Dan Kempin

    Not that I have any expert knowledge, but I wonder if this is more an issue of how the brain processes language rather than noise. In other words, I’m not sure I would call it a perception deception, but a recognition that language is processed as more than sound. I had the wonderful experience of learning sign language, and I wonder if it is not the case that our brain begins to process the visual cues of communication before the sound arrives to confirm it. I suspect that subconsciously we are better lip readers and nonverbal communicators than we realize.

  • Thankful

    I’m new here :-)

    Leaving aside the mechanics of the phenomenon, the video leads me to consider how God chooses to relate to us in this Age…through the Word. There seems to be a theme in scripture that sets word above sight. I am thinking of ‘faith comes by hearing’, ‘Thou shalt not make unto thee any graven image’, ‘we walk by faith, not by sight’.

    I have heard that we see things in composite and our brain fills in what is there. It seems like God deals in words and Satan deals in images. I often ponder this.

    Blessings and I enjoy reading here…

  • Shane A

    Wow–that is fascinating. By itself the video means little (I suspect), but I think it hints at a larger point: human consciousness actively participates in forming, or imagining, the “outside” world. Perception itself is an act.

  • Pingback: The McGurk Effect: Is Seeing Believing? – Justin Taylor

  • Raymond

    Fascinating. The comments by SKPeterson seem to be right on target. Is the sound being made by the speaker, or by another source? What does this mean? Not quite sure but it is a dead ringer against using video screens for preaching! Sight overpowers sound.

  • Steven

    A couple of comments:

    1) Some of the other comments are correct that the acoustic sound waves for the ba and fa in this experiment are in fact always the same ba – you only hear it as fa (or va, or a “soft ba”) because of the visual cues.

    2) It isn’t simply a case of vision “taking over” our perception of what we hear. The visual and auditory cues work together. Another example of the McGurk effect has a video of someone saying “ga” (pronounced at the back of the palate), with an audio of the same person saying “ba” (pronounced with the lips at the front of the mouth). Usually, what you hear is “da” (pronounced in the middle/front of the palate) – neither the visual nor the audio cues take over, but the two are merged into an intermediate percept.

    3) It is therefore correct that what you hear is not determined by the incoming audio signal alone, but other factors (such as visual cues) can affect the percept. On the other hand, audio and visual usually agree in nature (the McGurk Effect has only been demonstrated in highly unnatural laboratory experiments where the audio and visual have been artificially dissociated), and as a general rule you *can* trust what you hear. And what you see.

  • Matthew

    I recall vaguely an instance some years ago (15-20?) when a Japanese student was shot fatally on Halloween because he was unintentionally on the property of someone who mistook him for a burglar. The shooter shouted “Freeze!” but the Japanese student didn’t stop, so he shot him. I remember reading an analysis at the time (but have not the expertise to confirm or dispute) that Japanese speakers have difficulty distinguishing “p” from “f” and “r” from “l”, leading to jokes that we’ve all heard. Possibly that student heard “Please!” instead of “Freeze!” and didn’t know how to respond. I wonder how a native speaker of Japanese would hear this video? I also wonder whether a similar effect can be demonstrated with other similar but related sounds? For instance, “t” and “d,” or among various vowels.

  • http://www.geneveith.com Gene Veith

    And, of course, there is a Wikipedia article about this:
    http://en.wikipedia.org/wiki/McGurk_effect

  • http://Thoughtsfrommyreformedself.com Cindy Stokes

    I was a linguistics major in college, so this was really fun to watch and ponder. Applications? What about those dressed like sheep that speak like wolves? This effect would explain a lot!

  • LB

    Fascinating! The human brain is an amazing creation…

  • Pingback: What we half perceive and half create | Cranach: The Blog of Veith

