Why vowels sound different from each other












up vote
12
down vote

favorite
3












This might be a basic question but I am confused about how mouth shapes for vowels, at a deeper level, are producing different sounds. Wanted to see if one could demonstrate with another instrument like a pipe, how you could create the vowel sounds.



For example, IPA divides the sounds into front and back of the mouth/tongue blocking airflow, and open/closed-ness of the air passage by the tongue. This produces the ee /i/ sound, the oo sound, the ah sound, the oh sound, and everything in between. But I'm wondering how it actually works. How the shape of the mouth actually produces these sounds. For example, I don't know if we can just hear a recording of an ee in the middle of its pronunciation, and tell it's an ee sound. Maybe we can only tell they are the vowel sounds because of their relative sound in relation to each other, I don't know.



To further demonstrate my confusion, take for example a hose. If you squeeze the hose, the water comes out faster. That makes sense, because there is less space for the water to come out of. There is a complete explanation there. Likewise, when an ambulance passes by and you hear the Doppler effect, that is because the sound waves are compressed as they come toward you, and expanded as they move away. That gives a complete explanation of why the sound changes as the ambulance passes by.



But a vowel sound like /i/ or /a/ doesn't make sense just by saying "the shape of the mouth is x". Why does the shape of the mouth produce that sound, and what is the sound it is producing? I'm wondering if it is because the mouth cavity is curved, or because the sound wave is a complex shape (it would be nice to see what the vowel sound waves look like).



If I try to describe the "ah" sound, and say it's because the "mouth is shaped like x, and the tongue is placed here...", that doesn't explain why that is producing the "ah" sound vs. the "ee" sound. Something that would fully explain it is saying "the open cavity in the mouth produces a sound wave that is shaped in what we hear as an 'ah' sound", or "the 'ee' is two sine waves in this formation", that sort of thing. Also, for the "oo" sound, I don't see how "rounding the lips" produces that sound. I'm wondering what it does to the sound waves to give it that "O" sound.



I'm wondering if one can describe how the basic vowel sounds get their distinct "sound", or what that sound even is.



As a reference, I understand how tone (musically) can be expressed, because it's defined by the sound waves. Same with loudness / amplitude. But vowel sounds, I don't fully get yet.





























  • 1




    Think of vowels as chords. You've probably encountered "F₁" and "F₂" in your reading. Each of these is a fundamental tone, and every vowel is formed from two characteristic F's. How are they formed? The configuration of the tongue and lips and the angle of the jaw break the mouth into two resonating chambers, each of which produces one. Find a map of tone values over the vowel triangle to see what I mean.
    – jlawler
    2 days ago






  • 3




    You're on the right track. You know about pitch ("tone") and amplitude and duration. What's left? Timbre. Why do a guitar, a bass and a ukulele all sound different even if they're played at the same pitch, loudness and duration? Basically, because they have different shapes. Each shape dampens or reinforces different components of the complex sound. If the mouth shape of the /a/ were the big resonant box of a guitar, that of /i/ would be that of a bright ukulele. You want to read up on the acoustics of timbre, resonance, and formants. This book may help.
    – boiko
    2 days ago











  • I'm not saying that you don't understand this, but your terminology is inaccurate and requires explanation: Technically, the vowels you mention are all pulmonic, which means that the airflow comes from the lungs, and more specifically from the expulsion of air. From there, it is the vocal cords which actually produce the sound. The apparatuses in and around the mouth act to change the shape of the resonating chamber, not to produce the sound.
    – can-ned_food
    yesterday


















































phonetics vowels














edited 2 days ago









Dylan











asked 2 days ago









Lance Pollard



















3 Answers

















up vote
22
down vote



accepted










Good question! This comes down to formants.



Any periodic sound (from a violin, a trumpet, a guitar, or a human voice, among many many others) can be written as the sum of a whole bunch of sine waves at different "pitches" (frequencies). The mathematical details are too complicated to include here, but if you're interested, look into the harmonic series, and the Fourier transform.
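To make the "sum of sine waves" idea concrete, here is a minimal sketch in plain Python. The test signal (a 100 Hz tone plus two weaker components), the sample rate, and the function name `dft_amplitudes` are all made up for illustration; the slow textbook discrete Fourier transform is used only to keep the example self-contained.

```python
import cmath
import math

def dft_amplitudes(samples):
    """Amplitude of each frequency bin, via a naive O(n^2) discrete Fourier transform."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        amplitudes.append(2 * abs(total) / n)  # scaled so a sine of amplitude A reads as A
    return amplitudes

SAMPLE_RATE = 800  # samples per second; with one second of signal, bin k is k Hz

# Hypothetical periodic sound: a 100 Hz fundamental plus two weaker harmonics.
signal = [
    1.0 * math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 300 * t / SAMPLE_RATE)
    for t in range(SAMPLE_RATE)
]

amplitudes = dft_amplitudes(signal)
# Keep only the bins that carry a noticeable amount of energy.
components = [(hz, round(a, 3)) for hz, a in enumerate(amplitudes) if a > 0.01]
print(components)
```

The transform only sees the summed waveform, yet it recovers the three sine components and their amplitudes. For real recordings you would normally reach for an FFT library rather than this loop.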



The lowest of these pitches is called F0, the "fundamental frequency", and that's what people normally mean when they talk about the pitch of a sound. But the other pitches above it, called "harmonics", are what make a violin sound different from a trumpet, even when they're playing the same fundamental pitch (the same note). The human vocal cords produce harmonics a bit like a violin's: a sawtooth wave. And while we can change the fundamental frequency, we don't really have enough control of the vocal cords to change the harmonics.



Fortunately, there's more to our vocal tract than just the vocal cords!



As you know, pronouncing a vowel basically involves constricting the mouth and throat at various places. And where exactly the constriction happens affects these harmonics, applying a "filter" to the sound from our vocal cords.
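As a rough sketch of what "a sawtooth wave" means in terms of harmonics: an ideal sawtooth contains every integer multiple of F0, with the nth harmonic at 1/n of the fundamental's amplitude. The function below is hypothetical and only tabulates that idealization; a real voice source is only approximately sawtooth-like.

```python
def sawtooth_harmonics(f0, count):
    """(frequency in Hz, relative amplitude) of the first `count` harmonics of an ideal sawtooth."""
    return [(n * f0, 1.0 / n) for n in range(1, count + 1)]

# Changing F0 moves every harmonic, but the relative amplitudes stay the same.
low_voice = sawtooth_harmonics(100, 5)
high_voice = sawtooth_harmonics(200, 5)
print(low_voice)
print(high_voice)
```

This is the sense in which pitch can change while the harmonic "recipe" of the source stays fixed.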



In particular, the constrictions create "formants": groups of harmonics that are louder than others. And using the Fourier transform, we can plot exactly where these formants lie. It turns out that F1, the first formant, corresponds quite elegantly to vowel height (how high the tongue is in the mouth), F2 to vowel backness (where exactly the tongue is closest to the roof of the mouth), and F3 a bit less elegantly to rounding.
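Here is a small sketch of the source-and-filter idea, under loud assumptions: the source is an idealized sawtooth (harmonic n has amplitude 1/n), each formant is modeled as a simple bell-shaped gain curve invented for this example, and the formant values 300 Hz and 2300 Hz are only ballpark figures for an "ee"-like vowel. It shows formants turning up as the harmonics that are louder than their neighbors.

```python
def resonance_gain(freq, center, bandwidth=100.0):
    """Bell-shaped gain around `center` Hz (an illustrative curve, not a vocal tract model)."""
    return 1.0 / (1.0 + ((freq - center) / (bandwidth / 2.0)) ** 2)

def filtered_harmonics(f0, formants, top=4000):
    """Harmonics of a sawtooth-like source after the formant 'filter' is applied."""
    harmonics = []
    n = 1
    while n * f0 <= top:
        freq = n * f0
        source = 1.0 / n  # idealized source amplitude
        gain = sum(resonance_gain(freq, center) for center in formants)
        harmonics.append((freq, source * gain))
        n += 1
    return harmonics

def spectral_peaks(harmonics):
    """Frequencies of harmonics that are louder than both of their neighbors."""
    peaks = []
    for i in range(1, len(harmonics) - 1):
        if harmonics[i][1] > harmonics[i - 1][1] and harmonics[i][1] > harmonics[i + 1][1]:
            peaks.append(harmonics[i][0])
    return peaks

ee_like = filtered_harmonics(100, formants=(300, 2300))
print(spectral_peaks(ee_like))
```

The source itself only slopes downward; the peaks come entirely from the filter, which is the point of the paragraph above.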



diagram of formants



This is a diagram from some of my own research: the blue line shows the amplitude of each harmonic, the red line is the "envelope" (a way of describing the filter effect), and the peaks are the formants.



As a matter of fact, you can even create recognizable vowels with a plastic tube. If you use a more flexible tube, you can even change the vowel while "playing" it, by squeezing in appropriate places.
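For the unsqueezed tube, a standard idealization gives the resonances directly: a uniform tube closed at one end and open at the other resonates at odd multiples of c / 4L. The sketch below assumes a 17.5 cm tube and a speed of sound of 350 m/s (both round, assumed values), and it does not model what happens when you squeeze the tube.

```python
def tube_resonances(length_m, speed_of_sound=350.0, count=3):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m) for n in range(1, count + 1)]

# A 17.5 cm tube, roughly the length often quoted for an adult vocal tract.
print([round(f) for f in tube_resonances(0.175)])
```

Under these assumptions the resonances land near 500, 1500 and 2500 Hz, which is about what a neutral, schwa-like vowel looks like. Squeezing the tube breaks the uniformity and shifts those resonances, which is how the vowel changes.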



































    up vote
    7
    down vote













    Just wanted to add this diagram which shows the subjective vowel sounds as they correspond to the combination of F1 and F2 formants in a two-dimensional chart. The chart is from this page of the National Center for Voice and Speech's website.



    chart of vowel sounds by F1 and F2 formant frequencies



    F1 is typically generated at the front of the mouth and F2 typically in the throat or back of the mouth, both controlled mostly by placement of the tongue. These vowel sounds can be further shaped by rounding of the lips (F3), though that isn't as distinct in English as it is in some other languages.



    The formant frequencies, as they are called, basically act as filters on the sound set up by the vocal cords at the fundamental frequency. The fundamental frequency F0 does not carry linguistic signals and can be varied, for example when singing, or when adding intonation (considered a metalinguistic signal).



    You can, in fact, simulate vowel sounds on a synthesizer by filtering a base oscillator using two band-pass filters set at the F1 and F2 formant frequencies.
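A minimal sketch of that synthesizer idea in plain Python, with several assumptions: the base oscillator is a naive sawtooth, the two band-pass filters are standard second-order (biquad) filters run in parallel and summed, and 700 Hz and 1100 Hz are ballpark F1 and F2 values for an "ah"-like vowel. The sample rate, bandwidth, and function names are invented for the example.

```python
import math

FS = 16000  # sample rate in Hz (assumed)

def sawtooth(f0, seconds, fs=FS):
    """Naive sawtooth oscillator in the range [-1, 1): the 'base oscillator'."""
    return [2.0 * ((f0 * i / fs) % 1.0) - 1.0 for i in range(int(seconds * fs))]

def band_pass(samples, center, bandwidth, fs=FS):
    """Second-order band-pass filter with unity gain at `center` Hz."""
    w0 = 2.0 * math.pi * center / fs
    alpha = math.sin(w0) / (2.0 * center / bandwidth)  # Q = center / bandwidth
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def vowel(f0, f1, f2, seconds=0.5, bandwidth=100.0):
    """Filter one sawtooth through two band-pass filters and sum the results."""
    source = sawtooth(f0, seconds)
    low = band_pass(source, f1, bandwidth)
    high = band_pass(source, f2, bandwidth)
    return [a + b for a, b in zip(low, high)]

def harmonic_amplitude(samples, freq, fs=FS):
    """Amplitude of the `freq` Hz component (assumes a whole number of periods)."""
    re = sum(s * math.cos(2.0 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)

source = sawtooth(100, 0.5)
ah_like = vowel(100, f1=700, f2=1100)
steady = ah_like[len(ah_like) // 2:]  # skip the filters' start-up transient

print(harmonic_amplitude(source, 400) > harmonic_amplitude(source, 700))  # source slopes down
print(harmonic_amplitude(steady, 700) > harmonic_amplitude(steady, 400))  # filter lifts F1
```

In the raw oscillator the 400 Hz harmonic is louder than the 700 Hz one; after filtering, the harmonic sitting on F1 wins. Changing `f0` changes the pitch without moving the formants, which matches the point about F0 above.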



















    • 3




      That's an excellent diagram, but it might be clearer if rotated and flipped to match the IPA vowel chart.
      – Draconis
      2 days ago










    • @Draconis There are too many to choose from! But the IPA/vowel front-back-high-low triangle is easily seen in most of them. In fact, it's pretty convenient that it is so straightforward: larger F1 freq -> lower tongue, higher F2 freq -> closer to front.
      – Mitch
      yesterday

















    up vote
    6
    down vote













    I think what you are asking is "what exactly is a vowel sound?" (and eventually you'd like to know how to compute that answer from what the lips and tongue are doing). Here's a short answer, aimed at explaining how to design a kind of vowel synthesizer. Voiced sounds start with a sound source created by vocal fold vibrations, let's say at 100 Hz. This gives you a complex wave with components at 100 Hz, 200 Hz, 300 Hz, etc., and you specify the amplitude for each component. Overall, the amplitude slopes downward as frequency increases, so that there is basically nothing at frequencies above 7000 Hz. There are "resonances", i.e. frequency bands where the amplitude is higher: in a band of about 6 harmonics centered around 400 Hz the amplitude will be higher, then it falls off, and then around 2000 Hz there is another increase in amplitude, and again at 2400 and 3500 Hz. These are the formants (F1, F2, F3, F4). By fiddling with the frequencies where these amplitude peaks occur, you can make different-sounding vowel noises. You can also make some strange noises that don't sound like possible human vowels.



    The idea here is that you compute sine waves at integer multiples of the fundamental frequency, and scale each sine function by a coefficient representing an appropriate amplitude; then sum up all of the sine waves. (Then convert to a WAV file and play it.) Use Praat to view waveforms and get spectral analysis. There are fancier ways of synthesizing vowels, but I believe that simply "adding the components" is the best way to get your head around the conceptual nature of complex waves. And use Johnson's book.
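The recipe above can be sketched in plain Python as follows. The 100 Hz fundamental, the 7000 Hz cutoff, and the formant frequencies come from the answer; everything else (the sample rate, the shape and size of the amplitude bumps, the function names) is a hypothetical choice made just to get something that runs.

```python
import io
import math
import struct
import wave

FS = 16000  # sample rate in Hz (assumed)

def harmonic_weight(freq, f0, formants, width=150.0, boost=8.0):
    """Downward-sloping amplitude with a bump at each formant (hypothetical shape)."""
    slope = f0 / freq
    bumps = sum(math.exp(-0.5 * ((freq - c) / width) ** 2) for c in formants)
    return slope * (1.0 + boost * bumps)

def synthesize_vowel(f0=100.0, formants=(400, 2000, 2400, 3500), seconds=1.0, top=7000.0):
    """Sum sine waves at integer multiples of f0, each scaled by its amplitude."""
    count = int(top // f0)  # nothing above `top` Hz
    weights = [harmonic_weight(n * f0, f0, formants) for n in range(1, count + 1)]
    samples = []
    for i in range(int(seconds * FS)):
        t = i / FS
        samples.append(sum(w * math.sin(2.0 * math.pi * n * f0 * t)
                           for n, w in enumerate(weights, start=1)))
    peak = max(abs(s) for s in samples)
    return [0.9 * s / peak for s in samples]  # normalize to avoid clipping

def to_wav_bytes(samples):
    """Encode samples in [-1, 1] as a 16-bit mono WAV file, returned as bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(FS)
        ints = [int(s * 32767) for s in samples]
        out.writeframes(struct.pack("<%dh" % len(ints), *ints))
    return buffer.getvalue()
```

To listen to it, write `to_wav_bytes(synthesize_vowel())` to a file opened in binary mode and play that file, or open it in Praat to look at the waveform and spectrum. Moving the values in `formants` around is the "fiddling" described above.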




























    • Good answer! Mind adding the title of Johnson's book and/or an Amazon link?
      – Draconis
      2 days ago










    Your Answer







    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "312"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2flinguistics.stackexchange.com%2fquestions%2f28962%2fwhy-vowels-sound-different-from-each-other%23new-answer', 'question_page');

    );

    Post as a guest






























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    22
    down vote



    accepted










    Good question! This comes down to formants.



    Any periodic sound (from a violin, a trumpet, a guitar, or a human voice, among many many others) can be written as the sum of a whole bunch of sine waves at different "pitches" (frequencies). The mathematical details are too complicated to include here, but if you're interested, look into the harmonic series, and the Fourier transform.



    The lowest of these pitches is called F0, the "fundamental frequency", and that's what people normally mean when they talk about the pitch of a sound. But the other pitches above it, called "harmonics", are what make a violin sound different from a trumpet, even when they're playing the same fundamental pitch (the same note). The human vocal chords produce harmonics a bit like a violin's: a saw tooth wave. And while we can change the fundamental frequency, we don't really have enough control of the vocal chords to change the harmonics.



    Fortunately, there's more to our vocal tract than just the vocal chords!



    As you know, pronouncing a vowel basically involves constricting the mouth and throat at various different places. And where exactly the constriction happens affects these harmonics, applying a "filter" to the sound from our vocal chords.



    In particular, the constrictions create "formants": groups of harmonics that are louder than others. And using the Fourier transform, we can plot exactly where these formants lie. It turns out that F1, the first formant, corresponds quite elegantly to vowel height (how high the tongue is in the mouth), F2 to vowel backness (where exactly the tongue is closest to the roof of the mouth), and F3 a bit less elegantly to rounding.



    diagram of formants



    This is a diagram from some of my own research: the blue line shows the amplitude of each harmonic, the red line is the "envelope" (a way of describing the filter effect), and the peaks are the formants.



    As a matter of fact, you can even create recognizable vowels with a plastic tube. If you use a more flexible tube, you can even change the vowel while "playing" it, by squeezing in appropriate places.






    share|improve this answer


























      up vote
      22
      down vote



      accepted










      Good question! This comes down to formants.



      Any periodic sound (from a violin, a trumpet, a guitar, or a human voice, among many many others) can be written as the sum of a whole bunch of sine waves at different "pitches" (frequencies). The mathematical details are too complicated to include here, but if you're interested, look into the harmonic series, and the Fourier transform.



      The lowest of these pitches is called F0, the "fundamental frequency", and that's what people normally mean when they talk about the pitch of a sound. But the other pitches above it, called "harmonics", are what make a violin sound different from a trumpet, even when they're playing the same fundamental pitch (the same note). The human vocal chords produce harmonics a bit like a violin's: a saw tooth wave. And while we can change the fundamental frequency, we don't really have enough control of the vocal chords to change the harmonics.



      Fortunately, there's more to our vocal tract than just the vocal chords!



      As you know, pronouncing a vowel basically involves constricting the mouth and throat at various different places. And where exactly the constriction happens affects these harmonics, applying a "filter" to the sound from our vocal chords.



      In particular, the constrictions create "formants": groups of harmonics that are louder than others. And using the Fourier transform, we can plot exactly where these formants lie. It turns out that F1, the first formant, corresponds quite elegantly to vowel height (how high the tongue is in the mouth), F2 to vowel backness (where exactly the tongue is closest to the roof of the mouth), and F3 a bit less elegantly to rounding.



      [Figure: diagram of formants]



      This is a diagram from some of my own research: the blue line shows the amplitude of each harmonic, the red line is the "envelope" (a way of describing the filter effect), and the peaks are the formants.



      As a matter of fact, you can even create recognizable vowels with a plastic tube. If you use a more flexible tube, you can even change the vowel while "playing" it, by squeezing in appropriate places.
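A back-of-the-envelope way to see why a tube has vowel-like resonances: an idealized uniform tube that is closed at one end and open at the other resonates at odd multiples of a quarter wavelength. The sketch below assumes that idealization, a speed of sound of 343 m/s, and example tube lengths of 17 cm and 12 cm. None of these numbers come from the answer itself.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature (assumed)

def tube_resonances(length_m, count=3):
    """Resonances of an ideal tube closed at one end: f_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4.0 * length_m)
            for n in range(1, count + 1)]

if __name__ == "__main__":
    for length in (0.17, 0.12):  # hypothetical tube lengths in metres
        print(length, "m:", [round(f) for f in tube_resonances(length)])
```

For the 17 cm tube this gives resonances near 500, 1500 and 2500 Hz. Squeezing a flexible tube breaks the uniformity and shifts those resonances, which is what changes the perceived vowel.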














      answered 2 days ago by Draconis; edited yesterday by jknappen

































            Just wanted to add this diagram which shows the subjective vowel sounds as they correspond to the combination of F1 and F2 formants in a two-dimensional chart. The chart is from this page of the National Center for Voice and Speech's website.



            [Figure: chart of vowel sounds plotted by F1 and F2]



            F1 is typically generated at the front of the mouth and F2 typically in the throat or back of the mouth, both controlled mostly by placement of the tongue. These vowel sounds can be further shaped by rounding of the lips (F3), though that isn't as distinct in English as it is in some other languages.



            The formants, as they are called, basically act as filters on the sound set up by the vocal cords (the fundamental frequency and its harmonics). The fundamental frequency F0 does not determine which vowel is heard and can be varied, for example when singing, or when adding intonation (considered a metalinguistic signal).



            You can, in fact, simulate vowel sounds on a synthesizer by filtering a base oscillator using two band-pass filters set at the F1 and F2 formant frequencies.
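Here is a minimal sketch of that synthesizer idea in plain Python, assuming a naive sawtooth as the base oscillator and a standard second-order (biquad) band-pass design for each filter. The sample rate, Q value and function names are my own choices for illustration, not anything prescribed above.

```python
import math
import struct
import wave

SAMPLE_RATE = 16000  # Hz (assumed)

def sawtooth(f0, n_samples):
    """Naive sawtooth oscillator in [-1, 1); stands in for the vocal-cord source."""
    return [2.0 * ((f0 * n / SAMPLE_RATE) % 1.0) - 1.0 for n in range(n_samples)]

def band_pass(signal, centre, q):
    """Second-order band-pass filter (biquad) with unity gain at the centre frequency."""
    w0 = 2.0 * math.pi * centre / SAMPLE_RATE
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the x[n-1] coefficient is zero for this design
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def vowel(f1, f2, f0=110.0, seconds=0.5, q=8.0):
    """Mix two band-passed copies of the source, one per formant, peak-normalised."""
    source = sawtooth(f0, int(SAMPLE_RATE * seconds))
    mixed = [a + b for a, b in zip(band_pass(source, f1, q), band_pass(source, f2, q))]
    peak = max(abs(s) for s in mixed) or 1.0
    return [s / peak for s in mixed]

def write_wav(path, samples):
    """Write mono 16-bit PCM so the result can be played back."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(b"".join(struct.pack("<h", int(32767 * s)) for s in samples))
```

Something like `write_wav("ee.wav", vowel(300.0, 2300.0))` and `write_wav("ah.wav", vowel(750.0, 1200.0))` should give two clearly different vowel-like buzzes. Those formant pairs are rough illustrative values, so expect something recognizable rather than natural.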














            answered 2 days ago by Octopus; edited yesterday by kubanczyk







            • That's an excellent diagram, but it might be clearer if rotated and flipped to match the IPA vowel chart.
              – Draconis
              2 days ago










            • @Draconis There are too many to choose from! But the IPA vowel triangle (front/back, high/low) is easily seen in most of them. In fact, it's pretty convenient that it is so straightforward: larger F1 freq -> lower tongue, higher F2 freq -> closer to front.
              – Mitch
              yesterday



































            I think what you are asking about is "what exactly is a vowel sound?" (and eventually you'd like to know how to compute that answer from knowing what the lips and tongue are doing). Here's a short answer, aimed at explaining how to design a kind of vowel synthesizer. Voiced sounds start with a sound source created by vocal fold vibrations, let's say at 100 Hz. This gives you a complex wave with components at 100 Hz, 200 Hz, 300 Hz, etc., and you specify the amplitude for each component. Overall, the amplitude slopes downward as frequency increases, so that there is basically nothing at frequencies above 7000 Hz. There are "resonances", i.e. frequency bands where the amplitude is higher: in a band of about 6 harmonics centered around 400 Hz the amplitude will be higher, then it falls off, and then around 2000 Hz there is another increase in amplitude, and again at 2400 and 3500 Hz. These are the formants (F1, F2, F3, F4). By fiddling with the frequencies where these amplitude peaks occur, you can make different-sounding vowel noises. You can also make some strange noises that don't sound like possible human vowels.



            The idea here is that you compute sine waves at integer multiples of the fundamental frequency, scale each sine function by a coefficient representing an appropriate amplitude, and then sum up all of the sine waves. (Then convert the result to a WAV file and play it.) Use Praat to view waveforms and get spectral analysis. There are fancier ways of synthesizing vowels, but I believe that simply "adding the components" is the best way to get your head around the conceptual nature of complex waves. And use Johnson's book.
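The recipe above can be sketched in a few lines of plain Python. The 100 Hz fundamental, the 7000 Hz cutoff and the four peak positions are the numbers from the text. The 1/k slope, the Gaussian bump shape, the 150 Hz bandwidth, the sample rate and the function names are assumptions made only for this example.

```python
import math
import struct
import wave

SAMPLE_RATE = 16000                         # Hz (assumed)
F0 = 100.0                                  # fundamental, as in the example above
FORMANTS = [400.0, 2000.0, 2400.0, 3500.0]  # amplitude peaks quoted above
BANDWIDTH = 150.0                           # Hz, assumed width of each peak
CUTOFF = 7000.0                             # basically nothing above this frequency

def harmonic_amplitude(freq):
    """Downward-sloping spectrum with a bump around each formant (assumed shape)."""
    if freq > CUTOFF:
        return 0.0
    slope = F0 / freq  # 1/k roll-off for harmonic k
    bumps = sum(math.exp(-((freq - f) / BANDWIDTH) ** 2) for f in FORMANTS)
    return slope * (0.1 + bumps)

def synthesize(seconds=0.5):
    """Sum sine waves at integer multiples of F0, then normalise to a peak of 1."""
    n_harmonics = int(CUTOFF // F0)
    amps = [harmonic_amplitude(k * F0) for k in range(1, n_harmonics + 1)]
    samples = []
    for n in range(int(SAMPLE_RATE * seconds)):
        t = n / SAMPLE_RATE
        samples.append(sum(a * math.sin(2.0 * math.pi * k * F0 * t)
                           for k, a in enumerate(amps, start=1)))
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]

def write_wav(path, samples):
    """Write mono 16-bit PCM so the result can be played or opened in Praat."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(b"".join(struct.pack("<h", int(32767 * s)) for s in samples))
```

Calling `write_wav("vowel.wav", synthesize())` produces a file you can listen to. Changing the entries in `FORMANTS` and regenerating is the "fiddling with the frequency where these amplitude peaks occur" step.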














            answered 2 days ago by user6726; edited 2 days ago











            • Good answer! Mind adding the title of Johnson's book and/or an Amazon link?
              – Draconis
              2 days ago















