Contrary to popular belief, English has more than five or ten vowel sounds.
The actual number is disputed because of disagreements over when two sounds are sufficiently distinct to be classified as separate sounds.
I ran across a podcast episode recently that mentioned a sentence that demonstrates a different English vowel sound in each word: Who would know naught of art must learn, act, and then take his ease.
The hosts noted that to get all the vowels in, you need to read the sentence with non-rhotic pronunciation, i.e. suppressing the r in art.
I’ll run this sentence through some software that returns the phonetic spelling of each word in IPA symbols to see the distinct vowel sounds that way.
First I’ll use Python, then Mathematica.

Python

Let’s run this through some Python code that converts English words to IPA notation so we can look at the vowels.
    import eng_to_ipa as ipa

    text = "Who would know naught of art must learn, act, and then take his ease."
    print(ipa.convert(text))

This gives us

    hu wʊd noʊ nɔt əv ɑrt məst lərn, ækt, ənd ðɛn teɪk hɪz iz

which includes the following vowel symbols, one per word:

    u ʊ oʊ ɔ ə ɑ ə ə æ ə ɛ eɪ ɪ i

This has some duplicates: the 5th, 7th, 8th, and 10th symbols are all schwas.
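As a rough check on the vowel list, here is a minimal Python sketch that pulls the first vowel symbol out of each transcribed word and reports which symbols repeat. It works on the output string directly rather than calling eng_to_ipa. The VOWELS list and the first_vowel helper are illustrative assumptions that cover only the symbols appearing in this sentence, not a general IPA parser.

```python
from collections import Counter

# IPA transcription copied from the eng_to_ipa output for the model sentence.
transcription = "hu wʊd noʊ nɔt əv ɑrt məst lərn, ækt, ənd ðɛn teɪk hɪz iz"

# Assumption: only these vowel symbols occur. Diphthongs come first so that
# "oʊ" and "eɪ" are matched whole rather than as a single character.
VOWELS = ["oʊ", "eɪ", "u", "ʊ", "ɔ", "ə", "ɑ", "æ", "ɛ", "ɪ", "i"]


def first_vowel(word):
    """Return the first vowel symbol found in word, or None if there is none."""
    for i in range(len(word)):
        for v in VOWELS:
            if word.startswith(v, i):
                return v
    return None


# One vowel per word, in sentence order.
vowels = [first_vowel(w) for w in transcription.split()]

# Map each repeated vowel to its 1-based positions in the list.
counts = Counter(vowels)
repeated = {
    v: [i + 1 for i, x in enumerate(vowels) if x == v]
    for v, c in counts.items()
    if c > 1
}

print(" ".join(vowels))
print("distinct symbols:", len(counts))
print("repeated:", repeated)
```

Taking only the first vowel per word is enough here because every word in the sentence has one syllable.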
By default, eng_to_ipa gives one way to write each word in IPA notation.
There is an optional argument, retrieve_all, that defaults to False; setting it to True may return additional alternatives.
However, in our example the only difference is that the second alternative writes and as ænd rather than ənd.
It looks like the eng_to_ipa module doesn’t transcribe vowels with sufficient resolution to distinguish some of the sounds in the model sentence.
For example, it doesn’t seem to distinguish the stressed sound ʌ from the unstressed ə.
Mathematica

Here’s Mathematica code to split the model sentence into words and show the IPA pronunciation of each word.
    text = "who would know naught of art must learn, act, and then take his ease"
    ipa[w_] := WordData[w, "PhoneticForm"]
    Map[ipa, TextWords[text]]

This returns

    {"hˈu", "wˈʊd", "nˈoʊ", "nˈɔt", "ˈʌv", "ˈɒrt", "mˈʌst", "lˈɝn", "ˈækt", "ˈænd", "ðˈɛn", "tˈeɪk", "hˈɪz", "ˈiz"}

By the way, I had to write the first word as “who” because WordData won’t lowercase it for me. If you ask for

    ipa["Who"]

Mathematica will return

    Missing["NotAvailable"]

though it works as expected if you send it “who” rather than “Who.”

Let’s remove the stress marks and join the words together so we can compare the Python and Mathematica output.
The top line is from Python and the bottom is from Mathematica.

    hu wʊd noʊ nɔt əv ɑrt məst lərn ækt ænd ðɛn teɪk hɪz iz
    hu wʊd noʊ nɔt ʌv ɒrt mʌst lɝn ækt ænd ðɛn teɪk hɪz iz

There are a few differences, summarized in the table below.
Since the symbols are a little difficult to tell apart, I’ve included their Unicode code points.

    |-------+------------+-------------|
    | Word  | Python     | Mathematica |
    |-------+------------+-------------|
    | of    | ə (U+0259) | ʌ (U+028C)  |
    | must  | ə (U+0259) | ʌ (U+028C)  |
    | art   | ɑ (U+0251) | ɒ (U+0252)  |
    | learn | ə (U+0259) | ɝ (U+025D)  |
    |-------+------------+-------------|

Mathematica makes some distinctions that Python missed, and uses more of the non-rhotic pronunciation that the model sentence is intended to use.