Pronunciation – Why Are There Three Pronunciations for the Plural ‘-s’?

phonetics, phonology, plural-forms, possessives, pronunciation

I know the pronunciation rules for the plural -s ending: after a voiced sound it is z, after an unvoiced one it is s, and after s, sh, or ch it is iz. In phonetic notation, these are /z/, /s/, and /ɪz/ respectively.

Some examples to illustrate what I said above:

  1. Word, hill, king, queen + S: wordz, hillz, kingz, queenz
  2. Book, drink, hat, cup + S: books, drinks, hats, cups
  3. Bus, dish, beach + ES: buses, dishes, beaches: here the "es" is pronounced /ɪz/

I was explaining it to someone and they asked me why there were three different pronunciations for the -s. I said it was "hard to pronounce".

It is hard to pronounce an S after a D. It may be easy for some speakers, but it is generally hard.

Is there any compelling reason why there are three different pronunciations for this?

Best Answer

TLDR

The short answer is that English has rules about which sequences of sounds are possible. If we used a single pronunciation for the -s ending in every situation, we would end up with ill-formed (and hard-to-pronounce) sequences of sounds, so we use three different pronunciations of the -s in order to conform to those rules. Those rules are called phonotactic rules.

  • The S is pronounced [z] when the preceding sound is voiced, because the sounds at the end of a syllable must agree in voicing, according to English phonotactics.
  • The S is pronounced [s] when the preceding sound is voiceless, for the same reason as above.
  • The S is pronounced [ɪz ~ əz] when the preceding sound is a 'sibilant' (a consonant that has a hissing effect) such as [s, z, ʃ, ʒ, t͡ʃ, d͡ʒ], because two sibilants can't occur next to each other in the same syllable. We therefore insert a vowel ([ɪ] or [ə]) between the two sibilants to break up that cluster.
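The three cases above can be sketched as a small lookup. This is only an illustration: the function name `s_ending` and the two sound sets are assumptions made for the example, and the voiceless set is a partial inventory rather than a complete description of English.

```python
# Each sound is passed as one IPA string, so that a multi-character
# symbol such as the affricate "t͡ʃ" counts as a single sound.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "t͡ʃ", "d͡ʒ"}
# Voiceless non-sibilant consonants (illustrative, not exhaustive).
VOICELESS = {"p", "t", "k", "f", "θ"}


def s_ending(final_sound):
    """Return the pronunciation of the -s ending after final_sound."""
    if final_sound in SIBILANTS:
        return "ɪz"  # a vowel separates the two sibilants
    if final_sound in VOICELESS:
        return "s"   # agrees in voicing with a voiceless sound
    return "z"       # any other sound: voiced consonants and vowels


# The final sounds of "hat", "king" and "dish".
for word, final in [("hat", "t"), ("king", "ŋ"), ("dish", "ʃ")]:
    print(word, "->", s_ending(final))
```

The sibilant check comes first because sibilants such as [s] are also voiceless; testing voicing first would wrongly give [s] after bus.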

Explanation

Phonotactics

Every language has a unique set of rules that determine the permissible sequences of sounds. That set of rules is called 'phonotactic rules' (or 'phonotactic constraints'). A Dictionary of Phonetics and Phonology by R. L. Trask defines phonotactics as 'the set of constraints on the possible sequences of consonant and vowel phonemes within a word, a morpheme or a syllable' (p. 277). Put simply, phonotactics describes the possible sequences of sounds and the positions where they can be found.

A sequence of sounds that is allowed in one language may be disfavoured in another. For instance, the Polish word wszczniesz is pronounced /fʂt͡ʂɲɛʂ/; the sequence of sounds at the beginning of this word is allowed in Polish but not in English.

English phonotactic constraints

There are loads of restrictions on syllable structure in Modern English, some of which are:

  • No [ŋ] in the onset
  • Obstruents in the coda must agree in voicing
  • Two sibilants in the same syllable cannot occur next to each other
  • No PLOSIVE + NASAL sequence in the same syllable
  • No /h/ in the coda
  • No tautosyllabic geminates [1]
  • No affricates [2] or /h/ in complex onsets

More constraints are listed on Wikipedia.
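As a rough illustration, four of the constraints listed above (voicing agreement, adjacent sibilants, geminates, and /h/ in the coda) can be checked mechanically for a syllable coda. The function `coda_violations` and the sound inventories are hypothetical names made up for this sketch, and the inventories are simplified.

```python
SIBILANTS = {"s", "z", "ʃ", "ʒ", "t͡ʃ", "d͡ʒ"}
VOICED_OBSTRUENTS = {"b", "d", "g", "v", "ð", "z", "ʒ", "d͡ʒ"}
VOICELESS_OBSTRUENTS = {"p", "t", "k", "f", "θ", "s", "ʃ", "t͡ʃ"}


def coda_violations(coda):
    """Return the constraints broken by a coda given as a list of sounds."""
    problems = []
    # Look at every pair of neighbouring sounds in the coda.
    for first, second in zip(coda, coda[1:]):
        if first == second:
            problems.append("geminate")
        if first in SIBILANTS and second in SIBILANTS:
            problems.append("adjacent sibilants")
        voicing_clash = (
            (first in VOICED_OBSTRUENTS and second in VOICELESS_OBSTRUENTS)
            or (first in VOICELESS_OBSTRUENTS and second in VOICED_OBSTRUENTS)
        )
        if voicing_clash:
            problems.append("voicing disagreement")
    if "h" in coda:
        problems.append("/h/ in coda")
    return problems


print(coda_violations(["t", "z"]))  # the coda of *batz
print(coda_violations(["g", "z"]))  # the coda of bags
```

Under these assumptions, the coda of *batz is flagged for voicing disagreement, while the coda of bags passes with an empty list.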

Voiceless sound + S

When the plural marker (or the third-person singular or possessive marker) is attached to a word that ends in a voiceless sound, it is pronounced not as [z] but as [s]. That is because [z] would violate the voicing-agreement rule mentioned above and would be hard to pronounce as well (try saying *batz). Changing the [z] to [s] makes the cluster easy to pronounce and doesn't change the meaning of the word, so we pronounce the -s as [s] after voiceless consonants.

Voiced sound + S

When a word ends in a voiced sound and we add [z], the two sounds agree in voicing and the combination is permissible. For example, bag + [z] → [bægz] because [g] is voiced.

Sibilants + S

  • [ʃ] or [ʒ] + S

When a word ends in the sibilant [ʃ], which is voiceless, and we add the [s], we get the cluster *[ʃs], which isn't permissible and is difficult to pronounce, so we insert a vowel between the two sibilants in order to split that illicit cluster. After inserting the vowel, we get [ʃɪs]. We already said that the -s is [z] after a voiced sound, and the vowel is voiced, so we change the [s] back to a [z] and get [ɪz]; therefore the word bushes is pronounced bush[ɪz]. When a word ends in [ʒ], we do the same as above.

  • [s] or [z] + S

[s] and [z] are sibilants, but I'm going to explain them separately. When a word ends in [s], which is a voiceless sound, we add the [s] form of the -s ending: bus + [s] → *[bʌss]. Here we have a geminated s, and as we read in the rules, tautosyllabic geminates aren't allowed. Therefore we insert an epenthetic vowel [ɪ ~ ə] to break the geminate: [bʌsɪs]. We then change the final [s] back to a [z] because the preceding sound is now voiced (vowels are always voiced): [ˈbʌsɪz].

The same goes for words that end in [z]. When a word ends in [z], we add the [z] form of the -s ending because [z] is voiced: rose + [z] → *[ɹəʊzz]. Here we have a geminated z, so we need to split that impermissible cluster; therefore, we insert a vowel: [ˈɹəʊzɪz].

  • Affricates + S

Affricates ([t͡ʃ] and [d͡ʒ]) are complex segments. The second element of both affricates is a sibilant, so we get sibilant + sibilant, which isn't allowed. Therefore we insert a vowel between the affricate and the [s] or [z] to break that cluster, and the ending is [z] after the vowel: beach + [s] → *[biːt͡ʃs] + [ɪ ~ ə] → [biːt͡ʃɪz]

The same holds true for possessives and the third-person singular present -s too.
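The step-by-step derivations in this section (attach the ending that agrees in voicing, insert a vowel after a sibilant, then voice the ending after the vowel) can be traced with a short sketch. The function `derive` and its sound sets are assumptions for illustration, and only [ɪ] is used as the inserted vowel.

```python
SIBILANTS = {"s", "z", "ʃ", "ʒ", "t͡ʃ", "d͡ʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "t͡ʃ"}


def derive(stem):
    """Return every intermediate form when -s is attached to stem.

    stem is a list of sounds; only the last form returned is well-formed.
    """
    final = stem[-1]
    # Step 1: pick the ending that agrees in voicing with the final sound.
    ending = "s" if final in VOICELESS else "z"
    forms = [stem + [ending]]
    if final in SIBILANTS:
        # Step 2: insert a vowel to split the sibilant cluster.
        forms.append(stem + ["ɪ", ending])
        if ending == "s":
            # Step 3: the ending now follows a vowel, so it becomes [z].
            forms.append(stem + ["ɪ", "z"])
    return ["".join(form) for form in forms]


print(derive(["b", "ʌ", "s"]))   # bus
print(derive(["ɹ", "əʊ", "z"]))  # rose
print(derive(["b", "æ", "g"]))   # bag
```

For bus this traces the three stages described above, for rose only two (the ending is already [z]), and for bag the first form is already well-formed.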


NOTES
  1. 'Tautosyllabic' means within the same syllable and 'geminate' is simply a long consonant (for example, the n in 'unnatural')
  2. Affricates are consonants that start off as a plosive (like /t/, /d/) and end as a fricative (such as /s/, /z/, etc.).

  • I've marked ill-formed and illicit sequences of sounds with a preceding asterisk (*).