John Wells’s phonetic blog: December 2010

Friday 31 December 2010

Evolving English

Yesterday I finally got round to visiting the Evolving English exhibition at the British Library. (The website also has an associated blog.)

The exhibition is excellent, well-organized and full of interest. Quite apart from the linguistic information it offers, it is fascinating to be able to see the actual Lindisfarne Gospels, an original manuscript of Beowulf, Caxton’s printing of the Canterbury Tales, a Shakespeare quarto, and — among more modern treasures — John Betjeman’s heavily revised manuscript of How to Get On in Society (‘Phone for the fish-knives, Norman’). David Crystal gives an introduction on a video loop. Over headphones I listened again to Kenneth Williams and Hugh Paddick’s polari spiel from Round the Horne.

Among the items of phonetic interest are a leaden Punch joke about h-dropping by the lower classes and an excerpt from the hard-to-believe 1929 recommendations of the BBC Advisory Committee on Spoken English.

The exhibition continues at the British Library until 3 April. Admission is free.
___

Happy New Year to all. This blog will be suspended during the whole of the month of January 2011. Next posting: 1 February.

Thursday 30 December 2010

ban legacy fonts!

Do you remember the bad old days before Unicode? The time when there was no standardized way of encoding phonetic symbols? when word processing was single-byte and fonts were 8-bit, so that any given font was limited to under two hundred characters? when the various phonetic fonts available all used different encodings, so that where one person had input ɥ another might see ɦ or ʰ or something else entirely arbitrary? when if you transferred a document to a different computer you would as likely as not get garbage for your phonetic symbols? when your PowerPoint presentation using the computer supplied by local organizers would probably fail to display your phonetic symbols properly?

Thank goodness those days are past. Nowadays we all use Unicode, the internationally agreed industry-wide font-encoding standard for all alphabets and scripts, covering all the languages of the world as well as all the phonetic (and other) symbols we might need. A single font can now contain thousands, indeed tens of thousands, of different characters. So we no longer have to keep switching fonts merely in order to include phonetic symbols. In this blog I can be confident that when I input a particular phonetic symbol you will see that same phonetic symbol on your screen, no matter where you are and no matter what platform you are using. (OK, there may be marginal cases where the font you are using falls down over one or two unusual symbols: but then you will probably see a blank square or something similar — you won’t see the wrong phonetic symbol or some ludicrous webding, as used to happen.)
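For anyone who wants to see the difference concretely, here is a minimal Python sketch. The first part uses the standard unicodedata module to show that each phonetic symbol has its own fixed code point and name. The two "legacy font" tables are invented purely for illustration (they do not reproduce the encoding of IPA-SAM or of any real font); they merely show how one and the same byte value could come out as different symbols under different 8-bit fonts.

```python
import unicodedata

def describe(symbol):
    """Return the Unicode code point and official character name of one symbol."""
    return f"U+{ord(symbol):04X}", unicodedata.name(symbol)

# Under Unicode every symbol has one agreed identity, whatever the font.
for symbol in "ɥɦʰ":
    print(symbol, *describe(symbol))

# Hypothetical legacy 8-bit fonts (byte values invented for illustration):
# the document stores only the byte, so the glyph depends on the font in use.
LEGACY_FONT_A = {0xD5: "ɥ"}
LEGACY_FONT_B = {0xD5: "ɦ"}
stored_byte = 0xD5
print(LEGACY_FONT_A[stored_byte], "versus", LEGACY_FONT_B[stored_byte])
```

The point of the contrast is that in the Unicode case the identity of the character travels with the text, whereas in the legacy case it depends on which font happens to be installed at the other end.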

I celebrated this progress and documented the details in the poster paper I gave at the 2007 International Congress of Phonetic Sciences in Saarbrücken. (If you’re interested, here’s the printed version.)

But phoneticians haven’t all caught up.

The next ICPhS is due to be held in Hong Kong in a few months’ time. The deadline for paper submission is the beginning of March, so it’s time for everyone to get their thoughts in order and start writing. The Call for Papers page on the conference website gives the following instructions about phonetic symbols in submitted papers.

• One of the following IPA fonts is to be used for congress papers:
IPA-SAM phonetic fonts: http://www.phon.ucl.ac.uk/shop/fonts.php
SIL phonetic fonts: http://scripts.sil.org/encore-ipa-download

What are these fonts, so brusquely prescribed?

The IPA-SAM fonts are 8-bit fonts that I created around fifteen years ago. Building on SIL software, they enjoyed some considerable popularity because the encoding and therefore the keyboarding fitted in nicely with the way phoneticians actually use phonetic symbols. Nevertheless, once Unicode became available it rendered these and other specialist 8-bit fonts obsolete. For the last five years or more I have been actively discouraging people from using the fonts I created, because Unicode phonetic fonts are now widely available. Indeed, more and more of the ‘core’ fonts supplied with new computers include all the IPA symbols. So everyone should now use Unicode rather than ‘legacy’ fonts like the IPA-SAM fonts.
If you follow the ICPhS link to the SIL site, you will see this notice, prominently displayed.
Important
The SIL Encore IPA and SIL IPA93 fonts are obsolete, symbol-encoded fonts. Their use is discouraged. If you decide to download and use these fonts, please note there is no user support for these fonts.
If your university or organization requires the use of these fonts, please request they change their requirement to Doulos SIL, a Unicode-encoded font which contains the complete IPA repertoire.

Yes, their use is discouraged. Did you read that, conference organizers?

The Word template supplied by the organizers for ICPhS conference papers contains the following.

Phonetic fonts
You can use phonetic symbols and special characters in your paper. To make sure that readers of your article can see the phonetic symbols in the PDF document, all special symbols must be embedded in the PDF. Depending on the software you use to produce the PDF the details may vary. In our experience the fonts are usually embedded, but this can be checked e.g. by inspecting the "Document Properties -- Fonts" in Acrobat Reader.
It is recommended to use one of the following fonts to show phonetic symbols (links for free download can also be found at the Congress website):
• IPA-SAM phonetic fonts [3]
• SIL phonetic fonts [4] (Unicode is accepted)

“Unicode is accepted.” As an afterthought. Big deal.

Where have the congress organizers been for the last ten years? Unicode should be required. And legacy fonts firmly deprecated.

Wednesday 29 December 2010

1949 revisited

The December 2010 issue of JIPA (Journal of the International Phonetic Association) celebrates its own forty years of publication and the 125th anniversary of the publishing of a journal by the IPA. Prior to 1970, the journal was known successively as Dhi Fonètik Tîtcer, dhə fənetik tîtcər, ðə fonetik tîtcər, lə mɛːtr fɔnetik and, for the seventy-five years from 1895, lə mɛːtrə fɔnetik.

To celebrate this anniversary, the current issue includes a complete scanned reproduction, with original cover pages and pagination, of the 1949 booklet Principles of the IPA. This booklet comprised (i) a theoretical introduction explaining the association’s alphabet and the principles for its use, and (ii) exemplification by phonetically transcribed texts in some fifty different languages. It is now accompanied by a short introduction written by Mike MacMahon, the IPA’s historian and archivist.

MacMahon mentions two misprints in the 1949 text, commenting that their appearance is not surprising “given the complexity of setting phonetic texts in a pre-computer age”. One is a missing diacritic. The other is the ‘problematic’ placing of a raising diacritic next to [u] (thus u˔) in the Afrikaans specimen. (Since cardinal u is by definition as close/high as is possible without crossing the vowel limit line into friction, it can hardly be raised further.)

What makes the latter more mysterious is that the same problematic combination is to be found in the specimens of Tswana and of Scottish English, which MacMahon does not mention. Yet we know that Daniel Jones, the editor of the 1949 booklet, was careful to the point of obsessiveness about the exact typographical form of the phonetic symbols he used.

There are other misprints. In the specimen of Finnish we find riːsiu for riːsui, in that of “Roumanian” dz for dʒ, in the Welsh emlaen for əmlaen. There are doubtless others. Among factual deficits, the Japanese specimen lacks all mention of pitch accent.

Although it is not a misprint, it is shocking to find that as recently as half a century ago the name of the language Xhosa is supplemented by the now grossly offensive gloss “(Kaffir)”.

The English (“one variety of Southern British”) text of The north wind and the sun is transcribed, for illustrative purposes, in three different forms, one “broad”, one “slightly narrowed” and the third “still narrower”. This third transcription, reproduced below, contains two striking inconsistencies. One of the narrowings involves the explicit symbolization of schwa as opener (ɐ) in final position than elsewhere (ə). But if stronger is ˈstrɒŋɡɐ in the first line, why is it ˈstrɒŋɡə in the fourth? If that is a subtle observation of the effect of a close-knit following than, why, given ˈstrɒŋɡɐ and ˈtrævlɐ, is other in final position not ˈʌðɐ (line 4)? And why is the MOUTH vowel written aɷ in əˈraɷnd (line 6) but au in ˈaut (line 7)? Like Homer, DJ evidently sometimes nodded.

Tuesday 28 December 2010

merry Mary and hairy Harry

An American academic, not a phonetician but working in a related field, sufficiently eminent that I have heard his name and even read one of his books, wrote to me asking for help in puzzling out the sets of words in which he, like many other Americans, makes no distinction.

I pronounce “merry”, “Mary”, and “marry” as homophones, as many Americans do, but a good many other Americans (and, I believe, a higher proportion of Brits) pronounce all three differently.

I referred him to vol. 1 of Accents of English, which he seems delighted with.

It’s not just that “a higher proportion” of Brits distinguish the three sets. As far as I know, all do. To the best of my knowledge no native speakers of English outside north America lack the three-way distinction merry — Mary — marry (RP ˈmeri, ˈmeəri, ˈmæri). We do not rhyme sharing with herring. We do not rhyme clarity with prosperity.

Just as this fact may come as a surprise to Americans, and seem problematic and mysterious, so it can be a surprise for non-Americans to find that some Americans make no distinction. And Americans can therefore get confused over spelling in cases where we never would.

Words like merry belong under DRESS, those like marry under TRAP, and those like Mary under SQUARE, which historically is derived from FACE. So you can often tell which word belongs where by reference to the spelling. Before double rr, the spelling e indicates DRESS and a indicates TRAP. With a before single r the vowel may be SQUARE, as in vary, parent, aware, compare, garish, Carey; but because our spelling system does not consistently distinguish short and long vowels before a single consonant letter it may also indicate TRAP, as in arid, apparent, comparison, circularity, Gareth, Gary. The suffix -ary is a special case.
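The spelling rules of thumb just given can be put into a rough Python sketch. The function name and the regular expressions are my own simplification, not a complete account of English spelling: it covers only e and a before rr and a before single r plus a vowel letter, it makes no attempt to handle the special case of the suffix -ary, and it returns both candidates where the spelling is genuinely ambiguous.

```python
import re

def lexical_sets(word):
    """Guess the possible lexical set(s) of the stressed vowel from the spelling."""
    w = word.lower()
    # Before double rr: e indicates DRESS, a indicates TRAP.
    match = re.search(r"([ae])rr", w)
    if match:
        return {"DRESS"} if match.group(1) == "e" else {"TRAP"}
    # a before single r plus a vowel letter: SQUARE or TRAP, spelling cannot decide.
    if re.search(r"ar(?!r)[aeiouy]", w):
        return {"SQUARE", "TRAP"}
    # Anything else is outside the rules sketched here.
    return set()

for word in ["merry", "marry", "Mary", "arid", "parent"]:
    print(word, sorted(lexical_sets(word)))
```

For Mary, arid and parent the sketch can only report that either SQUARE or TRAP is possible, which is exactly the difficulty for those who have no such contrast in their own speech.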

The DRESS-TRAP distinction, as we know, can be a trap in EFL. A Danish friend of mine, now dead, was telling me about an acquaintance whose name I heard as Berry. An unusual name, I thought, but not impossible at least as a surname. It was only years later that I discovered he was really Barry.

The day before yesterday the Sunday Times travel section had an airline cabin attendant recounting how

a very anxious-looking couple boarded a Chicago–Heathrow flight. … They were studying the route maps intensely and staring wildly around the cabin. I asked if everything was all right, and the gentleman said, in a broad Midwestern American accent, ‘Are we all going to perish?’ I thought ‘Oh dear, here we go’, and assured him that we were not, but he just became even more upset, pointing at himself and his wife, then saying ‘We’re going to perish.’ I put my hand on his shoulder and told him in my most soothing voice that there was no way that they or anyone else on the plane was going to perish, but this had the reverse effect. It was only when my supervisor came over that we realised that they were going to Paris, and hadn’t realised they had to change planes at Heathrow.

Monday 27 December 2010

countless thousands

As soon as I watched the brief preview of the Queen’s Christmas speech on Sky News I noticed her pronunciation of the phrase countless thousands (blog, 8 December).

I was not the only one. Edward Aveyard writes

I noticed your recent post on whether the Queen really uses /ai/ for MOUTH or not. If you listen from 3:40 to the Christmas message, she says “countless thousands”. I hear aʊ in countless but aɨ in thousands. … It's almost as if she had been reading your blog and wanted to give you something to analyse.

On Christmas Day I tried to download the video of the speech from the BBC website, but without success: although you could watch it you couldn’t save it. I looked on YouTube, but it wasn’t there. Now, though, TheRoyalChannel has uploaded it to YouTube: thanks, Edward, for the link.

Listen here, at 03:45, for the phrase in question.

I agree with Edward’s judgment. Watch HM’s lips in each of the two MOUTH tokens.

Another interesting pronunciation is powerful, here at 03:17.

It appears to be fully smoothed and compressed, ˈpaːfl̩. This is how I often pronounce that word myself, though some people seem disinclined to believe me when I assert that this reduction is widespread in RP. In my analysis, the “smoothing” process removes the second element of a diphthong, in this case MOUTH, when before another vowel (aʊ ə → a ə — or, of course, it could equally well have been aɨ ə → a ə). Then the “compression” process reduces two syllables to one (a.ə → aə). Finally, the monophthongization process suppresses the second element of the resulting diphthong, with compensatory lengthening of the first element (aə → aː). Thus a possible ˈpaʊ əf l̩ is reduced to ˈpaːf l̩. All three processes are variable (optional) and rule-governed (systematic).
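The three processes can be sketched as an ordered series of string rewrites in Python. The function names are mine, and marking syllable boundaries with a dot is an assumption made for the sake of the sketch; the rules are written only for the vowel sequence at issue in powerful, not as general rules of English.

```python
import re

SYLLABIC_L = "l\u0329"  # l plus the IPA syllabicity mark written beneath it

def smoothing(form):
    """Remove the second element of the diphthong (aʊ or aɨ) before another vowel."""
    return re.sub(r"a[ʊɨ](?=\.ə)", "a", form)

def compression(form):
    """Reduce the two syllables a.ə to a single syllable aə."""
    return form.replace("a.ə", "aə")

def monophthongization(form):
    """Suppress the second element of aə, lengthening the first in compensation."""
    return form.replace("aə", "aː")

def derive(form):
    """Apply the three optional processes in order, recording every stage."""
    stages = [form]
    for rule in (smoothing, compression, monophthongization):
        form = rule(form)
        stages.append(form)
    return stages

for stage in derive("ˈpaʊ.ə.f" + SYLLABIC_L):
    print(stage)
```

Because each process is optional, a speaker may stop at any stage of the derivation; the sketch simply shows what happens when all three apply.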

There was a Two Ronnies sketch about misunderstandings arising from PRICE-MOUTH confusion (ground misheard as grind, etc). Can anyone locate it on YouTube or elsewhere?

Thursday 23 December 2010

Denisovans

Today’s newspapers carry news, based on a report in the science journal Nature, of DNA findings relating to an archaic group of humans, some of whose fossilized remains have been found in the Altai mountains of southern Siberia. (Here’s the Guardian’s version. There’s also an informative article on the “Denisova hominin” in Wikipedia.)

The new human ancestors were named Denisovans after the Denisova cave in the Altai Mountains where their remains were found.

The matter was duly reported on BBC R4 in this morning’s Today programme.

But how do we pronounce Denisova and its derivative Denisovan? In particular, where does the stress go? The BBC reporter stressed the second syllable, -ˈnɪs-.

The name of the cave is of course Russian, and is written in Cyrillic as Денисова. It is the feminine of Денисов Denisov, from the name Денис Denis. But the stressing of Russian patronymics ending in -ов (-ov) is notoriously unpredictable.

None of the pronunciation dictionaries I have to hand record the name. But the online resource Forvo does!

(Forvo is a website with sound files demonstrating the pronunciation of a claimed 800-thousand-odd words in 267 languages. Anyone can upload a sound file showing how they pronounce a given name or word.)

A speaker described only as “Female from Russia” pronounces Денисов clearly as dʲɪˈnʲisəf. Isn’t the internet wonderful?

Assuming that this is the regular Russian pronunciation, it follows that Денисова Denisova is dʲɪˈnʲisəvə and that we should anglicize it as dəˈnɪsəvə (or perhaps with dɪ- or de-, or indeed with -ˈniːs-). The hominins, then, are dəˈnɪsəvənz.

This was indeed the pronunciation used by the BBC presenter. Well done the BBC Pronunciation Unit.
_ _ _

Happy Christmas to everyone. Next posting: 27 December.

Wednesday 22 December 2010

a credulous scientific report

The December issue of Scientific American carries an article entitled ‘A Click of the Tongue: Ultrasound Translates Dying Languages’. (Thanks to the Clinical Linguistics blog for this link, supplied by Madalena Cruz-Ferreira.)

The article is about the use of ultrasound imaging to study articulation.

This portable technology, which became affordable to linguists around 2000, allows researchers to see the tongue as it moves in real time. It is one of the only medical scanning devices that can keep up with speech; MRIs, for example, are too slow.

Thanks to this emerging technology, [researchers] have documented some of the fastest sounds in human speech: the click consonants present in many rare African languages. Because linguists did not know exactly how the clicks were produced, the sound was placed in a “mixed-bag” category of the International Phonetic Alphabet.

Up to a point, Lord Copper. (OK, if you don't get the reference, go here.)

I wonder what evidence there is that clicks are “some of the fastest sounds in human speech”. Impressionistically, I’d have said that clicks (= sounds produced on a velaric ingressive airstream) are no faster or slower than sounds produced with pulmonic or glottalic airstream mechanisms. I suppose the claim is trivially true, in that the postalveolar (retroflex) click [!], for example, is similar in production to the plosive [ʈ], except that it involves a different airstream mechanism. And plosives are fast(ish). The dental click [|], on the other hand, is typically somewhat affricated, which means it is not so ‘fast’ a ‘sound’. And presumably pulmonic-airstream taps and flaps are the fastest of all.

It is not true that previously linguists “did not know exactly how the clicks were produced”. We can quibble about what knowing something “exactly” might be, and who the unspecified ‘linguists’ are; but phoneticians have been familiar with clicks and their production at least since the 1930s. There is a clear account of click production in, for example, Westermann and Ward’s 1933 Practical Phonetics for Students of African Languages. Their schematized diagrams are pretty good, too.

It was Pike, in his Phonetics (1943), who systematized the classification of airstream mechanisms, describing that of clicks as “velaric ingressive” (or “oral ingressive”).

Nor is it true that clicks are ‘placed in a “mixed-bag” category’ on the IPA Chart. They are in a box clearly labelled Consonants (non-pulmonic), along with the implosives and ejectives that occupy the other columns in the box.

Miller’s research, published in 2009, may well have “organized the clicks based on attributes such as airstream (where the air comes from), place (where the mouth constricts) and manner of articulation”. But in doing this she was merely following a long and established tradition. It is fifty years since I was taught this way of classifying clicks (and all other speech sounds), and I passed it on to my own students throughout my teaching career. I assume all teachers of general phonetics do likewise.

And of course ultrasound imaging in no way enables us to “translate dying languages”, though it might aid us in describing them.

Tuesday 21 December 2010

seismic

The English spelling <ei> is particularly opaque.

Other examples with the pronunciation eɪ include beige, deign, feint, rein, surveillance, vein.
Others with iː include codeine, protein, seize, Keith, Leith, Neil(l), Reid.
Others with aɪ include eider, kaleidoscope, Eileen, Brunei and German-derived words or names such as zeitgeist, Einstein. There is also the Greek-derived seismic.

At least, that’s the way I assumed everyone pronounces seismic: ˈsaɪzmɪk. But the other day I heard the politician Chris Huhne hjuːn speak of ‘a ˈseɪzmɪk shift’ in something or other.

Ancient Greek σεισμός seismós ‘a shaking, shock’ had eː during the classical period, yielding i in Modern Greek. The regular development in post-GVS English is aɪ, which we also see in dinosaur ˈdaɪn-, based on δεινός deinós ‘terrible’ (though dinosaur is of course a relatively recent coinage). Within the same discipline of geology and paleontology we also have pleistocene ˈplaɪstə(ʊ)siːn (Greek πλεῖστος pleîstos ‘most’). There is also paradise ˈpærədaɪs, borrowed from Iranian via Greek παράδεισος parádeisos.

As can be seen from these examples, the English spelling is sometimes ei and sometimes simple i. At seismic the OED comments that ‘the normal form would be *sismic’.

English is unusual among languages in that there are a large number of words whose spelling is firmly fixed, but whose pronunciation is not.

Monday 20 December 2010

carol service

The LGMC had a gig (as I have learned to say) last night singing at the annual candlelight carol service at St Leonard’s, Shoreditch. This is one of the churches that feature in the nursery rhyme Oranges and Lemons.

“When will you pay me?”
say the bells of Old Bailey.
“When I grow rich”
say the bells of Shoreditch.

The carols were all very familiar ones (to me), carols that I have known and sung every Christmas since boyhood. I was struck, though, by how many of my fellow choristers did not really know them. We were given a service sheet with the words but no music. Although we were encouraged to sing the tenor or bass part if we knew it, only a very few of us knew the four-part harmonies by heart. It is easy to forget that much of what people of my age think of as our common musical/religious heritage is no longer shared by everyone.

We also did two special numbers of our own (in my day they would have been called “anthems”), which we had all learnt by heart and knew thoroughly.

The church was packed out — apparently the regular Sunday congregation numbers only forty or so, but for this carol service there must have been three or four hundred present. Let’s hope everyone contributed generously to the church’s work with the homeless and rough sleepers.

I say the words of the carols were familiar. Well yes, but as well as modernization of the language (you instead of thou etc) the CofE’s recent drive for inclusiveness and gender-neutrality has left some odd results. It’s fine to change Good Christian men rejoice to Good Christians, all rejoice; but what about this?

What can I give him, poor as I am?
If I were a shepherd, I would bring a lamb.
If I were a wise one, I would do my part,
Yet what I can, I give him: give my heart.

The older “if I were a wise man” alludes to the three wise men, the magi. “If I were a wise one” merely makes a weak line even weaker. And I don’t see why girls as well as boys shouldn’t be encouraged to imagine themselves as one of the three wise men.

As an English version of Puer nobis nascitur, in place of the usual Unto us is born a son we had a new text which I find is by Michael Perry (1942–1996), Jesus Christ the Lord is born. You’d think that someone so contemporary would have shied away from one awful eye-rhyme:

Soon shall come the wise men three,
rousing Herod’s anger;
mothers’ hearts shall broken be
and Mary’s son in danger,
and Mary’s son in danger.

Er… ˈdeɪndʒə really doesn’t rhyme with ˈæŋɡə, surely.

Friday 17 December 2010

Assange

Jürgen Trouvain emailed me:

In the last days and weeks I've stumbled over the pronunciation of the Wikileaks activist Julian Assange in German media. Here it is common to pronounce it "the French way" as [a'sãʃ] (with the final consonant devoiced as usual in German). I wonder whether Australians and other native speakers of English also use a nasal vowel in the second syllable or whether they use [ɑn] instead - as indicated on the Wikipedia page. Any hint?

In reply I told him that British newsreaders, too, mostly seem to attempt a French-style nasalized vowel. What I hear on the radio and TV news is generally aˈsɑ̃ːʒ or something similar.

As with other French names, though, we also see some degree of anglicization through

reduction of the first vowel to ə;
replacement of the nasalized vowel by a sequence of vowel plus n;
use of an affricate dʒ rather than the fricative ʒ; and
confusion about which of the French nasalized vowels is involved.

So we also get things like əˈsɑːndʒ, aˈsɒnʒ, æˈsɔ̃ːʒ, əˈsæ̃ʒ.

As we know, standard French has up to four distinct nasalized vowels, conventionally represented in IPA as ɛ̃ œ̃ ɑ̃ ɔ̃ but in practice pronounced more like æ̃ æ̃ ɒ̃ õ. In BrE-accented French they usually come out as nasalized versions of English æ ʌ ɒ ɒ respectively. That is, we tend not to distinguish cent-sang from son-sont, though unlike many French people we do distinguish brin from brun.

When we actually borrow French words and names into English, though, a further confusion seems very typically to happen. We confuse the front nasalized vowel, French ɛ̃, with the back one, French ɑ̃ ~ ɔ̃. Thus lingerie, French lɛ̃ʒʀi, is often pronounced in English as ˈlɒnʒəri rather than the more accurate ˈlænʒəri.

It is this confusion (or the same confusion in reverse) which has enabled Steve Bell to pun on Mr Assange’s name and the French word for a monkey, singe sɛ̃ʒ. He draws both him and his lawyer as simians.

Thursday 16 December 2010

the old ones are the best

Lecturers involved in training future teachers of EFL sometimes send me very sophisticated queries about English phonetics, anxious that what they tell their students should be exactly right. This is admirable, and I am all in favour of accuracy and avoiding sloppiness. But it is good to be reminded from time to time that there are some very basic issues that many ordinary users of English as a foreign, second, or international language fail to master, partly because teachers fail to teach them.

I found this handwritten notice, in English and Greek, on engrishfunny.failblog.org, a website mainly devoted — deplorably — to laughing at foreigners, but also acting as a useful reminder of failures on the part of English language teachers (and, of course, language learners).

We can ignore for the moment the misplaced here (never put an adverb between a verb and its direct object). Our interest is phonetics.

Writing live where leave was meant shows us that those who have no iː-ɪ contrast in their L1 readily confuse the two English vowels not only in speaking and hearing but also in reading and writing.

The pair leave liːv — live (v.) lɪv is particularly tricky, since both words are of high frequency, both being among the thousand words most frequently used in spoken and written English.

Given that they are so frequent, and that their meanings are so clearly different, you might expect them to be relatively well mastered — better so than pairs of rarer words such as keeper–kipper or peach–pitch.

But I can think of more than one highly educated fluent speaker of English from a Spanish-speaking country who, like the person who scrawled this notice, gets them wrong not only in speech but also in writing.

People don’t learn the iː-ɪ distinction merely by being exposed to it. They have to be taught it explicitly. Teachers must bite the bullet and do ear training: drill the learner not just in producing the contrast, but more importantly in perceiving it. In Greece, Italy, Spain there are a few enlightened teachers of English who do that, but I am pretty confident that most don’t. Let’s encourage more of them to do so. It’s the only way.

Παρακαλώ, per favore, ¡por favor!

Wednesday 15 December 2010

Megan

The female name Megan was originally Welsh, though based on Meg, the English and Scottish short form of Margaret and a doublet of Mag, as in Maggie.

In Britain it is almost always pronounced ˈmeɡən, with the DRESS vowel.

Why, then, in AmE does it tend to be pronounced ˈmeɪɡən, with the FACE vowel?

I suppose the answer is that it is vaguely perceived as Spanish, or at least foreign, and that AmE tends to map Spanish e onto FACE. Compare Pedro, Spanish ˈpeðɾo, who becomes ˈpedrəʊ in BrE but ˈpeɪdroʊ in AmE. Alternatively, if thoroughly anglicized, Spanish e can be mapped onto iː (FLEECE), as in Toledo, OH and indeed San Pedro, CA, pronounced even by its Hispanic residents as sænˈpiːdroʊ.

Does anyone pronounce Megan as ˈmiːɡən, which I give as a third possibility in LPD?

The Renault Mégane car is usually pronounced məˈɡæn or meˈɡæn in BrE. I doubt whether it is widely known in north America.

Tuesday 14 December 2010

bimoraic vowels

Last night’s antihypnagogic moment again involved BBC R4’s Book of the Week, currently Chasing the Sun by Richard Cohen, read by Allan Corduner. It involved an account of climbing Mount Fuji to see the dawn from the summit on the longest day of the year.

At one point the narrator passed through what he called a ˈtɔːriaɪ.
The intended word was undoubtedly torii, meaning the gateway at the entrance to a Shinto shrine. But why would the narrator think that this Japanese word ought to be pronounced as if it were a Latin second-declension nominative plural, like radii?

In Japanese torii is written 鳥居 (or in hiragana とりい) and pronounced toˌɾii. (The secondary stress mark shows that this is an ‘accentless’ word, characterized when said in isolation therefore by a non-distinctive step-up of pitch on the ɾi.) In LPD I give the usual anglicization, which is ˈtɔːriiː, though arguably ˈtɒriː would be a more consistent reflection of the Japanese pronunciation.

Mind you, if England had been in regular contact with Japan since the middle ages, and if English had borrowed the word torii seven or eight hundred years ago, I can see that we might well now pronounce it ˈtɔːriaɪ. But that’s not what happened.

There is a phonemic length distinction in Japanese vowels, short vs. long. Most are short, but the second vowel in torii is long, as shown by the romanization with ii.

In the usual Hepburn romanization of Japanese, while the long versions of a e o u are written ā ē ō ū, the long i is written ii rather than the logical ī. The reason for this is not clear to me. (In practice, the macron is often omitted anyhow, so that Tōkyō comes out as Tokyo, etc. Also, some people use oh for ō when romanizing their names.)

The Japanese pitch accent is something that characterizes the mora rather than the syllable. Arguably, therefore, the ‘long’ vowels are better analysed as bimoraic rather than as long, because there might be an accent on the first part, or on the second part, or on neither. This is reflected in the kana syllabaries, where they are written as a sequence of two characters. So there are three characters in とりい to-ri-i. In IPA we write ii ee aa oo uu.

A nice minimal pair for monomoraic i vs. bimoraic ii is oˌdʑisaɴ ojisan ‘uncle’ vs. oˈdʑiisaɴ ojiisan ‘grandfather’. (Because of the accent difference, it’s actually only a near-minimal pair, in Tokyo pronunciation at least.)
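Since the kana spelling reflects the moraic structure, counting morae is a matter of counting kana. Here is a minimal Python sketch, assuming hiragana input only; the set of small kana that combine with the preceding character rather than counting as morae of their own is my own shortlist, and the hiragana spellings of ojisan and ojiisan are supplied by me.

```python
# Small kana that attach to the preceding character (as in きゃ kya)
# and so do not count as a mora on their own. Assumed shortlist.
SMALL_KANA = set("ゃゅょぁぃぅぇぉ")

def count_morae(hiragana):
    """Count morae in a hiragana string: one per kana, ignoring small combining kana."""
    return sum(1 for kana in hiragana if kana not in SMALL_KANA)

for word in ["とりい", "おじさん", "おじいさん"]:
    print(word, count_morae(word))
```

On this count torii has three morae, ojisan four and ojiisan five, the extra mora being the second part of the bimoraic ii.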

Monday 13 December 2010

Laocoön

I hope it’s not going to be too boring if I again mention people’s increasing ignorance of how to pronounce classical names. Last week’s edition of the BBC television programme Have I Got News for You — which targets an intellectual rather than a popular audience — involved an odd-man-out question in which one of the candidates was Laocoön. Classicists and, I imagine, art historians know that this name is traditionally pronounced in English as leɪˈɒkəʊɒn. Everyone on HIGNFY called it læˈkəʊən. This does not even correspond to the spelling, since it ignores the o of Lao-. But they all said it, so I suppose the producer must have told them to.

The reason that the cognoscenti say leɪˈɒkəʊɒn is the usual boring one involving historical developments to do with Greek vowel length and the Latin stress rule, feeding into the English Great Vowel Shift. The Greek Λαοκόων laokóōn has a short penultimate vowel, making a light penultimate syllable, causing the Latin stress to fall on the antepenultimate and the English stress likewise. As in chaos, Greek χάος kháos, English lengthens the prevocalic short a, which duly emerges from the GVS as eɪ. Same with the penultimate short o, which turns into əʊ. As we say nowadays, simples.
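The Latin stress rule at the heart of this can be stated in a few lines of Python. This is a sketch of the rule as usually given in the textbooks: the function name is mine, and the syllable weights are supplied by hand rather than computed. Only the weight of the penultimate syllable actually matters; the other weights for Laocoön, and the Augustus example, are my own additions.

```python
def latin_stress_index(heavy):
    """Return the index of the stressed syllable, given one weight per syllable.

    heavy is a list of booleans, True for a heavy syllable and False for a light one.
    """
    count = len(heavy)
    if count == 0:
        raise ValueError("a word needs at least one syllable")
    if count <= 2:
        return 0              # words of one or two syllables: stress the first
    if heavy[-2]:
        return count - 2      # heavy penultimate: stress it
    return count - 3          # light penultimate: stress the antepenultimate

# la-o-ko-ōn: the penultimate syllable ko is light.
laocoon = ["la", "o", "ko", "ōn"]
print(laocoon[latin_stress_index([False, False, False, True])])

# au-gus-tus: the penultimate syllable gus is heavy.
augustus = ["au", "gus", "tus"]
print(augustus[latin_stress_index([True, True, False])])
```

For Laocoön the rule picks out the second syllable, o, which is where the traditional English pronunciation places the stress.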

If you know the ‘rules’, i.e. the historical principles underlying the traditional English pronunciation of classical names, leɪˈɒkəʊɒn is absolutely predictable and unremarkable. However, while knowledge of these rules is certainly useful for the author of a pronunciation dictionary, I can see the force of the argument that not everyone needs to know them.

Knowledge is a tricky thing. I spent my working life in a university environment where people generally felt an obligation to know the correct pronunciation not only of classical names but also of anything from French, German, Italian or Spanish. This doesn’t necessarily apply elsewhere.

On the tube I overheard a couple discussing buying a new washing machine. One of the brands they were considering was Miele. They called it miːɫ.

But I can’t escape from the fact that I know German and therefore think of this brand as ˈmiːlə. I know that vice versa has four syllables, not three. I know that Giotto has two syllables, not three. And I know that the letter z in the Brothers Karamazov is not like the z in Mozart. Millions don’t. Rant over. Perhaps I ought to get out more.

Friday 10 December 2010

wintry

The second sleep-inhibiting pronunciation (yesterday’s blog) turned out not to be as odd as I at first assumed.

The shipping forecast, with its hypnotic litany warning of gales in Viking, North Utsire, South Utsire, Forties, Cromarty, Dogger, Fisher, German Bight… — the Norwegians spell the name of their island Utsira, which is also what you will find in LPD, rather than Utsire — was well under way when the announcer started forecasting ˈwɪntəri showers. Hold on a minute, ˈwɪntəri? Surely this word is spelt wintry, so you would no more pronounce it ˈwɪntəri than you would pronounce angry as ˈæŋɡəri? We say ˈwɪntri, don’t we?

The point at issue is the treatment of fossilized, lexicalized cases of compression. Recall that among the candidates for compression (= loss of a syllable) are those sequences where ə is followed by r or l plus a weak vowel. The schwa can be lost (arguably via an intermediate stage involving a syllabic consonant, but we can ignore that here), reducing by one the number of syllables in the word.

So, for example, we have the option of saying ˈdʒenrəl rather than ˈdʒenərəl general, or ˈpræktɪkli rather than ˈpræktɪkəli practically.
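
The structural description of compression given above (ə, then r or l, then a weak vowel) can be written as a simple pattern over transcriptions. The following Python sketch is an illustration only: the function name `compress` is hypothetical, and treating just ə and i as the weak vowels is a simplifying assumption. It returns the compressed candidate and says nothing about whether a given word actually undergoes the rule.

```python
import re

# ə followed by r or l plus a weak vowel (assumed here to be ə or i).
COMPRESSIBLE = re.compile(r"ə(?=[rl][əi])")

def compress(ipa: str) -> str:
    """Delete the first schwa that meets the structural description, if any."""
    return COMPRESSIBLE.sub("", ipa, count=1)

print(compress("ˈdʒenərəl"))    # ˈdʒenrəl
print(compress("ˈpræktɪkəli"))  # ˈpræktɪkli
print(compress("ˈhɪstəri"))     # ˈhɪstri
```

A word with no qualifying schwa, such as ˈæŋɡri, comes back unchanged.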

The compression rule is, however, very variable in its application.

Americans don’t apply it as much as Brits do: so, for example, federal seems usually to be ˈfedərəl in AmE but ˈfedrəl in BrE.
In many words it is variable. You can say history as ˈhɪstəri or as ˈhɪstri, or at least I can.
Some words that meet its structural description nevertheless resist it. So cookery remains ˈkʊkəri in isolation (though you might get ˈkʊkri in BrE cookery book).
Some words undergo compression always, or nearly always. We say ˈevri every not ?*ˈevəri. Do you ever pronounce separately as four syllables? I don’t think I do. What about basically?
There are some words in which compression is so well established that it is shown in spelling. In pronunciation it is obligatory. A case in point is angry, mentioned above, which morphologically and historically is obviously anger plus -y. Another is remembrance, clearly remember plus -ance, but pronounced as three syllables not four. Yet another is simply, the disyllabic output of disyllabic simple plus -ly.

My half-asleep assumption was that wintry is like angry and simply, i.e. involving a lexicalized, and therefore categorical, compression. If so, pronouncing it as three syllables would be as inappropriate as saying -ˈmembərəns instead of -ˈmembrəns in remembrance. You might hear the uncompressed version from the uneducated, but surely not from a radio announcer. The principle involved would seem to be that if a compression is shown in the spelling it is obligatory; if not, not.

It turns out, though, that dictionaries reckon I’m wrong. The Concise Oxford and LDOCE both show wintry with the alternative spelling wintery, with the possibility of a trisyllabic pronunciation. So that’s alright (or all right), then.

For what it’s worth, the OED has the spelling wintry from Spenser (1590) onwards, but wintery only from the nineteenth century.

Thursday 9 December 2010

Potidaea

As I prepare to drift off to sleep at night my bedside radio is tuned to BBC R4. After the midnight news comes the regular soothing sequence of Book of the Week, then Sailing By, then the shipping forecast… I’m usually deep in slumber by now.

Except when I am jerked awake by an interesting or unusual pronunciation. It happened twice last night.

The current Book of the Week is The Hemlock Cup, written and read by Bettany Hughes. It is a historical account of Socrates’s life in ancient Athens. Last night we heard how the philosopher, carrying out his citizen’s duty as a hoplite in the Athenian army, fought at the battle of Potidaea (Ancient Greek Ποτίδαια). Ms Hughes pronounced this name, several times, as ˌpɒtɪˈdeɪə.

But when I was at school it was ˌpɒtɪˈdiːə. Agh! the creeping non-classical-Greek, non-classical-Latin, non-English rendering of Greek αι, Latin ae as eɪ instead of classical aɪ or English post-Great-Vowel-Shift iː again! We’re used to this in vertebrae by now, but how far is it going to go? Are we going to start calling Caesar ˈseɪzə? Bettany Hughes is a Visiting Research Fellow at King’s College London, part of the University of London, and well known as a historian, author and broadcaster. How does she pronounce Aegean, Aesop, Mycenae, Thermopylae? I only ask.

OK, non-classical words are different: Disraeli, Gaelic, maelstrom have eɪ.

I’ll keep the second sleep-inhibiting shock for tomorrow.

Wednesday 8 December 2010

the myth of maɨθ

Commenting on yesterday’s blog, the prolific Anonymous complained

You know, if you listen to the Queen (plenty of clips on YouTube), she does not pronounce "about" as abite. Where did this idea ever come from? I've even watched the old videos and she didn't do it then either.

Strange, isn’t it? Yet an awful lot of people seem convinced that both she and her son Charles do just that. Here’s Steve Bell again, in today’s Guardian.

A kynecil hice in the grinds (a council house in the grounds)? Really?

YouTube can be our witness that Charles’s MOUTH vowel is usually pretty unremarkable. Here he is

when young, in 1969. Notice how he says thousand at 1:24 and down, round, out at 1:32-1:39; and
just a few years ago, in 2006. Listen for out at 0:43 and 0:50, then a thousand pounds at 1:17.

In Accents of English (p. 292) I said

In a quick search I haven’t been able to come up with a clip in which the Queen or Charles utter a genuine aɨ.

Perhaps they have been laughed out of it. A more likely scenario might be that they tend to pronounce aɨ just very occasionally, say one time in thirty or forty — and the popular stereotype has seized on this rare variant as being typical and pervasive.

Tuesday 7 December 2010

t-to-r

Our observations about the Guardian cartoonist Steve Bell’s confusion of Teesside and Tyneside (blog, 2 Dec.) have not gone unnoticed.


“Knoowas” for ‘knows’ represents the northern east-coast opening diphthong for GOAT, ʊɔ.

“Shooroop” for ‘shut up’ represents ʃʊrʊp. This has not only the familiar northern use of ʊ in STRUT words, but also the outcome of what I call the “t-to-r rule” (AofE p. 370).

I don’t think there has been much discussion of this process in the literature. My impression is that it extends from somewhere in the English midlands (Coventry or thereabouts) up to the Scottish border and that it is always stigmatized. Unlike American t-voicing (tapping, ‘flapping’), it operates only after short vowels.

It features in Cilla Black’s catchphrase a lorra lorra laffs ‘a lot of laughs’.
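
For readers who like to see a rule spelt out, here is a hedged Python sketch of t-to-r. The restriction to short vowels comes from the description above; the further assumptions that the t must be word-final and the next word vowel-initial, the vowel inventories, and the function name `t_to_r` are mine, made for the sake of the example.

```python
# Short vowels of a northern accent (no ʌ); an assumption for this sketch.
SHORT_VOWELS = set("ɪeæɒʊə")
# Any symbol that can form part of a vowel, used to spot diphthongs
# and vowel-initial words.
VOWEL_SYMBOLS = set("ɪeæɒʊəaɛɔiuɑɜ")

def t_to_r(words):
    """Replace word-final t with r after a short vowel before a vowel-initial word."""
    out = []
    for i, w in enumerate(words):
        nxt = words[i + 1] if i + 1 < len(words) else ""
        after_short_vowel = (
            len(w) >= 2
            and w[-1] == "t"
            and w[-2] in SHORT_VOWELS
            # Exclude diphthongs such as aʊ, where ʊ is only the second element.
            and (len(w) < 3 or w[-3] not in VOWEL_SYMBOLS)
        )
        if after_short_vowel and nxt and nxt[0] in VOWEL_SYMBOLS:
            w = w[:-1] + "r"
        out.append(w)
    return out

print("".join(t_to_r(["ʃʊt", "ʊp"])))   # ʃʊrʊp
print("".join(t_to_r(["lɒt", "ə"])))    # lɒrə
print("".join(t_to_r(["ʃaʊt", "ʊp"])))  # ʃaʊtʊp (diphthong, so no change)
```

Long vowels are excluded automatically, because the length mark ː stands between the vowel and the t.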

Monday 6 December 2010

Indian English

When my Accents of English came out nearly thirty years ago, more than one person told me they thought that the last chapter (ch. 9, ‘The Imperial Legacy’) was the weakest. So I was interested to see the publication of a new book in the EUP Dialects of English series, entitled Indian English. It is by Pingali Sailaja of the University of Hyderabad, and contains a substantial chapter on pronunciation.

Sailaja makes a useful distinction between Standard Indian English Pronunciation (SIEP) and various kinds of non-standard pronunciation strongly influenced by the phonetics of the speaker’s L1. On the question of rhoticity, for example, where I wrongly said that “most speakers have a more or less fully rhotic pronunciation” (p. 629), this author asserts that SIEP is non-rhotic, but “most non-standard varieties of IE are rhotic”, although “there are those whose speech would be somewhere in the middle of the cline but they may still have non-rhotic speech” (p. 20). So she sees nonrhoticity as standard, but rhoticity as a strong marker of non-standard speech.

In Indian English p t k are well known to be unaspirated. With the exception of Tamil, Indian languages have a phonemic opposition between aspirated and unaspirated plosives. This is exploited for the voiceless th sound: we regularly get aspirated t̪ʰ where other varieties of English have θ. (Indian languages have no dental fricatives, nor usually does SIEP: “the sound /θ/ is sometimes articulated in SIEP but /ð/ is almost completely missing”.) If you hear a speaker of SIEP pronounce thing as t̪ɪŋ rather than as the usual Indian t̪ʰɪŋ, that probably means he or she is a speaker of Tamil.

Sailaja points out that if the aspiration of voiceless th were due purely to the influence of spelling we might expect d̪ɦ as the counterpart of ð: but in words such as this, mother, bathe we in fact get plain d̪. Then again, though, we do get ɦ in ghastly, ghost and also in John and sometimes in why wɦaɪ, which must certainly be due to the spelling.

Our stereotype of Indian English (or my stereotype, at least) has v and w merged as ʋ, with no distinction between vet and wet. Sailaja asserts, however, that “the distinction is maintained in SIEP”. She thinks of this as a spelling-based distinction, which is reasonable — though it’s not how we core English speakers think of it, and not how it arose historically. Non-standard varieties of IE do not maintain the distinction.

The advertisement for a recent Hindi film that says ‘villager, visionary, winner’ is obviously meant to be alliterative.

And so finally to linking and intrusive r. The Indians are different from the English.

Friday 3 December 2010

comment is free

Today’s Guardian newspaper carries a feature headed “After the disclosure that Tories were worried by George Osborne’s voice, readers discuss how their own accents were shaped”.

Although we are told “To see these articles in full and to join the debate go to guardian.co.uk/commentisfree” I have not yet been able to find them on the website, which does nevertheless contain hundreds of readers’ gripes about this or that pronunciation (or in some cases, this or that vocabulary item or choice of words, which people seem to find difficult to distinguish from pronunciation).

With no access to phonetic symbols and probably no knowledge of phonetic transcription, contributors have some difficulty in explaining exactly what accent features they are referring to. They also fail to recognize that every accent contains within itself stylistic variants appropriate for different circumstances. (You can speak formal RP, or colloquial RP, or something between the two. Same with Geordie, Scouse, Mancunian or Brummie.)

So Simon Gilman says

In the 1960s, there were two accents common to our family. One we used among one another [sic], and the other we used with friends, colleagues, on the telephone and even to some relatives. These were two versions of the very same accent: received pronunciation, RP, once known as “BBC English”. The public version was akin to the accent you hear in British films of the 1940s, or as spoken by the royal family — my sisters, my brother and I would hoot when our mother answered the phone with “Air, hair lair” (“oh – hello”). … My own London living has changed my RP to something even less extreme, following the “Estuarine” version exemplified by Tony Blair’s “the peopoo’s princess”.

By “the royal family” he must mean the Queen. Comparing her speech successively with that of Prince Charles, Princess Di, and Prince Harry, you see what a long way the royal family’s pronunciation has moved over the years.

Sean David Usher wrote

My accent is a Sunderland one, often referred to as mackem: to most people in the south it sounds like geordie (the Newcastle accent) but it is different. Some of the noticeable differences in pronunciation can be heard in these words: film pronounced “fillem”, school pronounced “schoo-el” and town with the emphasis on the “ow”. You know is “ya nar”, and a common phrase is “canny for a lad” — which is my dad’s answer to any request about his health and wellbeing. …

As for what he means by

town with the emphasis on the “ow”

your guess is as good as mine.

Andrew Dunn says

Originally from Glossop, I inherited my accent from family, friends and the wider area. It’s not quite Mancunian, yet not Yorkshire either… At university in Edinburgh I was pigeonholed as one of the “acceptable” English — it was Thatcher and the long-vowelled people from the south who were the enemy. … My accent became increasingly “Scotticised”, as a result of most of my friends being Scots. I found myself using lowlands vernacular more and more often, ken? …

Thursday 2 December 2010

Geordie royalty

The Guardian cartoonist Steve Bell is currently exploring the pretty conceit that our royal family have a secret double life as a stereotypically lower-class couple. Here they are as Cockneys, discussing whether to change the name of the House of Windsor.

Showing pronunciation through ad hoc misspelling is always a bit hit-or-miss. The Cockney MOUTH vowel may be a monophthongal [aː], but it is never [ɑː], so that (as far as I know) ’ouse is not a genuine homophone of arse.

Yesterday they decided to move to the northeast of England and become the “Ahse of Teesside” — though I must say this looks more like Tyneside (Newcastle) than Teesside (Middlesbrough).

The sporadic pronunciation of the GOAT vowel as [ɵː], which sounds passably like RP NURSE, is a striking characteristic of a Northumbrian accent. (Only a few days ago a phonetic friend of mine who lives in Morpeth was joking about the heavy “snur” that has fallen.)

The Tyneside accent, aka Geordie, has been in the news recently because of the singer and television personality Cheryl Cole. She is one of the judges on the wildly popular talent show The X Factor, and there is talk now of an American version of the show. But will Americans be able to cope with her pronunciation of English?

From her bouncy, shining mane to the over-sincere pep talks she doles out to her “girls”, Cheryl Cole seems a perfect fit for US television. She has glamour, style and empathy – all the qualities an American audience can understand.
Until, perhaps, she opens her mouth. Cheryl's Geordie accent may be celebrated (in a way) in the UK – this Christmas brings the book Woath It? Coase Ah Am, Pet by Twitter's @CherylKerl – but there are worries that some of the X Factor judge’s pearls of wisdom might get slightly lost in translation in America. … Currently it's Vernon Kay’s broad Bolton burr that is mystifying the Americans – viewers of ABC’s Skating with the Stars have complained that he is difficult to understand. And that despite American viewers having years and years of Daphne’s faux “Manchester” accent in Frasier.

If you don’t know what Cheryl sounds like and would like to know, here she is being interviewed by Piers Morgan.

Will American audiences be fazed by such things as drɔːr ə lɛɪn draw a line, fɛɪnd ðə tɛɪm find the time, kɑːnt koʊp wɪð ɪt (very back ɑː, caricatured as corn't in the cartoon) can’t cope with it, jə skrʊfs ən jər ʊɡz your scruffs and your Uggs? Or by the frequent low-accent-high-level-tail declarative intonation pattern?

Women _ have ¯a hard time of it