At the Łódź conference John Coleman presented an interesting talk about the spoken component of the British National Corpus. It comprises about ten percent of the entire corpus.
It includes a wide range of authentic spoken material, recorded in 1991-92 by volunteers wearing Walkman devices recording all their conversational interactions over a 24-hour period. As well as all kinds of structured and unstructured talk directed at other people, from sermons to discussions of boyfriends, the files include dog-directed and parrot-directed speech. Who’s a pretty boy, then?
The material has now been digitized by the British Library from the original analogue recordings.
Although comprising only ten percent of the whole corpus, the audio material of the BNC extends to 9 TB (nine terabytes), about 1800 hours’ worth. So you won’t be downloading it all and storing it on your hard disc any time soon.
Although the whole spoken corpus is unmanageably large, a selection of audio files from the BNC is now available online.
The ten most frequently used words in the spoken corpus, Coleman says, occur more than 58,000 times each. At the other extreme, 23% of the words used (12,400 words) occur only once. Many other words that are surely in people’s vocabulary never occur at all.
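A rough inference from those figures: if 12,400 once-only words make up 23% of the words used, the spoken corpus contains about 12,400 / 0.23 ≈ 53,900 distinct word forms (approximate, since the 23% is presumably rounded). The kind of count behind such statistics can be sketched as follows. This is only an illustration on a made-up toy sentence, not BNC data, and the function name `frequency_profile` is hypothetical.

```python
from collections import Counter

def frequency_profile(tokens):
    """Count word types and pick out those that occur exactly once."""
    counts = Counter(tokens)
    once_only = sorted(w for w, c in counts.items() if c == 1)
    return {
        "types": len(counts),                       # distinct words
        "once_only": once_only,                     # words seen exactly once
        "once_only_share": len(once_only) / len(counts),
        "most_frequent": counts.most_common(1)[0],  # (word, count)
    }

# Toy data, invented for illustration only.
toy_tokens = "the cat saw the dog and the parrot saw the cat".split()
profile = frequency_profile(toy_tokens)
print(profile)
```

In this toy sample half of the word types occur only once. Real corpora typically show the same long-tailed shape, which is why many perfectly ordinary words never turn up at all.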
Coleman presented some observations about assimilation of place of articulation. As well as the familiar dealveolar type (ˈtem ˈmɪnɪts, ˈɡʊɡ ˈɡɜːl), he found various instances of “nonstandard place assimilation of word-final /m/ and /ŋ/”. Delabial examples included siːn in seem to and seɪŋ in same kind of. As well as plenty of cases of aɪŋ(ɡ)ənə etc for I’m going to, he reports “18 tokens per 10 million” of əˈlɑːŋ klɒk for alarm clock. The most frequent item classified as develar was swimming pool pronounced as ˈswɪmɪm puːl — but there of course the underlying form of the -ing ending would be ɪn rather than ɪŋ for some speakers in some styles of speech (as the sociolinguists have documented), so that the assimilation could be dealveolar after all, not develar. The same applies to ˈwedɪm in wedding present.
We await further reports with interest.
I've finally discovered my copy of Gillian Brown's Listening to Spoken English. She gives a number of similar examples from data collected before 1976. For example:
/əˈmaʊntbaɪ/ [əˈmaʊmʔpbaɪ] amount by
/ˈbændfəˈlaɪf/ [ˈbæmbfəˈlaɪf] banned for life
But those are "standard" (dealveolar) cases, David.
John
Ah, I see.
But one of her examples is [aɪŋˈgɜŋ] I'm going.
I see my link didn't work, so I'll try again. There's a description at this Amazon page:
Listening to Spoken English.
Among these kinds of known diachronic supplements for linguistic treatments, I do not know what else she has as far as meaningful linguistic analyses are concerned. But we know that English vowels, unlike those of other languages, are arguably not marked for stress but for syllables. So an unstressed syllable in initial position is also technically part of the syllable unless the unstressed one is syllabic. So why do we mark a primary stress after the unstressed vowel if it actually belongs to a syllable, even if the schwa is more or less of varying quality?
If 1800 hours of audio occupies 9 TB, its mean bitrate is 12216.8 kbps, nearly 9 times that used on CDs (1411.2 kbps). This seems needlessly high for speech transferred from recordings in other media.
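The arithmetic behind that figure can be checked with a short sketch. It assumes, as the 12216.8 kbps result implies, that "9 TB" is read as binary terabytes (9 × 2^40 bytes); with decimal terabytes (9 × 10^12 bytes) the mean bitrate comes out somewhat lower. The function name is hypothetical.

```python
def mean_bitrate_kbps(size_bytes, hours):
    """Mean bitrate in kilobits per second (1 kbps = 1000 bits/s)."""
    seconds = hours * 3600
    return size_bytes * 8 / seconds / 1000

TIB = 2 ** 40   # binary terabyte (assumed reading of "TB" above)
TB = 10 ** 12   # decimal terabyte, for comparison

binary_reading = mean_bitrate_kbps(9 * TIB, 1800)
decimal_reading = mean_bitrate_kbps(9 * TB, 1800)

# CD audio: 44.1 kHz sampling, 16 bits per sample, 2 channels.
cd_kbps = 44100 * 16 * 2 / 1000

print(round(binary_reading, 1))            # 12216.8
print(round(decimal_reading, 1))           # 11111.1
print(round(binary_reading / cd_kbps, 2))  # 8.66
```

Either way the result is roughly eight to nine times the CD rate.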
Our copy is way less than 9 TB, but the British Library Sound Archive uses a standard for library/archive sound recordings - 24 bit, 96 kHz - that is indeed a needlessly high bitrate for digitized compact cassette recordings. But standards is standards ...
For the BNC audio sampler at http://www.phon.ox.ac.uk/SpokenBNC, we use 16 bit, mono files at 16 kHz.
John Coleman
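For comparison, the uncompressed (PCM) bitrates of the two formats just mentioned can be worked out in the same way. This is a sketch under stated assumptions: the channel count of the 24-bit, 96 kHz archive files is not given above, so mono is shown as a lower bound, and the 1800-hour sizes are purely hypothetical, since nothing here says the whole collection is stored in either format. The function names are illustrative.

```python
def pcm_bitrate_kbps(bit_depth, sample_rate_hz, channels=1):
    """Uncompressed PCM bitrate in kbps (1 kbps = 1000 bits/s)."""
    return bit_depth * sample_rate_hz * channels / 1000

def size_gb(bitrate_kbps, hours):
    """Storage in decimal gigabytes for audio at a constant bitrate."""
    total_bits = bitrate_kbps * 1000 * hours * 3600
    return total_bits / 8 / 1e9

archive_kbps = pcm_bitrate_kbps(24, 96_000)   # channel count assumed: mono
sampler_kbps = pcm_bitrate_kbps(16, 16_000)   # 16 bit, mono, 16 kHz

print(archive_kbps, sampler_kbps)             # 2304.0 256.0

# Hypothetical: what 1800 hours would occupy in each format.
print(round(size_gb(archive_kbps, 1800), 2))  # 1866.24
print(round(size_gb(sampler_kbps, 1800), 2))  # 207.36
```

On these assumptions the sampler format needs one ninth of the storage of mono 24-bit, 96 kHz audio.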
My post should actually have appeared like this, but that's fine:
Among these kinds of known diachronic supplements for linguistic treatments, I do not know what else she has as far as meaningful linguistic analyses are concerned. But we know that English vowels, unlike those of other languages, are arguably not marked for stress but a syllable (or two) is. So an unstressed vowel in initial position is also technically part of its adjacent syllable as a +body or -mora (or whatever pleases) unless the unstressed one is syllabic. So why do we mark a primary stress after the unstressed vowel if it actually belongs to a syllable, even if the schwa is more or less of varying quality?