Wednesday, 9 June 2010

opera and beatbox

Here is an interesting MRI video of the organs of speech at work, produced by scientists at the Magnetic Resonance Engineering Laboratory of the University of Southern California.

It shows two performers in action.

One is an opera singer doing her thing (O mio babbino caro) and then demonstrating different vowel sounds at different pitches. We see how the vowel space appears to shrink with rising pitch.

The other is a beatbox emcee making what the commentary calls percussive sounds. More accurately, these seem to be stops made mainly with various non-pulmonic airstream mechanisms. It is difficult to identify them with certainty, but most seem to be clicks and reverse-clicks, i.e. made with an ingressive or egressive velaric airstream; some are probably implosives or ejectives, i.e. made with an ingressive or egressive glottalic airstream.

Listen and repeat after me. (The account given here really doesn’t help much.)


  1. As I read it, ̂k and k are not intended to represent different sounds. The former is an instruction: how to make more or less the same sound as the latter when the musical line hasn't left you enough breath to do it egressively.

  2. Extremely interesting videos!

  3. They don't mention the frame rate (images per second). A low frame rate is the current drawback of using MRI to study articulatory movement. The same was true of the first X-ray motion films, which only managed 16 or 48 frames per second (say three or ten images per syllable respectively), so brief events could easily be missed. You need at least 75 to 100 images per second for acceptable temporal resolution.

    If you watched the video, you will have seen comments below, including one from Dr Christine Eriksdotter at Stockholm University with links to her lab web pages.

    In case the penny didn't drop for everyone about sopranos singing vowels at high pitches, their difficulty has to do with the voice fundamental frequency being above the first vocal tract resonance (or formant, F1). The first formant for a woman usually ranges between about 400 Hz and 1000 Hz, from [i] or [u] to [a]. Singing [i] on high A (a fundamental of nearly 900 Hz, then harmonics around 1800, 2700, 3600 Hz, etc.) would effectively filter the voice out: her fundamental would land between F1 and F2, where it would be extinguished, leaving her with a sore throat before she even got started. The trick is to change the vowel according to the note to be sung, so that the voice fundamental is planted squarely in F1. So all you see and hear on the video is one long [a] at the top of her voice range.
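
The arithmetic in that last comment can be checked with a rough sketch. It assumes high A is exactly 880 Hz (an octave above A at 440 Hz) and takes the comment's approximate F1 figures of 400 Hz for [i] and 1000 Hz for [a] at face value; the function names are made up for illustration, and real formant values vary from singer to singer.

```python
# Rough check of the harmonic arithmetic, under the assumptions stated above.

A4_HZ = 440.0          # standard concert pitch (assumption)
HIGH_A_HZ = 2 * A4_HZ  # one octave up: 880 Hz, the "nearly 900 Hz" of the comment

# Approximate first-formant values quoted in the comment (illustrative only).
F1_CLOSE_VOWEL_HZ = 400.0   # [i] or [u]
F1_OPEN_VOWEL_HZ = 1000.0   # [a]


def harmonics(f0_hz, count):
    """Return the first `count` harmonics of f0_hz, fundamental included."""
    return [f0_hz * n for n in range(1, count + 1)]


def fundamental_above_f1(f0_hz, f1_hz):
    """True when the sung fundamental lies above the vowel's first formant."""
    return f0_hz > f1_hz


print(harmonics(HIGH_A_HZ, 4))
# [880.0, 1760.0, 2640.0, 3520.0]

print(fundamental_above_f1(HIGH_A_HZ, F1_CLOSE_VOWEL_HZ))  # True: [i] loses F1 support
print(fundamental_above_f1(HIGH_A_HZ, F1_OPEN_VOWEL_HZ))   # False: [a] keeps F1 at or above the note
```

With these figures the fundamental sits well above the F1 of [i] but below the F1 of [a], which fits the comment's point that the vowel has to open toward [a] as the pitch rises.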