Beginners tend to look at English phonetics as something static. Each word has its pronunciation. To pronounce a sentence you just string the individual words together, each with its own pronunciation.
But as you move on from the beginner’s stage you realize that this is not enough. Phonetics is dynamic. Connected speech is more than isolated words strung together. To start with, we obviously have to generate an appropriate intonation pattern for the sentence. To do this we have to make decisions not only about tone — arguably, this is one of the least important things we have to decide — but also about which of the lexically stressed syllables to accent. Within each intonation phrase we must choose a syllable to bear the nuclear accent, and possibly other syllables to bear other accents.
But segmental structure, too, is dynamic. We have to pay attention to the phonetic processes of connected speech: form-word weakening, linking, assimilation, elision, syllabic consonant formation, compression. Each of these processes is constrained by the phonetic context, as well as by considerations such as speech rate and degree of formality.
I heard a BBC newsreader recently referring to ˈfiːfrəˈfɪʃl̩z, i.e. FIFA officials. This short phrase is a good example of some of these processes in action.
The citation form or dictionary pronunciation, which mild generativists might call the underlying representation, is ˈfiːfə əˈfɪʃl̩z.
In non-rhotic varieties such as RP this phrase is a prime candidate for r insertion. We have a word ending in schwa closely followed by a word beginning with a vowel sound, which is a phonetic environment in which r insertion is likely. So we get ˈfiːfər əˈfɪʃl̩z.
Any instance of a syllabic consonant that is immediately followed by a weak vowel (as for example the syllabic r that we have here, once ər has become r̩ through syllabic consonant formation) is a candidate for compression, whereby it loses its syllabicity. So we get ˈfiːfr əˈfɪʃl̩z.
FIFA ends up as a monosyllable, with a final cluster not allowed in a word in isolation.
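The derivation above can be sketched as a short sequence of ordered rewrite rules over a transcription string. This is only an illustrative sketch, not a serious model of connected speech: the function names, the vowel inventory and the regular expressions are all assumptions made for the example, and the intermediate step of syllabic consonant formation (ər becoming syllabic r) is inferred from the mention of "the syllabic r" rather than spelt out in the text. The rules are also unconditional here, whereas in real speech each one is optional and depends on rate and formality.

```python
import re

SYLLABIC = "\u0329"             # combining mark under a syllabic consonant
VOWELS = "aeiouəɪʊʌɒæɑɔɜ"       # assumed, simplified vowel inventory

def r_insertion(s):
    # Word-final schwa + word-initial vowel: insert r at the boundary.
    return re.sub(rf"ə (?=ˈ?[{VOWELS}])", "ər ", s)

def syllabic_consonant_formation(s):
    # Word-final ər coalesces into syllabic r (assumed intermediate step).
    return re.sub(r"ər(?= )", "r" + SYLLABIC, s)

def compression(s):
    # A syllabic consonant directly before a weak vowel loses its
    # syllabicity and attaches to the following syllable.
    return re.sub(rf"([rlnm]){SYLLABIC} ?(?=ə)", r"\1", s)

def derive(citation):
    # Apply the rules in order, keeping every intermediate stage.
    stages = [citation]
    for rule in (r_insertion, syllabic_consonant_formation, compression):
        stages.append(rule(stages[-1]))
    return stages

CITATION = "ˈfiːfə əˈfɪʃl" + SYLLABIC + "z"   # FIFA officials
stages = derive(CITATION)
for form in stages:
    print(form)
```

The last stage printed is the run-together form heard from the newsreader. Note that the syllabic l in officials is left alone, because it is followed by z rather than by a weak vowel.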