Teaching English Intonation to EFL/ESL Students

Mehmet Celik
mcelik [at]
Hacettepe University, Turkey
This article proposes a workable, teachable, generalisable, and communicatively efficient framework for teaching English intonation to non-native speakers. It is proposed that a framework of English intonation should include four major intonational features: intonation units, stress, tones, and pitch range. The study of intonation in English should therefore take a stretch of utterance, the intonation unit, as the basis for describing all kinds of voice movements and features. Every intonation unit has one type of tonic stress: (unmarked) utterance-final tonic stress, emphatic stress, contrastive stress, or new information stress, the last of which is most frequent in responses to wh-questions. Further, intonation units typically have one of four tones: fall, low-rise, high-rise, or fall-rise. Tones are assigned to intonation units according to the type of voice movement on the tonic syllable. Finally, every intonation unit is spoken in one of three pitch levels (keys): high, mid, or low.


Introduction

At a time when language learning is geared to efficient and precise interpersonal communication, intonation could not go unnoticed in the preparation of English teaching syllabuses on the threshold of a new millennium. Yet the question of what to include in the teaching of intonation to learners of English as a second/foreign language (ESL/EFL) has caused uncertainty and a lack of confidence, and consequently intonation has been largely ignored in syllabuses. As Underhill (1994:75) rightly notes, this is because '...we are not in control of a practical, workable and trustworthy system through which we can make intonation comprehensible.'

Although they are a major feature of communication, suprasegmental (prosodic) features of speech have usually been avoided in the design of syllabuses for teaching English, partly because too little importance is attached to teaching them, and partly because no concise, salient, practical, and workable framework has been available (Underhill, 1994:47; Kenworthy, 1987). There have been some attempts, of course, to come up with a practical scheme. However, they usually concentrate on certain areas of intonation rather than embracing the whole phenomenon (Coulthard, 1977; Underhill, 1994; Levis, 1999). Levis (1999), for instance, falls short of providing a coherent scheme that foreign language teachers can use in their syllabuses for improving oral skills; it covers, in passing, intonational features such as significant pitch, pitch levels, intonation patterns, and placement of nuclear stress.

For Cruttenden (1986:35), intonation has three important features: 1) division of a stream of speech into intonation units, 2) selection of a syllable (of a word), which is assigned the 'tonic' status, and 3) selection of a tone for the intonation unit. To this list, another feature can be added: pitch range, or key (Brazil et al., 1980). In the present author's experience of teaching oral skills to prospective teachers of English as a second/foreign language, a syllabus incorporating these four major features of intonation has worked efficiently and proved very useful. It is believed that this system may prove useful for other practitioners in the field of ESL/EFL.

This article explains the four major features in the teaching of English suprasegmentals (intonation units, stress, tone, and pitch range) by reviewing relevant and current research. As such, it provides a framework of English intonation for the teaching of English as a second/foreign language. The framework is based primarily on what is most salient in the more recent scholarly studies of intonation, and secondarily on what is teachable, given the author's own experience of teaching it. Finally, the article points out the need to teach intonational features in meaningful contexts with realistic rather than fabricated language, and the need to regard intonation not as a luxury but as a necessity for efficient interchange in English.

Intonation Units

An 'intonation unit' is a piece of utterance, a continuous stream of sounds, bounded by a fairly perceptible pause. Pausing is in some sense a way of packaging information, such that the lexical items put together in an intonation unit form certain psychological and lexico-grammatical realities. Typical examples would be the inclusion of subordinate clauses and prepositional phrases in intonation units.

It is proposed here that every feature of intonation should be analyzed and discussed against the background of this unit: tonic stress placement and the choice of tones and keys apply to almost all intonation units. Closely related to the notion of pausing is the change of meaning it may bring about; where the pauses fall in a stream of speech can significantly alter the message conveyed. Consider the example below, in which slashes correspond to pauses (Roach, 1983:146) (see Halliday, 1967; Leech & Svartvik, 1975 for more); the meaning is given in brackets.

More examples can be used to illustrate the significance of pausing; further, it can be pointed out that correct pausing may be necessary in order to understand and to be understood well.


Stress

This section addresses the notion of stress in words as perceived in connected speech. In addition, the existence and discovery of tonic stress is discussed, and the major types of stress are explicated. Four major types of stress are identified: unmarked tonic stress, emphatic stress, contrastive stress, and new information stress.

An important prosodic feature, 'stress' applies to individual syllables, and involves, most commonly, loudness, length, and higher pitch (Roach, 1983:73). Each of these features may contribute in differing degrees at different times. Stress is an essential feature of word identity in English (Kenworthy, 1987:18). It is evident that not all syllables of a polysyllabic English word receive the same level of stress; in connected speech, usually only two levels of stress appear to be perceptible, to non-native speakers in particular, regardless of the number of syllables: stressed and unstressed (Ladefoged, 1973; Kenworthy, 1987). The syllable carrying what is known as primary stress is regarded as stressed, while the rest (secondary, tertiary, and weak) are treated as unstressed.

At the clausal level, words that carry higher information content in the utterance are normally given higher stress than those that carry less information and those that are predictable in the context. It is generally the case that one word is stressed more than any other, since it possesses the highest information content for the utterance; that is, it informs the hearer most. The words described above are largely what are called 'content' words, as opposed to 'function' words. Content words are nouns, verbs, adjectives, and adverbs, while function words are articles, prepositions, conjunctions, and modal auxiliaries. Furthermore, it is content words that are polysyllabic, not function words. This classification conforms to grammatical considerations. The classification presented here from a suprasegmental viewpoint, that is, on the basis of being stressed or not, is slightly different from that of grammar. Consider the following:

Content/Stressed Words     Function/Unstressed Words
verbs                      modal auxiliaries
nouns                      articles
adjectives                 conjunctions
adverbs                    prepositions
question words             pronouns
prepositional adverbs

In other words, the items in the left-hand column are stressable in unmarked utterances, whereas those in the right-hand column are not.
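The two-column classification above can be sketched as a small lookup. This is only an illustration, not part of the framework itself: the category labels, the function names, and the hand-tagged sample sentence are assumptions made for the example, and 'prepositional adverbs' is read here as belonging to the stressed column.

```python
# Illustrative sketch: the category labels mirror the two columns above.
# Placing "prepositional adverb" in the stressable set is an assumption
# about how the listing should be read.
STRESSABLE = {
    "verb", "noun", "adjective", "adverb",
    "question word", "prepositional adverb",
}
UNSTRESSABLE = {
    "modal auxiliary", "article", "conjunction", "preposition", "pronoun",
}


def is_stressable(category: str) -> bool:
    """Return True if words of this category are stressable in unmarked utterances."""
    if category in STRESSABLE:
        return True
    if category in UNSTRESSABLE:
        return False
    raise ValueError(f"unknown word category: {category!r}")


def stressable_words(tagged_unit):
    """Pick out the stressable words from a hand-tagged intonation unit."""
    return [word for word, category in tagged_unit if is_stressable(category)]


# Hypothetical hand-tagged intonation unit (the tagging is done by the teacher).
unit = [
    ("She", "pronoun"),
    ("played", "verb"),
    ("the", "article"),
    ("piano", "noun"),
    ("yesterday", "adverb"),
]

print(stressable_words(unit))
```

A lookup of this kind only covers unmarked utterances; emphatic and contrastive contexts can move stress onto words from either column.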

Tonic Stress

An intonation unit almost always has one peak of stress, which is called 'tonic stress' or 'nucleus'. Because stress applies to syllables, the syllable that receives the tonic stress is called the 'tonic syllable'. The term tonic stress is usually preferred for this kind of stress in referring, proclaiming, and reporting utterances. Tonic stress is almost always found in a content word in utterance-final position. Consider the following, in which the tonic syllable is underlined:

A question does arise as to what happens to the syllables that were previously assigned stress. They are still stressed, but not as much as the tonic syllable, producing three levels of stress for utterances. The following is then arrived at, where the tonic syllable is further capitalized:

Emphatic Stress

One reason to move the tonic stress from its utterance-final position is to assign emphasis to a content word, which is usually a modal auxiliary, an intensifier, an adverb, etc. Compare the following examples. The first two examples are adapted from Roach (1983:144).
i. It was very BOring. (unmarked)
ii. It was VEry boring. (emphatic)

i. You mustn't talk so LOUDly. (unmarked)
ii. You MUSTN'T talk so loudly. (emphatic)

Some intensifying adverbs and modifiers (or their derivatives) that are emphatic by nature are (Leech & Svartvik, 1975:135):
indeed, utterly, absolute, terrific, tremendous, awfully, terribly, great, grand, really, definitely, truly, literally, extremely, surely, completely, barely, entirely, very (adverb), very (adjective), quite, too, enough, pretty, far, especially, alone, only, own, -self.

Contrastive Stress

In contrastive contexts, the stress pattern differs from the emphatic and non-emphatic patterns in that any lexical item in an utterance can receive the tonic stress, provided that the item is contrastable in that universe of speech. No distinction exists between content and function words in this respect. The contrasted item receives the tonic stress provided that it contrasts with some lexical element (notion) in the stimulus utterance. Syllables that are normally stressed in the utterance almost always get the same treatment they do in non-emphatic contexts. Consider the following examples:
a) Do you like this one or THAT one?
b) I like THIS one.
Many other larger contrastive contexts (dialogues) can be found or worked out, or even selected from literary works for a study of contrastive stress. Consider the following:
  • SHE played the piano yesterday. (It was her who...)
  • She PLAYED the piano yesterday. (She only played (not harmed) ...)
  • She played the piAno yesterday. (It was the piano that...)
  • She played the piano YESterday. (It was yesterday...)

    New Information Stress

    In a response to a wh-question, the information supplied is, naturally enough, stressed. That is, it is pronounced with more breath force, since it is more prominent against the background of the information given in the question. The concept of new information is much clearer to students of English in responses to wh-questions than in declarative statements. Therefore, it is best to start by teaching the stressing of the new information supplied in answer to questions with a question word:
    a) What's your NAME?
    b) My name's GEORGE.

    a) Where are you FROM?
    b) I'm from WALES.

    a) Where do you LIVE?
    b) I live in BONN.

    a) When does the school term END?
    b) It ends in MAY.

    a) What do you DO?
    b) I'm a STUdent.

    The questions given above, except for the last one, could also be answered in short form. In other words, 'given' information is omitted, not repeated. In the exchange:
    a) What's your name?
    b) (My name's) George.
    The 'new' information in this response is 'George.' The part referring to his name is given in the question, so it may be omitted.

    Regarding the significance of new information in declarative statements, Ladefoged (1982:100) states:

    'In general, new information is more likely to receive a tonic accent than material that has already been mentioned. The topic of a sentence is less likely to receive the tonic accent than the comment that is made on the topic.'
    Furthermore, Bolinger (1968:603) notes that speakers '...depend on stress to highlight the most important and informative idea in the sentence.' (italics in the original). I think that Bolinger's 'the most important and informative idea' coincides with the concept of 'new information'. The stressed lexical item is thus the one that carries the speaker's communicative intent and purpose. The information in the stressed item is the core of the message within the utterance, and therefore the most important element in it. Consider the following example taken from Dickerson (1989:20, cited in Levis, 1999:45):
    a) It sounds like there was some excitement last night.
    b) Didn't you hear? There was a torNAdo in the area.
    In this example, the most prominent information appears to be stored in 'tornado' rather than in the last content word of the utterance, as would be expected according to the guidelines given in the Tonic Stress section above.


    Tones

    A unit of speech bounded by pauses has movement, of music and rhythm, associated with the pitch of voice (Roach, 1983:113). This pattern of voice movement is called 'tone'. A tone is a specific pattern, not an arbitrary one, because it is meaningful in discourse. By means of tones, speakers signal whether they refer, proclaim, agree, disagree, question, or hesitate, or indicate completion and continuation of turn-taking in speech.

    Pointing to extensive variations in the taxonomy of English tones, Cruttenden (1986:58) rightly notes that 'This is an area where almost every analyst varies in his judgement of what constitutes a 'major difference of meaning' and hence in the number of nuclear tones which are set up.' He adds: '...intonational meanings are often so intangible and nebulous ... (that) it is difficult to see how a wholly convincing case for any one set of nuclear tones..' (parenthetical statement is mine). Crystal (1969) and Ladefoged (1982) identify four basic tones (fall, rise-fall, rise, and fall-rise) while O'Connor and Arnold (1973) distinguish only two (rise and fall). Brazil et al. (1980) and Roach (1983) endorse five tones (fall, rise, rise-fall, fall-rise, and level) whereas Cruttenden (1986) recognizes seven tones (high-fall, low-fall, high-rise, low-rise, fall-rise, rise-fall, and mid-level).

    In the author's teaching experience, only four types of tones can be efficiently taught to non-native speakers of English: fall, low-rise, high-rise, and fall-rise.

    What makes a tone a rising or falling or any other type of tone is the direction of the pitch movement on the last stressed (tonic) syllable (Brown, 1977:45). If the tonic syllable is in non-final position, the glide continues over the rest of the syllables. A fall in pitch on the tonic syllable renders the tone as 'fall'. A 'rise' tone is one in which the tonic syllable is the start of an upward glide of pitch. This glide is of two kinds: if the upward movement is higher, then it is 'high rise'; if it is lower, then it is 'low rise'. 'Fall-rise' has first a pitch fall and then a rise.

    Fall (A Falling Tone)

    A falling tone is by far the most commonly used tone of all. It signals a sense of finality, completion, belief in the content of the utterance, and so on. By choosing a falling tone, a speaker also indicates to the addressee that that is all he has to say, and offers the addressee a chance (turn-taking) to comment on, agree or disagree with, or add to his utterance. However, it is up to the addressee to do any of these. This tone in no way solicits a response from the addressee. Nonetheless, it would be polite for the addressee at least to acknowledge in some manner or form that he is part of the discourse. Now, let us see the areas in which a falling tone is used. The following is a proclamation in which a teacher is informing a student of the consequences of his unacceptable behavior.
  • I'll report you to the HEADmaster.
    A falling tone may be used in referring expressions as well.
  • I've spoken with the CLEAner.
    Questions that begin with a question word (wh-questions) are generally pronounced with a falling tone:
  • Where is the PENcil?
    Imperative statements have a falling tone.
    i) Go and see a DOCtor.
    ii) Take a SEAT.
    Requests or orders have a falling tone too.
    i) Please sit DOWN.
    ii) Call him IN.
  • Watch OUT!
    Yes/No questions and tag questions seeking or expecting confirmation can be uttered with a falling tone, and the response to them may be lengthened. Consider the following example:
    a) You like it, DON'T you?
    b) YEES.
    In a Yes/No question structure, if the speaker uses a falling tone, we assume that he already knows the answer, or at least he is sure that he knows, and the purpose of asking the question, as far as the speaker is concerned, is to put the answer on record. In the following exchange, the speaker is sure to get a 'Yes' answer from the addressee:
    a) Have you MET him?
    b) YES.

    Low Rise (A Rising Tone)

    This tone is used in genuine 'Yes/No' questions, where the speaker is sure that he does not know the answer and that the addressee does. Such Yes/No questions are uttered with a rising tone. For instance, consider the following question uttered with a rising tone, the answer to which could be any of the three options:
    A) Isn't he NICE?

    B) i) Yes.
    ii) No.
    iii) I don't know.

    Compare the above example with the following example, which is uttered with a falling tone, and which can only have one appropriate answer in the context:
    a) Isn't he NICE?
    b) YES.
    Other examples which are uttered with a rising tone are:
  • Do you want some COFfee?
  • Do you take CREAM in your coffee?

    High Rise (A Rising Tone)

    If the tonic stress is uttered with extra pitch height, as in the following intonation units, we may think that the speaker is asking for a repetition or clarification, or indicating disbelief.
    a) I'm taking up TAxidermy this autumn.
    b) Taking up WHAT? (clarification)

    a) She passed her DRIving test.
    b) She PASSED? (disbelief)

    Fall Rise (followed by Fall)

    While the three tones explicated so far can be used in independent, single intonation units, the fourth tone, fall-rise, appears to be generally used in what may be called 'dependent' intonation units such as those involving sentential adverbs, subordinate clauses, compound sentences, and so on. Fall-rise signals dependency, continuity, and non-finality (Cruttenden, 1986:102). It generally occurs in sentence non-final intonation units. Consider the following, in which the first intonation unit of each example is uttered with a fall-rise tone (the slash indicates a pause):
  • Private enterPRISE / is always EFficient.
  • A quick tour of the CIty / would be NICE.
  • PreSUmably / he thinks he CAN.
  • Usually / he comes on SUNday.
    One of the most frequent complex clause types in English is one that has a dependent (adverbial or subordinate) clause followed by an independent (main) clause. When such a clause has two intonation units, the first, non-final, unit normally has a fall-rise while the second, final, unit has a falling tone. Therefore, non-final intonation units can be said to have a 'dependency' tone, which is fall-rise. (The explication of tone patterns as well as some of the examples in this section are largely based on Cruttenden, 1986.) Consider the following:
  • When I passed my REAding test / I was VEry happy.
  • If you SEE him / give my MESsage.
    When the order of the clauses is reversed, we may still observe the pattern fall-rise and fall respectively, as in
  • I WON'T deliver the goods / unless I receive the PAYment.
  • The moon revolves around the EARTH / as we ALL know.
  • Private enterprise is always EFficient / whereas public ownership means INefficiency.
    All in all, final intonation units have a falling tone while non-final ones have fall-rise. Consider further complex clauses:
  • He joined the ARmy / and spent all his time in ALdershot.
  • My sister who is a NURSE / has ONE child.
    This completes the four major tones selected for the framework. As is the case in this section, some of these tones can be used in combination when a syntactic unit (sentence) has more than one intonation unit. This section has reviewed the (fall-rise + fall) and (fall + fall-rise) patterns. In the following two sections, two patterns, namely (fall-rise + low rise) and (fall + fall), are examined respectively.

    Fall-rise + Low Rise

    Typically this tone pattern involves a dependent clause followed by a Yes/No question.
  • If I HELPED you / would you try aGAIN?
  • Despite its DRAWbacks / do you favor it or NOT?

    Fall + Fall

    A fall tone can be followed by another fall tone when the speaker expects or demands agreement as in tag questions.
  • It's a bit TOO good to be true / ISN'T it?
    Reinforcing adverbials can also have a fall when placed utterance-finally as an expression of afterthought.
  • Ann said she'd help as much as she COULD / NAturally.
    If the two actions are part of a sequence of related events, the utterance has a (fall + fall) tone pattern, as in the following, in which the information in the first intonation unit and that in the second are not dependent on each other:
  • She's 28 years OLD / and lives in GIPPSland.
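The two-unit tone combinations discussed in the sections above can be collected in a small lookup table. The following is a minimal sketch, assuming hypothetical labels for the kind of each intonation unit ('dependent', 'statement', and so on); the labels and function name are illustrative only, and the classification of each unit is still left to the teacher or learner.

```python
# Illustrative lookup of tone pairs for sentences with two intonation units.
# The unit-kind labels are hypothetical names chosen for this sketch.
# "dependent" stands for any non-final unit that signals continuity,
# including the first unit of a reversed complex clause.
TONE_PATTERNS = {
    ("dependent", "statement"): ("fall-rise", "fall"),
    ("dependent", "yes/no question"): ("fall-rise", "low rise"),
    ("statement", "tag question"): ("fall", "fall"),
    ("statement", "afterthought adverbial"): ("fall", "fall"),
    ("statement", "independent statement"): ("fall", "fall"),
}


def annotate_pair(first_text, first_kind, second_text, second_kind):
    """Label each of two intonation units with the tone from the table."""
    first_tone, second_tone = TONE_PATTERNS[(first_kind, second_kind)]
    return f"{first_text} [{first_tone}] / {second_text} [{second_tone}]"


print(annotate_pair("If I HELPED you", "dependent",
                    "would you try aGAIN?", "yes/no question"))
print(annotate_pair("It's a bit TOO good to be true", "statement",
                    "ISN'T it?", "tag question"))
```

A table of this kind may serve as a classroom summary; it does not capture cases where context overrides the default pattern.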

    Pitch and Pitch Range (Key)

    Pitch is one of the acoustic correlates of stress (Underhill 1994:57). From a physiological point of view, '...pitch is primarily dependent on the rate of vibration of vocal cords...' (Cruttenden, 1986:3). When the vocal cords are stretched, the pitch of voice increases. Pitch variations in speech are realized by altering the tension of the vocal cords (Ladefoged, 1982:226). The rate of vibration of the vocal cords is increased by more air pressure from the lungs. In an overwhelming majority of stressed syllables, a higher pitch is observed. Therefore, loudness to a certain extent contributes to the make-up of pitch. That is, higher pitch is heard as louder than lower pitch. Further, syllable length tends to contribute to the perception of the utterance-final tonic stress more than pitch because of the natural decline of speech force as it comes to conclusion, contrary to acoustic facts (Levis, 1999:42).

    The term 'key' can be described as utterance pitch: specific and/or meaningful sequences of pitches in an intonation unit. Keys that are linguistically meaningful and significant are worth including in a syllabus. For a key to be significant, 1) it should be under the speaker's control, 2) it should be perceptible to ordinary speakers, and 3) it should represent a contrast (Roach, 1983:113). Usually, three keys are identified: high, mid, and low (Coulthard, 1977; Brazil et al., 1980).

    For each intonation unit, the speaker must choose one of the three keys as required by the conversation. Most of a speaker's speech takes place in the mid (unmarked) key, employed in normal and unemotional speech. In contrast, high and low keys are marked: high key is used for emotionally charged intonation units, while low key indicates equivalence (as in appositive expressions) or a relatively less significant contribution to the speech. The relationship between pitch and key is a comparative one in that syllabic pitch is always higher than the utterance pitch; in some sense, syllabic pitch is one step ahead of the utterance pitch.

    High Key


    Exclamation

    Exclamation is usually the cover term used to refer to actions described by verbs such as cry, scream, shout, wail, shriek, roar, yell, whoop, bellow, bark, thunder, howl, echo, and so on. Speakers do these to express strong feelings such as excitement, surprise, anger, irritation, rage, fury, wrath, fume, agitation, cheer, merriment, gaiety, fun, etc. Speakers generally exploit high pitch when they exclaim.

    The extract "'Have you guessed?' he whispered at last. 'Oh God!' burst in a terrible wail from her breast." can be schematized as

    high She:                       oh GOD
    low  He: / have you GUESSED? /


    Contrastivity

    Another function of high pitch is to indicate contrastivity. Brazil et al. (1980:26) note the following: 'It is proposed as a general truth that the choice of high key presents the matter of the tone unit as if in the context of an existentially-valid opposition.' Consider the following adapted example, in which the word uttered with a high key has contrastive stress (Brazil et al., 1980:26):
    high                                         BOGnor /
    mid / we're going to MARgate this year / not
    In addition to the high key for Bognor, either a referring (fall-rise) or a proclaiming (fall) tone should be selected. Use of high key with a referring tone indicates that the contrast was established prior to this utterance, whereas a proclaiming tone reports what the two options are as part of the news. The following example, adapted from Pennington (1996:132), also illustrates the use of high key for contrast:
    high                              YALE /
    mid / I'm going to HARvard / not


    Echo/Repeat

    The act of echoing/repeating is almost always done with high pitch. It may involve a genuine attempt to recover unrecognized or unheard information, or serve to indicate disbelief, disappointment, and so on. The tone to be used in such intonation units is high-rise. Consider the following exchange, where a case of disbelief is in question:
    a) 'Four thousand,' said Barney sadly.
    b) 'Four thousand?' But it's just a shack!
    high B:                  four THOUsand
    mid                                   / but it's just a SHACK /
    low A: / four THOUsand /
    In the following example, a repetition and/or clarification is sought:
    a) I'm taking up taxidermy.
    b) Taking up what?
    high B:                             taking up WHAT /
    mid A: / I'm taking up TAxidermy /

    Low Key

    Co-reference, Appositives

    Lower pitch is used to indicate co-referential, additional or supplementary information. Consider the following example, in which the word dummy in low key is co-referential with you in mid key (Pennington, 1996:152):
    mid / I TOLD you already /
    low                        DUMmy /

    Non-defining Relative Clauses

    The type of information uttered in low pitch may be non-defining relative clauses, parenthetical statements, expressions of dis/agreement, reduced clauses, etc. Consider the following:
    mid / my DOCtor /                     / is very WELL-known /
    low               who's a neuROlogist

    Statements of Opinion

    There are times when short statements of opinion, involving clarification, certainty/uncertainty, are attached to propositional statements. Look at the examples below:
    mid / the GOvernment /          / will agree with our deMANDS /
    low                    I THINK
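The key-level layouts used in the schematic examples above (one row per key, with each intonation unit placed further right than the one before it) can be produced mechanically. The sketch below is one possible way to do so for handouts; the function name and the input format, a list of (key, text) pairs in spoken order, are assumptions made for illustration.

```python
# Illustrative sketch: lay out intonation units on high / mid / low rows,
# in the style of the schematic examples above.
KEYS = ("high", "mid", "low")


def render_keys(units):
    """Return a multi-line layout with one row per key that is actually used.

    `units` is a list of (key, text) pairs in the order they are spoken.
    """
    rows = {key: "" for key in KEYS}
    offset = 0  # column at which the next unit starts
    for key, text in units:
        if key not in rows:
            raise ValueError(f"unknown key: {key!r}")
        # Pad the row out to the current column, then append the unit.
        rows[key] = rows[key].ljust(offset) + text
        offset += len(text) + 1
    return "\n".join(f"{key:<4} {rows[key]}" for key in KEYS if rows[key])


units = [
    ("mid", "/ my DOCtor /"),
    ("low", "who's a neuROlogist"),
    ("mid", "/ is very WELL-known /"),
]
print(render_keys(units))
```

Rows for keys that are not used in the utterance are simply left out, so a sentence spoken entirely in mid key prints as a single line.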


    Conclusion

    This study has argued for the inclusion of intonational features of English in syllabuses designed for the teaching of English as a second/foreign language, and has provided a practical framework of English intonation based on the present author's experience. Intonation, the non-grammatical, non-lexical component of communication, is an inseparable component of utterances. Speech without intonational features is no more than machine output. Intonation is a paralinguistic device in vocal communication. It reveals many facets of the communication process, taking into consideration all factors present in the discourse context. Therefore, it is an indispensable part of speech. Tones are important discourse strategies for communicating effectively; simply put, it is not what you say, it is how you say it. Proficiency in intonation is therefore a requirement for non-native learners of English who seek better communicative discourse with native or non-native speakers of English.


