Filled Pause
Research Center

Investigating 'um' and 'uh' and other hesitation phenomena

Filled pauses in second language acquisition and pedagogy

Naturally, the study of filled pauses doesn't end with the study of native language. In fact, because learners of a second language must pass through interlanguage phases with varying degrees of fluency, the use of hesitation phenomena like filled pauses becomes a necessity and is a potentially richer area of study than in native language. This spotlight list highlights some work that has paid close attention to the use of filled pauses by learners.

As noted in another spotlight list, the study of filled pauses began in the late 1950s. Not long afterward, second language teachers recognized the need to consider these phenomena in language pedagogy. The earliest call for such attention that I can find comes from Richard Leeson in “The Exploitation of Pauses and Hesitation Phenomena in Second Language Teaching: Some possible lines of exploration”[1] in 1970. About ten years later, Adolf Hieke, in “Aspects of native and non-native fluency skills”[2], laid out a view of second language fluency that gave special emphasis to the use of hesitation phenomena and filled pauses, while Joan Fayer and Emily Krasinski looked at the effect of disfluencies in intercultural communication in “Native and Nonnative Judgments of Intelligibility and Irritation”[3]. More recent thoughts about filled pauses in second language teaching come from Caroline Rieger's “Disfluencies and hesitation strategies in oral L2 tests”[4] as well as Ralph Rose's (yes, that's me) “Filled Pauses in Language Teaching: Why and How”[5].

Over the decades since, numerous researchers have looked at filled pauses as features of second language speech and as components of second language fluency, making interesting observations in a wide variety of areas. Tracey Derwing and colleagues' “The Relationship between L1 Fluency and L2 Fluency Development”[6] clearly establishes that there is some overlap between first and second language speech fluency behaviors, an important theme that continues to be investigated in recent works such as Ralph Rose's (yes, me again) “A Comparison of Form and Temporal Characteristics of Filled Pauses in L1 Japanese and L2 English”[7] and Justin Lo's “Between Äh(m) and Euh(m): The Distribution and Realization of Filled Pauses in the Speech of German-French Simultaneous Bilinguals”[8].

Other interesting work on second language filled pauses includes Michiko Watanabe and colleagues' “Filled pauses as cues to the complexity of upcoming phrases for native and non-native listeners”[9], an often-cited work in second language studies of filled pauses. Sandra Götz's book, “Fluency in Native and Nonnative English Speech”[10], is about more than just filled pauses, but covers them in great detail with interesting proposals for conceptualizing their use in second language speech behavior. Hans Rutger Bosker and colleagues' “Native 'um's elicit prediction of low-frequency referents, but non-native 'um's do not”[11] is a neat study showing an important difference in the perception of filled pauses between first and second language speech, and Lorenzo García-Amaya's “A longitudinal study of filled pauses and silent pauses in second language speech”[12] shows how second language learners' use of pauses develops over time.

Of course, there are many other great works that could be listed here, but in the interest of keeping the list short, I'll stop here. Researchers (and others) who are interested in the use of filled pauses by second language learners would do well to look at these works and, perhaps even more importantly, the works that each cites.


  • Richard Leeson, “The Exploitation of Pauses and Hesitation Phenomena in Second Language Teaching: Some possible lines of exploration,” Audiovisual Language Journal, vol. 8, no. 1, 1970, pp. 19-22.

    Abstract Identifies three types of pausing and discusses their relevance to language teaching.

  • Adolf E. Hieke, “Aspects of native and non-native fluency skills,” Master's Thesis, University of Kansas, March 1980, 274 pp.

    Abstract One measure of language competency is fluency, whether it be in the speaker’s first or second language, yet the concept of fluency has received little attention in linguistics so far. Research on second language acquisition has also neglected this complex issue although the attainment of fluency is the ultimate objective of most foreign language programs. In an article on testing oral fluency published no more than five years ago, only the following, rather general definition could be offered: Fluency: tentatively defined as the ability to give proof of sustained oral production implying a certain communicative competence, as well as the unstilted, spontaneous use of English "conversanal [sic] lubricants" (Beardsmore 1974:323). Investigations by psychologists and others have paved the way for an exploration of the potentials of a linguistic inquiry into fluency. There is a growing body of literature on the general subject as well as fairly sophisticated equipment to study components of fluency and hesitation phenomena. With the study of discourse becoming an area of growing interest to linguists, a focus on fluency promises to shed additional light on the nature of language and may offer valuable insights into language learning. Fluency must not be understood as a unified concept, as the only book-length treatment on teaching fluency (Leeson 1975) makes clear, because the phenomenon is highly complex and the issues attending it are multi-faceted. Therefore any individual study such as the present one must be selective in its focus and can make only a partial and modest contribution. In this case the concentration is on just one mode of speech, paraphrase, through which the more readily identifiable aspects of fluency lend themselves to investigation: rate of speech and a number of so-called hesitation phenomena, namely silent and filled pauses, repeats, false starts, and parenthetical remarks.
All statements and figures relating to fluency and to hesitation phenomena throughout this study therefore pertain to spontaneous speech under the conditions of the paraphrase mode only. Although paraphrase is not spontaneous speech in the strictest sense, it preserves the essential characteristics of spontaneous speech and offers several advantages for the researcher concerned with fluency. It permits some control over content so that, for instance, deviations from the known story would become immediately apparent in the re-telling, which reveals something about the strategies in speech planning. More importantly, since this mode has been used in experimentation before, its adoption makes comparisons with other research findings possible. For teachers who wish to test oral fluency, the experimental results offered in the following can provide some normative data against which to gauge their own test results. Since the paraphrase skill is commonly used in oral fluency assessment, the experiments here have been set up in much the same way they would be found in a teaching program. That the results from the present study may be put to immediate practical use conforms exactly to the overall purpose here, which is above all to present research of practical value. Even the experimental design adopted here derives from the attempt to approximate as closely as possible an ordinary, realistic teaching situation. Consequently, all experiments were conducted as part of a normal, on-going teaching program and testing process in English as a Foreign Language (EFL) classes at the University of Tuebingen. Based on a number of experiments conducted between 1977 and 1979, the present study pursues three goals. 
First, it attempts to establish a range of baseline data for native English and German spontaneous speech; such data then serve to evaluate nonnative fluency skills in English, both before and after instruction of a type involving a new teaching technique; finally, it offers a reclassification of all the hesitation phenomena along criteria different from the traditional ones, which in turn leads to a re-analysis of the raw data. Chapter One provides a review of studies on crosslinguistic speech rate and hesitation phenomena. Chapter Two reports on experimentally derived baseline data on native speech rates and hesitation phenomena in both English and German, respectively, which amounts to a thorough investigation of the time continuum in spontaneous speech: rate; mean length and frequency of silent pauses; rate of filled pauses, repeats, false starts, and parenthetical remarks; articulation rate and length of runs. In this manner the time components can be accounted for in terms of silence, speech, and hesitation phenomena and the percentage of time each takes up. Chapter Three presents the rationale and function of Audio-Lectal Practice (ALP), a new teaching technique designed to facilitate fluency acquisition, in some detail. This sets the stage for Chapter Four which shows the effect of controlled, imitative practice in continuous speech on fluency skills, based on pre-tests and post-tests after a twelve-week exposure to ALP. An interpretation of the data provides information on fluency skills in both native and non-native speakers (in this case German university students learning English) as well as comparative values between these groups. With that the purposes of the study are fulfilled as far as normative and comparative measures are concerned, but the process of analyzing the wealth of data available (all in all, 78 speech samples of one minute each) forcefully suggested a reclassification of hesitation phenomena. 
Chapter Five thus turns from numerical findings to matters of classification. In addition to the traditional classification system, which handles the data in sequential fashion along a time axis, a point of view is possible which captures the data non-redundantly and with less need for controversial decisions in classification. The focus here shifts to criteria of acceptability; for that purpose a set of conversational postulates governing oral speech are introduced. Seen from this angle, all the hesitation phenomena (except parenthetical remarks which are not affected) are divided into just two major classes. These are labelled ‘stalls’ and ‘repair’. It is shown that the data support such an analysis, but this also makes it necessary to re-analyze the raw data accordingly; this is accomplished in Chapter Six, along with the presentation of the derivative set of data. Chapter Seven, finally, provides a summary of findings and interpretations and, in addition, some conclusions which may be drawn from the study for linguistics and language teaching.

  • Joan Fayer and Emily Krasinski, “Native and Nonnative Judgments of Intelligibility and Irritation,” Language Learning, vol. 37, no. 3, 1987, pp. 313-326. DOI: 10.1111/j.1467-1770.1987.tb00573.x.

    Abstract This study compares the reactions of native English speakers and native Spanish speakers who listened to tapes of Puerto Rican learners of English of various levels of proficiency. The listeners completed a questionnaire that examines the following variables: intelligibility, grammar, pronunciation, intonation, wrong words, voice, hesitations, distraction and annoyance. It was found that the English and Spanish listeners differed principally in how they rated the linguistic form of the speakers and in the annoyance reported. The Spanish listeners rated the linguistic form much lower than did the English listeners and also reported more annoyance. This indicates that the Spanish listeners were less tolerant toward nonnative speech than were the English listeners. In addition, pronunciation and hesitations were reported by both groups of listeners to be, overall, the features most distracting from the message.

  • Caroline L. Rieger, “Disfluencies and hesitation strategies in oral L2 tests,” in Disfluency in Spontaneous Speech (DiSS ’03) (Gothenburg Papers in Theoretical Linguistics), vol. 90, Göteborg, Sweden, September 2003, pp. 41-44.

    Abstract This paper presents an investigation of hesitation strategies of intermediate learners of German as a second or foreign language (L2) when they take part in oral L2 tests. Previous studies of L2 hesitation strategies have focused on beginning and advanced L2 learners. They found that beginners tend to leave their hesitation pauses unfilled making their speech highly disfluent [17], while advanced L2 speakers - similar to native speakers - use a variety of fillers. In oral L2 tests, intermediate learners hesitate mainly for two reasons: to search for a German word or structure, or to think about the content of their utterance. Some participants use a variety of strategies to signal to the addressee that they are hesitating. This variety is not as rich as it is for advanced L2 learners or native speakers. Other participants leave their hesitation pauses unfilled or rely on quasi-lexical fillers to hold the floor when hesitating.

    Keywords DiSS

  • Ralph L. Rose, “Filled Pauses in Language Teaching: Why and How,” Bulletin of Gunma Prefectural Women’s University, vol. 29, 2008, pp. 47-64.

    Abstract Filled Pauses (uh, um) are ubiquitous elements of spontaneous speech but have received relatively little attention in second language teaching. Perhaps this is because filled pauses have often been regarded as meaningless elements resulting from speech processing difficulties. This paper draws from research in widely disparate fields to show that speakers and listeners use them systematically and meaningfully. These facts are used to generate a unified and coherent model of filled pauses in spontaneous speech. This model is then used to develop a concept of communicative competence in which filled pauses play a role at the interface between pragmatic constraints and communication strategies. The article concludes with practical recommendations for how filled pauses may be incorporated into the second-language teaching curriculum.

  • Tracey M. Derwing, Murray J. Munro, Ron I. Thomson, and Marian J. Rossiter, “The Relationship between L1 Fluency and L2 Fluency Development,” Studies in Second Language Acquisition, vol. 31, no. 4, December 2009, pp. 533-557. DOI: 10.1017/S0272263109990015.

    Abstract A fundamental question in the study of second language (L2) fluency is the extent to which temporal characteristics of speakers’ first language (L1) productions predict the same characteristics in the L2. A close relationship between a speaker’s L1 and L2 temporal characteristics would suggest that fluency is governed by an underlying trait. This longitudinal investigation compared L1 and L2 English fluency at three times over 2 years in Russian- and Ukrainian- (which we will refer to here as Slavic) and Mandarin-speaking adult immigrants to Canada. Fluency ratings of narratives by trained judges indicated a relationship between the L1 and the L2 in the initial stages of L2 exposure, although this relationship was found to be stronger in the Slavic than in the Mandarin learners. Pauses per second, speech rate, and pruned syllables per second were all related to the listeners’ judgments in both languages, although vowel durations were not. Between-group differences may reflect differential exposure to spoken English and a closer relationship between Slavic languages and English than between Mandarin and English. Suggestions for pedagogical interventions and further research are also proposed.

  • Ralph L. Rose, “A Comparison of Form and Temporal Characteristics of Filled Pauses in L1 Japanese and L2 English,” Journal of the Phonetic Society of Japan, vol. 21, no. 3, 2017, pp. 33-40. DOI: 10.24467/onseikenkyu.21.3_33.

    Abstract Filled pauses (FPs) in English can be either monophonemic ‘uh’ [ə] or polyphonemic ‘um’ [əm]. These differ temporally: shorter ‘uh’ is associated with shorter overall delay (including silent pauses). Japanese FPs are more varied, including both monophonemic ([ε], [ŋ]) and polyphonemic ([ε:to], [ɑno]) forms. This study compares the FPs of native Japanese speakers in a crosslinguistic speech corpus. Results show speakers use FPs with a lower F1 than native English speakers and strongly prefer the monophonemic form. Duration patterns are similar, but low proficiency speakers delay longer with monophonemic FPs. Results suggest possibilities for nonnative speech detection in speech applications.

  • Justin J. H. Lo, “Between Äh(m) and Euh(m): The Distribution and Realization of Filled Pauses in the Speech of German-French Simultaneous Bilinguals,” Language and Speech, In press. DOI: 10.1177/0023830919890068.

    Abstract Filled pauses are well known for their speaker specificity, yet cross-linguistic research has also shown language-specific trends in their distribution and phonetic quality. To examine the extent to which speakers acquire filled pauses as language- or speaker-specific phenomena, this study investigates the use of filled pauses in the context of adult simultaneous bilinguals. Making use of both distributional and acoustic data, this study analyzed UH, consisting of only a vowel component, and UM, with a vowel followed by [m], in the speech of 15 female speakers who were simultaneously bilingual in French and German. Speakers were found to use UM more frequently in German than in French, but only German-dominant speakers had a preference for UM in German. Formant and durational analyses showed that while speakers maintained distinct vowel qualities in their filled pauses in different languages, filled pauses in their weaker language exhibited a shift towards those in their dominant language. These results suggest that, despite high levels of variability between speakers, there is a significant role for language in the acquisition of filled pauses in simultaneous bilingual speakers, which is further shaped by the linguistic environment they grow up in.

  • Michiko Watanabe, Keikichi Hirose, Yasuharu Den, and Nobuaki Minematsu, “Filled pauses as cues to the complexity of upcoming phrases for native and non-native listeners,” Speech Communication, vol. 50, no. 2, February 2008, pp. 81-94. DOI: 10.1016/j.specom.2007.06.002.

    Abstract We examined whether filled pauses (FPs) affect listeners’ predictions about the complexity of upcoming phrases in Japanese. Studies of spontaneous speech corpora show that constituents tend to be longer or more complex when they are immediately preceded by FPs than when they are not. From this finding, we hypothesized that FPs cause listeners to expect that the speaker is going to refer to something that is likely to be expressed by a relatively long or complex constituent. In the experiments, participants listened to sentences describing both simple and compound shapes on a computer screen. Their task was to press a button as soon as they had identified the shape corresponding to the description. Phrases describing shapes were immediately preceded by a FP, a silent pause of the same duration, or no pause. We predicted that listeners’ response times to compound shapes would be shorter when there is a FP before phrases describing the shape than when there is no FP, because FPs are good cues to complex phrases, whereas response times to simple shapes would not be shorter with a preceding FP than without. The results of native Japanese and proficient non-native Chinese listeners agreed with the prediction and provided evidence to support the hypothesis. Response times of the least proficient non-native listeners were not affected by the existence of FPs, suggesting that the effects of FPs on non-native listeners depend on their language proficiency.

  • Sandra Götz, Fluency in Native and Nonnative English Speech. Amsterdam, Netherlands: John Benjamins Publishing Company, 2013, 238 pp. DOI: 10.1075/scl.53.

    Abstract This book takes a new and holistic approach to fluency in English speech and differentiates between productive, perceptive, and nonverbal fluency. The in-depth corpus-based description of productive fluency points out major differences of how fluency is established in native and nonnative speech. It also reveals areas in which even highly advanced learners of English still deviate strongly from the native target norm and in which they have already approximated to it. Based on these findings, selected learners are subjected to native speakers’ ratings of seven perceptive fluency variables in order to test which variables are most responsible for a perception of oral proficiency on the sides of the listeners. Finally, language-pedagogical implications derived from these findings for the improvement of fluency in learner language are presented. This book is conceptually and methodologically relevant for corpus-linguistics, learner corpus research and foreign language teaching and learning.

  • Hans Rutger Bosker, Hugo Quené, Ted Sanders, and Nivja H. de Jong, “Native ‘um’s elicit prediction of low-frequency referents, but non-native ‘um’s do not,” Journal of Memory and Language, vol. 75, 2014, pp. 104-116.

    Abstract Speech comprehension involves extensive use of prediction. Linguistic prediction may be guided by the semantics or syntax, but also by the performance characteristics of the speech signal, such as disfluency. Previous studies have shown that listeners, when presented with the filler uh, exhibit a disfluency bias for discourse-new or unknown referents, drawing inferences about the source of the disfluency. The goal of the present study is to study the contrast between native and non-native disfluencies in speech comprehension. Experiment 1 presented listeners with pictures of high-frequency (e.g., a hand) and low-frequency objects (e.g., a sewing machine) and with fluent and disfluent instructions. Listeners were found to anticipate reference to low-frequency objects when encountering disfluency, thus attributing disfluency to speaker trouble in lexical retrieval. Experiment 2 showed that, when participants listened to disfluent non-native speech, no anticipation of low-frequency referents was observed. We conclude that listeners can adapt their predictive strategies to the (non-native) speaker at hand, extending our understanding of the role of speaker identity in speech comprehension.

    Keywords Speech comprehension

  • Lorenzo García-Amaya, “A longitudinal study of filled pauses and silent pauses in second language speech,” in The 7th Workshop on Disfluency in Spontaneous Speech (DiSS 2015), Edinburgh, Scotland, August 2015.

    Abstract This study provides a longitudinal analysis of speech rate and the use of filled pauses (FPs) and unfilled or silent pauses (SPs) in the oral production of L2 learners of Spanish in two learning contexts: a 6-week intensive overseas immersion program (OIM), and a 15-week US-based ‘at-home’ foreign language classroom (AH). Fifty-six native speakers of English performed two video-retell tasks at three different time points. A total of five measurements of oral production were calculated. The results show a significant increase in rate of speech over time in the OIM group compared to the AH group. Additionally, the OIM learners show greater use of “disfluencies” over time, namely FPs and short SPs. We suggest that OIM learners increase their use of hesitation phenomena over time as a speech processing and planning strategy and discuss this finding within the framework of L2 cognitive fluency.

    Keywords disfluencies, DiSS, filled pauses, rate of speech, second language fluency, silent pauses, Spanish, study abroad