FPRC — The 4th Workshop on Disfluency in Spontaneous Speech (DiSS 2005)

Timothy Arbisi-Kelm, and Sun-Ah Jun, “A comparison of disfluency patterns in normal and stuttered speech,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 13-16. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_013.pdf.

Abstract While speech disfluencies are commonly found in every speaker's speech, stuttering is a language disorder characterized by an abnormally high rate of speech aberrations, including prolongation, cessation, and repetition of speech segments. However, despite the obvious differences between stuttered and normal speech, identifying the crucial qualities that identify stuttered speech remains a significant challenge. A story-telling task was presented to four stutterers and four non-stutterers in order to analyze the prosodic patterns that surfaced from their spontaneous narrations. Preliminary results revealed that the major difference between stutterers' and non-stutterers' disfluencies—aside from the total number—is the type of disfluency and the context affected by the disfluency. Disfluencies in both groups included prolongation, pause and cut, but stutterers' disfluencies also include repetition and combinations of the three (e.g., cut followed by pause). In addition, stutterers' disfluencies were accompanied by more prosodic irregularities (e.g. pitch accent on function words, creating a prosodic break with degraded phonetic cues) prior to the actual disfluency than non-stutterers' disfluencies, indirectly supporting the overvigilant self-monitoring hypothesis.

Keywords DiSS

Matthew P. Aylett, “Extracting the acoustic features of interruption points using non-lexical prosodic analysis,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 17-20. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_017.pdf.

Abstract Non-lexical prosodic analysis is our term for the process of extracting prosodic structure from a speech waveform without reference to the lexical contents of the speech. It has been shown that human subjects are able to perceive prosodic structure within speech without lexical cues. There is some evidence that this extends to the perception of disfluency, for example, the detection of interruption points (IPs) in low pass filtered speech samples. In this paper, we apply non-lexical prosodic analysis to a corpus of data collected for a speaker in a multi-person meeting environment. We show how non-lexical prosodic analysis can help structure corpus data of this kind, and reinforce previous findings that non-lexical acoustic cues can help detect IPs. These cues can be described by changes in amplitude and f0 after the IP and they can be related to the acoustic characteristics of hyper-articulated speech.

Keywords DiSS

Katarina Bartkova, “Prosodic cues of spontaneous speech in French,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 21-25. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_021.pdf.

Abstract Disfluencies, when present in the speech signal, can make syntactic parsing difficult. This difficulty is increased when machines are involved in communication and when speech devices rely on automatic speech recognition techniques. In order to improve automatic speech parsing and thus speech comprehension, methods have been proposed to filter disfluencies out from the speech signal. Attempts have been made to use prosodic parameters to improve such filtering. However, before introducing prosodic parameters into automatic speech recognition processes, it would be useful to investigate whether disfluencies can be characterized in a prosodic way and whether their prosodic cues would be representative enough to be used in automatic systems. The aim of this study was to examine to what extent prosodic parameters would be able to characterize disfluencies in French. Word repetitions, filled and silent pauses and speech repairs were described in a prosodic way using statistical analyses of their prosodic parameters. These analyses allowed simple prosodic rules to be formulated. The efficiency of the prosodic rules was evaluated on the task of detecting filled pauses, word repetitions and hesitations.

Keywords DiSS

Philippe Boula de Mareüil, Benoît Habert, Frédérique Bénard, Martine Adda-Decker, Claude Barras, Gilles Adda, and Patrick Paroubek, “A quantitative study of disfluencies in French broadcast interviews,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 27-32. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_027.pdf.

Abstract The reported study aims at increasing our understanding of spontaneous speech-related phenomena from sibling corpora of speech and orthographic transcriptions at various levels of elaboration. It makes use of 9 hours of French broadcast interview archives, involving 10 journalists and 10 personalities from political or civil society. First we considered press-oriented transcripts, where most of the so-called disfluencies are discarded. They were then aligned with automatic transcripts, by using the LIMSI speech recogniser. This facilitated the production of exact transcripts, where all audible phenomena in non-overlapping speech segments were transcribed manually. Four types of disfluencies were distinguished: discourse markers, filled pauses, repetitions and revisions, each of which accounts for about 2% of the corpus (8% in total). They were analysed by utterance, speaker and disfluency pattern types. Four questions were raised. Where do disfluencies occur in the utterance? What is the influence of the speakers' status? And what are the most frequent disfluency patterns?

Keywords DiSS

Jean-Leon Bouraoui, and Nadine Vigouroux, “Disfluency phenomena in an apprenticeship corpus,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 33-37. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_033.pdf.

Abstract This paper presents a study carried out on an apprenticeship corpus. It features dialogues between air traffic controllers in training and "pseudo-pilots". "Pseudo-pilots" are people (often instructors) who simulate the behavior of real pilots in real situations. The main specificities of the corpus are its apprenticeship character and the fact that production is subordinate to a particular phraseology. Our study concerns the many kinds of disfluency phenomena that occur in this specific corpus. We define 6 main categories of these phenomena, and take a position with regard to the terminology used in the literature. We then present the distribution of these categories. It appears that some of the occurrence frequencies differ largely from those observed in other studies. Our explanation is based on the corpus specificity: because of their responsibilities, both controllers and pseudo-pilots have to be especially careful about the mistakes they could make, since these could have dramatic consequences. The remainder of our paper is dedicated to a more in-depth study of one disfluency class: the "false starts". A false start consists of the beginning of a word whose utterance is not completed. We show that this category consists of several sub-categories, of which we study the distribution.

Keywords DiSS

Pierpaolo Busan, Giovanna Pelamatti, Alessandro Tavano, Michele Grassi, and Franco Fabbro, “Improvement of verbal behavior after pharmacological treatment of developmental stuttering: a case study,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 39-42. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_039.pdf.

Abstract Developmental stuttering is a disruption in normal speech fluency and rhythm. Developmental stuttering usually manifests between 6 and 9 years of age and may persist in adulthood. At present, the exact etiology of developmental stuttering is not fully clear. Besides, the dopaminergic neurological component is likely to have a causal role in the manifestation of stuttering behaviors. Actually, some studies seem to confirm the efficacy of antidopaminergic drugs (haloperidol, risperidone and olanzapine, among others) in controlling stuttering behaviors. We present a case of persistent developmental stuttering in a 24-year-old adult male who was able to control his symptoms to a significant extent after administration of risperidone, an antidopaminergic drug. Our findings show that the pharmacological intervention helped the patient improve on a set of fluency tasks but especially when the tasks involved the uttering of content words. Our results are discussed against the current theories on the cognitive and neurological basis of developmental stuttering.

Keywords DiSS

Estelle Campione, and Jean Véronis, “Pauses and hesitations in French spontaneous speech,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 43-46. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_043.pdf.

Abstract In traditional terminology, silent and filled pauses are grouped together, whereas hesitation lengthening is put into a separate category. However, while these various phenomena are very often associated, there have been few studies on how they interact. We analyzed an hour of spontaneous speech to show that silent and filled pauses operate in a totally different way, and that contrary to common belief, silent pauses by themselves never serve as hesitation markers, but only do so when coupled with other markers – mostly syllabic lengthening and filled pauses. These last two hesitation markers have similar acoustic and articulatory characteristics; they are also distributed and function alike.

Keywords DiSS

Maria Candea, Ioana Vasilescu, and Martine Adda-Decker, “Inter- and intra-language acoustic analysis of autonomous fillers,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 47-51. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_047.pdf.

Abstract The present work deals with autonomous fillers in a multilingual context. The question addressed here is whether fillers carry universal or language-specific characteristics. Fillers occur frequently in spontaneous speech and represent an interesting topic for improving language-specific models in automatic language processing. Most of the current studies focus on a few languages such as English and French. We focus here on multilingual fillers from eight languages (Arabic, Mandarin Chinese, French, German, Italian, European Portuguese, American English and Latin American Spanish). We thus propose an acoustic typology based on the vocalic peculiarities of the autonomous fillers. Three parameters are considered here: duration, pitch (F0) and timbre (F1/F2). We also compare the vocalic segments of the fillers with intra-lexical vowels possessing similar timbre. For this purpose, a preliminary study on the French language is described.

Keywords DiSS

Jennifer Cole, Mark Hasegawa-Johnson, Chilin Shih, Heejin Kim, Eun-Kyung Lee, Hsin-yi Lu, Yoonsook Mo, and Tae-Jin Yoon, “Prosodic parallelism as a cue to repetition and error correction disfluency,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 53-58. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_053.pdf.

Abstract Complex disfluencies that involve the repetition or correction of words are frequent in conversational speech, with repetition disfluencies alone accounting for over 20% of disfluencies. These disfluencies generally do not lead to comprehension errors for human listeners. We propose that the frequent occurrence of parallel prosodic features in the reparandum (REP) and alteration (ALT) intervals of complex disfluencies may serve as strong perceptual cues that signal the disfluency to the listener. We report results from a transcription analysis of complex disfluencies that classifies disfluent regions on the basis of prosodic factors, and preliminary evidence from F0 analysis to support our finding of prosodic parallelism.

Keywords DiSS

Andrew A. Cooper, and John T. Hale, “Promotion of disfluency in syntactic parallelism,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 59-63. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_059.pdf.

Abstract The development of a disfluency-robust speech parser requires some insight into where disfluencies occur in spontaneous spoken language. This corpus study deals with one syntactic variable which is predictive of disfluency location: syntactic parallelism. A formal definition of syntactic parallelism is used to show that syntactic parallelism is indeed predictive of disfluency.

Keywords DiSS

Rodolfo Delmonte, “Modeling conversational styles in Italian by means of overlaps,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 65-70. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_065.pdf.

Abstract Conversational styles vary remarkably across cultures: communities of speakers, rather than single speakers, seem to share turn-taking rules which do not always coincide with those shared by other communities of the same language. These rules are usually responsible for the smoothness of conversational interaction and the readiness of the attainment of communicative goals by conversants. Overlaps constitute a disruptive element in the economy of conversations: however, they show regular patterns which can be used to define conversational styles (Ford and Thompson, 1996). Overlaps constitute a challenge for any system of linguistic representations in that they cannot be treated as a one-dimensional event: in order to take into account the purport of an overlapping stretch of dialogue for the ongoing pragmatics and semantics of discourse, we have devised a new annotation schema which is then fed into the parser and produces a multidimensional linear syntactic constituency representation. This study takes a new tack on the issues raised by overlaps, both in terms of its linguistic representation and its semantic and pragmatic interpretation. It will present work carried out on the 60,000-word Italian Spontaneous Speech Corpus called AVIP, under national project API - the Italian version of MapTask, in particular the parser, to produce syntactic structures of overlapped temporally aligned turns. We will also present preliminary data from IPAR, another corpus of spontaneous dialogues run with the Spot Differences protocol. Then it will concentrate on the syntactic, semantic and prosodic aspects related to this debated issue. The paper will argue in favour of a joint and thus temporally aligned representation of overlapping material to capture all linguistic information made available by the local context. This will result in a syntactically branching node we call OVL which contains both the overlapper's and the overlappee's material (linguistic or non-linguistic). An extended classification of the phenomenon has shown that overlaps contribute substantially to the interpretation of the local context rather than the other way around. They also determine the overall conversational style of a given community of speakers with cultural import.

Keywords DiSS

Janet Fletcher, Nicholas Evans, and Belinda Ross, “The intra-word pause and disfluency in Dalabon,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 77-81. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_077.pdf.

Abstract Earlier impressionistic analyses of Dalabon indicate that the grammatical word is often realized as either an accentual or an intonational phrase, followed by a pause. Unusually, it can also be interrupted by a silent pause, with each section being potentially (although not necessarily) realized as separate intonational phrases. Our analyses of pause duration and pause placement within grammatical words support these earlier impressions, although this use of the silent pause appears to be restricted to certain affix boundaries, and other phonological constraints relating to the following surrounding linguistic material. These interruptions also share certain characteristics of "normal" disfluencies however.

Keywords DiSS

Kristy Beers Fägersten, “Hesitations and repair in German,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 71-76. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_071.pdf.

Abstract The occurrence of pauses and hesitations in spontaneous speech has been shown to occur systematically, for example, "between sentences, after discourse markers and conjunctions and before accented content words." (Hansson [15]) This is certainly plausible in English, where pauses and hesitations can and often do occur before content words such as nominals, for example, "uh, there's a ... man." (Chafe [8]) However, if hesitations are, in fact, evidence of "deciding what to talk about next," (Chafe [8]) then the complex grammatical system of German should render this pausing position precarious, since pre-modifiers must account for the gender of the nominals they modify. In this paper, I present data to test the hypothesis that pre-nominal hesitation patterns in German are dissimilar to those in English. Hesitations in German will be shown, in fact, to occur within noun phrase units. Nevertheless, native speakers most often succeed in supplying a nominal which conforms to the gender indicated by the determiner or pre-modifier. Corrections, or repairs, of infelicitous pre-modifiers indicate that the speaker was unable to supply a nominal of the same gender which the choice of pre-modifier had committed him/her to. The frequency of such repairs is shown to vary according to task, with fewest repairs occurring in elicited speech which allows for linguistic freedom and therefore is most like spontaneous speech. The data sets indicate that among German native speakers, hesitations occurring before noun phrase units (pre-NPU hesitations) indicate deliberation of what to say, while hesitations within or before the head of the noun phrase (pre-NPH hesitations) indicate deliberation of how to say what has already been decided (cf. Chafe [8]).

Keywords DiSS

Tiit Hennoste, “Repair-initiating particles and um-s in Estonian spontaneous speech,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 83-88. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_083.pdf.

Abstract Particles and um-s used in spontaneous Estonian speech as initiators of different types of repair are analysed. Our model and typology of repair based on conversation analysis is introduced. Three main types of repair and the particles used to initiate them are described: prepositioned self-initiated self-repair, postpositioned self-initiated self-repair (addition, substitution, insertion and abandon), and other-initiated self-repair (reformulation, clarification and misunderstanding). In conclusion, 6 groups of particles are brought out by the role they play in the initiation of the repair sequence. Data come from the Corpus of Spoken Estonian of the University of Tartu, which contains everyday and institutional speech, telephone and face-to-face conversations.

Keywords DiSS

Sandrine Henry, “Repeats in spontaneous spoken French: the influence of the complexity of phrases,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 89-92. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_089.pdf.

Abstract We here present the results of a descriptive study we conducted on 383 disfluent repeats from a corpus of spontaneous spoken French. We analyze noun phrases under construction and study whether there is a correlation between the frequency of the repeats and the complexity of the phrases. We then focus on complex noun phrases in order to locate the repeats precisely. We also analyze how repeats affect structures such as [Preposition + Determiner + Noun] and what the constraints upon such structures are.

Keywords DiSS

Peter Howell, and Olatunji Akande, “Simulations of the types of disfluency produced in spontaneous utterances by fluent speakers, and the change in disfluency type seen as speakers who stutter get older,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 93-98. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_093.pdf.

Abstract The EXPLAN model is implemented on a graphic simulator. It is shown that it is able to produce speech in serial order and several types of fluency failure produced by fluent speakers and speakers who stutter. A way that EXPLAN accounts for longitudinal changes in the pattern of fluency failures shown by speakers who stutter is demonstrated.

Keywords DiSS

Peter Howell, Jennifer Hayes, Ceri Savage, Jane Ladd, and Nafisa Patel, “Factors that determine the form and position of disfluencies in spontaneous utterances,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 99-102. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_099.pdf.

Abstract This presentation reviews work on types of disfluency in the spontaneous speech of fluent speakers and speakers who stutter. Examination is made of factors that determine where disfluencies are located. It is concluded that the phonological, or prosodic, word provides a good basis for explaining the distribution of different types of disfluency in spontaneous speech.

Keywords DiSS

T. Florian Jaeger, “Optional 'that' indicates production difficulty: evidence from disfluencies,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 103-108. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_103.pdf.

Abstract Optional word omission, such as 'that' omission in complement and relative clauses, has been argued to be driven by production pressure (rather than by comprehension). One particularly strong production-driven hypothesis states that speakers insert words to buy time to alleviate production difficulties. I present evidence from the distribution of disfluencies in non-subject-extracted relative clauses arguing against this hypothesis. While word omission is driven by production difficulties, speakers may use 'that' as a collateral signal to addressees, informing them of anticipated production difficulties. In that sense, word omission would be subject to audience design (i.e. catering to addressees' needs).

Keywords DiSS

Jumpei Kaneda, “Phrase-final rise-fall intonation and disfluency in Japanese - a preliminary study,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 109-112. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_109.pdf.

Abstract In Japanese conversations, rise-fall intonation with vowel lengthening often occurs on the final syllable of a phrase. This phrase-final rise-fall (PFRF) is a new type of intonation first reported in the 1960's. Researchers consider PFRF intonation a discourse marker which functions to sharpen the phrase boundary and retain the utterance turn, but other phrase-final intonation such as phrase-final lengthening (PFL) can have a similar pattern. PFLs are recognized as a type of disfluent speech with similar characteristics to PFRFs in terms of final-lengthening and having discourse functions. Also from reports about the spontaneity of speech, we assume that PFRFs would have a relation with disfluency, as well as with PFLs. To examine this assumption, this paper attempts to show the co-occurrence relation between PFRF and disfluency in the same utterance. The results show that PFRFs and PFLs have a relation to posterior disfluent units and suggest that both indicate speech planning strategies. Further, this paper speculates that a difference between PFRF and PFL is a difference in the purposes of speech planning: the latter represents ongoing linguistic editing while the former indicates adjusting the utterance according to the interlocutor's reaction. Disfluencies accordingly occur as effects from processes of speech planning.

Keywords DiSS

Shigeyoshi Kitazawa, “Evaluation of vowel hiatus in prosodic boundaries of Japanese,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 113-116. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_113.pdf.

Abstract We investigated V-V hiatus through J-ToBI labeling and listening to whole phrases to estimate degree of discontinuity and, if possible, to determine the exact boundary between two phrases. Appropriate boundaries were found in most cases as the maximum perceptual score. Using electroglottography (EGG) of the open quotients OQ, pitch mark and spectrogram, the acoustic phonological feature of these V-V hiatus was found as phrase-initial glottalization and phrase-final nasalization observable in EGG and spectrogram, as well as phrase-final lengthening and phrase-initial shortening of the morae. A small dip was observable at the boundary of V-V hiatus showing glottalization. The test materials are taken from the "Japanese MULTEXT", consisting of a particle - vowel (36), adjective - vowel (5), and word - word (4).

Keywords DiSS

Che-Kuang Lin, Shu-Chuan Tseng, and Lin-Shan Lee, “Important and new features with analysis for disfluency interruption point (IP) detection in spontaneous Mandarin speech,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 117-121. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_117.pdf.

Abstract This paper presents a whole set of new features, some duration-related and some pitch-related, to be used in disfluency interruption point (IP) detection for spontaneous Mandarin speech, considering the special linguistic characteristics of Mandarin Chinese. Decision tree is incorporated into the maximum entropy model to perform the IP detection. By examining performance degradation when each specific feature was missing from the whole set, the most important features for IP detection for each disfluency type were analyzed in detail. The experiments were conducted on the Mandarin Conversational Dialogue Corpus (MCDC) developed by the Institute of Linguistics of Academia Sinica in Taiwan.

Keywords DiSS

Tobias Lövgren, and Jan van Doorn, “Influence of manipulation of short silent pause duration on speech fluency,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 123-126. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_123.pdf.

Abstract Ordinary speech contains disfluencies in the form of hesitations and repairs. When listeners make global judgements on speech fluency they are influenced by the frequency and nature of the individual disfluencies contained in the speech. The aim of this study was to investigate a single dimension, pause duration, in the perception of speech fluency. The method involved simulation of pause duration within naturally fluent speech by manipulating existing acoustic silences in the speech. Four conditions were created: one for the natural speech and three with stepwise increases in acoustic silence durations (average x2, x4 and x7.5 respectively). In a forced choice task listeners were asked to judge the speech samples as fluent or non-fluent. The results showed that the percentage of judgements of disfluency increased as the pause durations increased, and that the differences between the unmanipulated speech condition and the two conditions with the longest pause durations were statistically significant. The results were interpreted to indicate that the individual dimension of pause duration has an independent influence on the judgement of fluency in ordinary speech.

Keywords DiSS

Elgar-Paul Magro, “Disfluency markers and their facial and gestural correlates. preliminary observations on a dialogue in French,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 127-131. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_127.pdf.

Abstract The aim of this article is to try to establish any observable regularities between the vocal and the visual expression of disfluency markers in a French spontaneous dialogue. The data show different configurations for different types of disfluency markers. Thus "euh"s are typically accompanied by mutual eye contact and no gesture; interrupted eye contact takes place less frequently, on occasions where speech planning is more seriously impaired (syntactical disruption and combination of "euh" with other disfluency markers). False starts seem to be typically accompanied by gesture production whereas eye contact can be maintained if the speaker relies or not on the listener to resolve the speech production problem. The article takes up the idea that disfluency markers can be classified along a continuum throughout the speech formulation process, going from the most discreet to the most prominent. It suggests that the more prominent the disfluency, the more likely is the visual channel to play a role (interrupted eye contact and gesture production).

Keywords DiSS

Jan McAllister, and Mary Kingston, “Characteristics of final part-word repetitions,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 7-11. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_007.pdf.

Abstract In an earlier paper, we have described final part-word repetitions in the conversational speech of two school-age boys of normal intelligence with no known neurological lesions. In this paper we explore in more detail the phonetic and linguistic characteristics of the speech of the boys. The repeated word fragments were more likely to be preceded by a pause than followed by one. The word immediately following the fragment tended to have a higher word frequency score than other surrounding words. Utterances containing the disfluencies typically contained a greater number of syllables than those that did not; however, there was no reliable difference between fluent and disfluent utterances in terms of their grammatical complexity.

Keywords DiSS

Hannele Nicholson, Ellen Gurman Bard, Robin Lickley, Anne H. Anderson, Catriona Havard, and Yiya Chen, “Disfluency and behaviour in dialogue: evidence from eye-gaze,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 133-138. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_133.pdf.

Abstract Previous research on disfluency types has focused on their distinct cognitive causes, prosodic patterns, or effects on the listener. This paper seeks to add to this taxonomy by providing a psycholinguistic account of the dialogue and gaze behaviour speakers engage in when they make certain types of disfluency. Dialogues came from a version of the Map Task, [2, 4], in which 36 normal adult speakers each participated in six dialogues across which feedback modality and time-pressure were counter-balanced. In this paper, we ask whether disfluency, both generally and type-specifically, was associated with speaker attention to the listener. We show that certain disfluency types can be linked to particular dialogue goals, depending on whether the speaker had attended to listener feedback. The results shed light on the general cognitive causes of disfluency and suggest that it will be possible to predict the types of disfluency which will accompany particular behaviours.

Keywords DiSS

Sieb Nooteboom, “Lexical bias re-re-visited. Some further data on its possible cause,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 139-144. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_139.pdf.

Abstract This paper describes an experiment eliciting spoonerisms by using the so-called SLIP technique. The purpose of the experiment was to provide a further test of the hypothesis that self-monitoring of inner speech is a major source of lexical bias. This is a follow-up on an earlier experiment in which subjects were explicitly prompted after each response to make a correction in case of a speech error. In the current experiment both the prompt and the extra time for correction were left out, and there was no strong time pressure for the subject in giving his response. It is shown that under these conditions many primed-for spoonerisms are replaced by other, mostly lexical, errors. These 'replacing' or 'secondary' errors are more frequent in the condition priming for nonword-nonword errors than in the condition priming for word-word errors. Response times obtained for replacing errors are considerably and significantly longer than response times for overtly interrupted errors, and also longer than response times for the primed-for spoonerisms. This suggests that a time-consuming operation follows the primed-for spoonerisms in inner speech, and replaces those with other speech errors, often to preserve lexicality of the error.

Keywords DiSS

Berthille Pallaud, “The re-adjustment of word-fragments in spontaneous spoken French,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 145-149. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_145.pdf.

Abstract A study of word-fragments in spoken French has been under way for several years, based on corpora of non-directive talks recorded and transcribed according to the conventions of GARS (currently DELIC). These disfluencies are often analyzed within the framework of disfluent repetitions, but our observations on the two types of disfluency led us to distinguish them. The aim of our study is, on the one hand, to describe the insertions that occur in relation to word interruptions and their re-adjustment and, on the other hand, to specify the types and locations of the retracing that follows these interruptions. Two kinds of incidental clauses were observed at the time of the re-adjustments that follow these disturbances: some (the more numerous) are syntactically linked to the fragment or to its retracing; others are not. Moreover, the word-fragments that will be modified are the only ones that depend on the type of location. For the others, the location does not make it possible to predict the category of interruption (complemented or unfinished). Our results on word-fragments nevertheless confirm that, in contemporary French, retracing to the head of the nominal or verbal group that contains the disfluency remains the simplest case and, at the same time, the most frequent [5]. Even so, a third of the retracings either do not go back to the beginning of the group or go beyond it.

Keywords DiSS

Myriam Piccaluga, Jean-Luc Nespoulous, and Bernard Harmegnies, “Disfluencies as a window on cognitive processing. An analysis of silent pauses in simultaneous interpreting,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 151-155. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_151.pdf.

Abstract The paper focuses on silent pauses observed in the productions of subjects performing simultaneous interpreting tasks. Four bilingual subjects with varying degrees of expertise in interpreting and varying degrees of mastery of the languages involved (French and Spanish) were recorded while interpreting utterances from French and Spanish talks. The source discourses had been perturbed by changes both in speech rate (through time compression) and in auditory quality (through added interfering noise). On the basis of acoustic analyses performed on the subjects' productions, statistical analyses focus on both the number and the duration of the observed pauses. This dual approach enables investigation of the kinds of cognitive disturbance caused by the independent variables and allows further speculation on the semiology of pause durations.

Keywords DiSS

Melanie Soderstrom, and James L. Morgan, “Disfluency in speech input to infants? The interaction of mother and child to create error-free speech input for language acquisition,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 157-162. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_157.pdf.

Abstract One characteristic of infant-directed speech is that it is highly fluent compared with adult-directed speech. However, the speech that infants hear still contains disfluencies. Such disfluencies might potentially cause problems for infants during language development. We first analyzed samples of spontaneous speech in the presence of infants (both adult- and infant-directed) and found that under ideal circumstances the speech infants hear is highly fluent. Under less than ideal circumstances infants hear much more disfluent speech; however, this disfluent speech is almost entirely adult-directed. While these disfluencies are grammatically ill-formed, their prosodic structure might signal their ill-formedness to infants. In a preference experiment, 10-month-olds listened longer to infant-directed speech samples containing prosodic disfluencies than to equated samples without disfluency. However, this effect was found in only one of two counterbalancing groups. Using adult ratings of low-pass filtered versions of these speech samples, we found that infants' preferences were correlated with the adults' perception of the relative disfluency of the samples. A follow-up experiment using adult-directed disfluencies found that while the 10-month-olds showed no differences in their listening preferences, older infants preferred to listen to the fluent speech. These results suggest that younger and older infants attend differently to infant- and adult-directed speech, and that older infants may be able to differentiate grammatical adult-directed input from input distorted by disfluency. We discuss implications of these findings for language acquisition.

Keywords DiSS

Ellen Thompson, “A cross-linguistic look at VP-ellipsis and verbal speech errors,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 163-164. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_163.pdf.

Abstract This paper argues that consideration of spontaneous speech errors provides insight into cross-linguistic analyses of syntactic phenomena. In particular, I claim that differences in the distribution of non-parallel VP-Ellipsis constructions in English and German, as well as variation in the spontaneously-occurring verbal speech errors, is explained by a parametric analysis of variation in the inflectional systems of the two languages.

Keywords DiSS

Doroteo T. Toledano, Antonio Moreno Sandoval, José Colás Pasamontes, and Javier Garrido Salas, “Acoustic-phonetic decoding of different types of spontaneous speech in Spanish,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 165-168. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_165.pdf.

Abstract This paper presents preliminary acoustic-phonetic decoding results for Spanish on the spontaneous speech corpus C-ORAL-ROM. These results are compared with results on the read speech corpus ALBAYZIN. We also compare the decoding results obtained with the different types of spontaneous speech in C-ORAL-ROM. The most important conclusion is that the type of spontaneous speech has a strong impact on spontaneous speech recognition results. The best recognition results are those obtained on speech captured from the media.

Keywords DiSS

Michiko Watanabe, Yasuharu Den, Keikichi Hirose, and Nobuaki Minematsu, “The effects of filled pauses on native and non-native listeners' speech processing,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 169-172. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_169.pdf.

Abstract Everyday speech abounds in disfluencies. However, little is known about their roles in speech communication. We examined the effects of filled pauses at phrase boundaries on native and non-native listeners in Japanese. A study of a spontaneous speech corpus showed that filled pauses tended to precede relatively long and complex constituents. We tested the hypothesis that filled pauses bias listeners' expectations about the upcoming phrase toward a longer and more complex one. In the experiment, participants were presented with two shapes at a time, one simple and the other compound. Their task was to identify, as quickly as possible, the shape that they heard described. The speech stimuli involved two factors: complexity and fluency. For the complexity factor, half of the speech stimuli described compound shapes with long and complex phrases and the other half described simple shapes with short and simple phrases. For the fluency factor, phrases describing a shape had a preceding filled pause, a preceding silent pause of the same length, or no preceding pause. The results of the experiments with both native and non-native listeners showed that response times to the complex phrases were significantly shorter after filled or silent pauses than when there was no pause. In contrast, there was no significant difference between the three conditions for the simple phrases, supporting the hypothesis.

Keywords DiSS

Yelena Yasinnik, Stefanie Shattuck-Hufnagel, and Nanette Veilleux, “Gesture marking of disfluencies in spontaneous speech,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 173-178. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_173.pdf.

Abstract Speakers effectively use both visual and acoustic cues to convey information in speech. While earlier research has concentrated on the association of visual cues (provided by gestures) with fluent prosodic structure, this study looks at the relationship between visual cues, prosodic markers and spoken disfluencies. Preliminary results suggested that speakers preferentially perform gestures in the eye region in spoken disfluencies, but a more careful frame-by-frame analysis capturing all gestures revealed that movements of the eye region (blinks, frowns, eyebrow raises and changes in direction of eye gaze) occur with high frequency in both fluent and non-fluent speech. The paper describes a method for frame-by-frame labelling of speech-accompanying gestures for a speech sample, whose output can then be combined with independently derived labels of the prosody. Initial analysis of 3-minute samples from two speakers reveals that one speaker produces eye movements in association with disfluencies and the other does not, and that this tendency does not result from alignment of brow gestures with pitch accents.

Keywords DiSS

Yuan Zhao, and Dan Jurafsky, “A preliminary study of Mandarin filled pauses,” in The 4th Workshop on Disfluency in Spontaneous Speech, Aix-en-Provence, France, September 2005, pp. 179-182. http://www.isca-speech.org/archive_open/archive_papers/diss_05/dis5_179.pdf.

Abstract The paper reports preliminary results on Mandarin filled pauses (FPs), based on a large speech corpus of Mandarin telephone conversation. We find that Mandarin intensively uses both demonstratives (zhege 'this', nage 'that') and uh/mm as FPs. Demonstratives are more frequent FPs and are more likely to be surrounded by other types of disfluency phenomena than uh/mm, as well as occurring more often in nominal environments. We also find durational differences: FP demonstratives are longer than non-FP demonstratives, and mm is longer than uh. The study also revealed dialectal influence on the use of FPs. Our results agree with earlier work which shows that a language may divide conversational labor among different FPs. Our work also extends this research in suggesting that different languages may assign conversational functions to FPs in different ways.

Keywords DiSS

Filled Pause

Research Center
