Filled Pause Research Center

Investigating 'um' and 'uh' and other hesitation phenomena

March 9th, 2015

Presenting a new Java application for second language fluency development

I had a really great winter vacation: I spent most of it coding! All right, that's a bit nerdy, but I finally sat down to work on a project I'd been thinking about for several years. The basic idea is that I've long wanted to see an application that gives some kind of real-time feedback to second language learners while they are speaking. There are many applications that give delayed feedback, some as soon as moments after a pre-set sentence is spoken, but I can't find any that give immediate (or nearly immediate) feedback. To be fair, some applications that use speech recognition technology for second language speech practice come close to real time (often with one to two seconds of latency). But I wanted to explore the possibility of truly immediate feedback, comparable to the audiovisual feedback one gets from an interlocutor during a face-to-face conversation.

In my case, what I wanted was to give feedback on fluency characteristics: in other words, how fast or slow someone is speaking, how long they are pausing, whether they are pausing too much, how many filled pauses they produce, and so on. I believe this doesn't require full speech recognition, just more basic digital signal processing. So, I created a Java application that does exactly that. At this point, I've only built the fluency-factor detection engine, so the user interface is pretty spartan; I envision something more user-friendly for actual student use. But mainly I wanted to see whether this could be done, so this was a kind of proof-of-concept exercise for me.
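To give a flavor of the kind of signal processing I mean, here is a minimal sketch (not Fluidity's actual code) of a frame-based approach: read 16-bit mono PCM from the microphone with javax.sound.sampled and classify each short frame as speech or silence by its RMS energy. The frame size and threshold here are illustrative assumptions that would need tuning for a given microphone and room.

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.TargetDataLine;

    public class EnergyGate {
        static final float SAMPLE_RATE = 16000f;
        static final int FRAME_SAMPLES = 160;   // 10 ms frames at 16 kHz
        static final double SILENCE_RMS = 0.01; // assumed threshold; tune per microphone

        public static void main(String[] args) throws Exception {
            AudioFormat fmt = new AudioFormat(SAMPLE_RATE, 16, 1, true, false);
            TargetDataLine line = AudioSystem.getTargetDataLine(fmt);
            line.open(fmt);
            line.start();

            byte[] buf = new byte[FRAME_SAMPLES * 2]; // 16-bit mono: 2 bytes per sample
            while (true) {
                int n = line.read(buf, 0, buf.length);
                double sumSq = 0;
                for (int i = 0; i + 1 < n; i += 2) {
                    // little-endian 16-bit sample, normalized to [-1, 1]
                    int s = (buf[i + 1] << 8) | (buf[i] & 0xFF);
                    double x = s / 32768.0;
                    sumSq += x * x;
                }
                double rms = Math.sqrt(sumSq / FRAME_SAMPLES);
                // A real application would smooth these frame decisions before
                // feeding them into the fluency metrics.
                System.out.println(rms > SILENCE_RMS ? "speech" : "silence");
            }
        }
    }

With 10 ms frames, a decision like this is available essentially as the learner speaks, which is what makes the face-to-face-style immediacy plausible without full speech recognition.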

In short, I succeeded ... mostly. With a good headset, a decent computer sound card, and quiet background conditions, Fluidity (that's the name I gave it) will detect the speech rate (counting syllables, not words) and distinguish silence from speech. From this information it can then provide feedback on articulation rate, silent pause rate, silence duration, and phonation time. I also tried to get a filled pause detection mechanism working, but it did not perform reliably with other voices (on my home computer with my voice, it works all right). Finally, I ran a user test with a group of students and got fairly positive reactions, as well as good suggestions for how to improve Fluidity in the future.
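For the curious, here is a hedged sketch of how metrics like these could be derived from frame-level speech/silence decisions such as those in the sketch above. The syllable counter (local peaks in the energy contour above a threshold) is a common rough heuristic, not necessarily Fluidity's method, and the frame length, minimum pause length, and thresholds are illustrative assumptions.

    public class FluencyMetrics {
        static final double FRAME_SEC = 0.010;    // 10 ms frames, matching the sketch above
        static final double MIN_PAUSE_SEC = 0.25; // assumed minimum length for a silent pause

        // Count syllable nuclei as local peaks in the frame RMS contour.
        static int countSyllables(double[] rms, double threshold) {
            int peaks = 0;
            for (int i = 1; i < rms.length - 1; i++) {
                if (rms[i] > threshold && rms[i] > rms[i - 1] && rms[i] >= rms[i + 1]) {
                    peaks++;
                }
            }
            return peaks;
        }

        public static void report(double[] rms, boolean[] speech) {
            double totalSec = speech.length * FRAME_SEC;
            int speechFrames = 0;
            for (boolean s : speech) if (s) speechFrames++;
            double phonationSec = speechFrames * FRAME_SEC;
            if (phonationSec == 0) {
                System.out.println("no speech detected");
                return;
            }

            // Collect runs of silent frames long enough to count as pauses.
            int pauses = 0;
            double pauseSec = 0;
            int run = 0;
            for (int i = 0; i <= speech.length; i++) {
                if (i < speech.length && !speech[i]) {
                    run++;
                } else {
                    double sec = run * FRAME_SEC;
                    if (sec >= MIN_PAUSE_SEC) { pauses++; pauseSec += sec; }
                    run = 0;
                }
            }

            int syllables = countSyllables(rms, 0.02); // assumed peak threshold
            System.out.printf("articulation rate:    %.2f syllables/s%n", syllables / phonationSec);
            System.out.printf("phonation-time ratio: %.2f%n", phonationSec / totalSec);
            System.out.printf("silent pause rate:    %.2f pauses/min%n", pauses / (totalSec / 60.0));
            System.out.printf("mean pause duration:  %.2f s%n", pauses > 0 ? pauseSec / pauses : 0.0);
        }
    }

Note that articulation rate here divides syllables by phonation time (speech only), not total time, which is why it can be reported separately from the pause measures.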

[Image: Main window of the Fluidity application, by Ralph Rose]

I introduced Fluidity at the English Language Education Society of Japan (JELES) annual meeting in March. I was asked whether the application would be made publicly available, and it will be, but I need some more time to get it into shape for distribution. I will be sure to announce it here when it's ready. For now, though, feel free to look at my slides for more information about it.

[Note: This post was published in August 2015 but has been back-dated to reflect the actual timing of the events described here.]