Speech Perception: An Engineer's View

Thursday, November 05, 2009
6:00 PM
Human speech is a complex biological and social phenomenon. Historically, amid this complexity, it has not always been clear precisely what the explanandum (the phenomenon or phenomena to be explained) and the explanans (the explanation of this phenomenon or phenomena) are for the scientific study of speech perception. This presentation will aim to answer these questions, setting out what a scientific model of speech perception ought to do. Along the way, after first making some remarks about black boxes, the Helmholtz-Thevenin theorem and its implications for the study of perception, and "boxologies" in general, I will review the array of phenomena that have traditionally been taken as the explananda. These include categorical perception, duplex perception (including phonemic restoration), the supposed lack of invariance in the acoustic signal, auditory/visual illusions such as the McGurk effect, and perception of "weird" speech analogs like sine-wave speech. Much of the debate about these has centred on the nature of the so-called "objects" of speech perception. Are they auditory, articulatory, both, or something else? As the field is characterised as much by what we don't know as by what we do, I will conclude with some questions: How "discrete" is speech really? Is the concept of "object of perception" coherent? What is the best current model of speech perception? How good a model is a state-of-the-art automatic speech recogniser? I hope that this little tour of a fascinating area of science will be fun, and that we will all (including me) learn at least something.

Robert Damper

Professor of Speech Science and Speech Technology
School of Electronics and Computer Science, University of Southampton