Seminar: 2/8 - Robert Remez

11:00 to 12:30 PM      at:  

Department of Psychology
Columbia University
"The Perceptual Organization of Speech, or,
How the cocktail party problem became the problem of cocktails for two."

How does a listener know what a talker just said? A fundamental perceptual component in acts of spoken communication is the analysis of sensory samples of speech. However, perception of the phonetic properties in stimulation cannot proceed as if sensory activity stems from speech sources alone. We speak and listen to each other amid multiple sources of sound. Indeed, the vocal apparatus itself is a source of respiratory and ingestive sound as well as speech. In this respect, the perception of speech naturally entails two functions: 1) an organizational function that identifies a sensory pattern attributable to a spoken source; and 2) an analytical function that identifies the phonetic attributes conveyed in a sensory pattern. Traditional accounts of each function rely on the similarity of sensory samples to perceptual standards designated as the most likely sensory effects of consonants and vowels. Studies of sinewave replicas of speech undermine this conceptualization, because intelligible sinewave signals are not similar to vocally produced sound; likewise, sinewave signals are not familiar to listeners, either as auditory forms or as phonetic sequences. This evidence supports a conclusion about the boundary conditions on a perceptual explanation of speech: Early sensory coding is exquisitely sensitive to coarse-grain spectrotemporal properties of the signal, independent of momentary or likely sensory effects.