Thursday, May 11, 2006

More on Speech Audio quality for Speech Recognition

Audio quality is a critical factor in Speech Recognition performance. How do we define audio quality? Or to put it another way, what do we mean when we say that good quality audio is being received by the speech recognition software (or, for that matter, by the listener at the other end of an audio communications channel)?

One of the indicators of "good" audio is how closely it resembles the input audio that was intended to be transmitted. In the case of speech recognition and telephone/VoIP communication, this input audio is the set of utterances spoken by the user(s) into the microphone. In more technical terms, an audio system that achieves this is said to have a "flat frequency response". That is, the system passes the components of the audio signal at all frequencies of interest with the same gain, so their relative levels are preserved intact in the output. Speech recognition software relies on the spectrum of the audio signal to determine "what" was spoken, so the more closely the audio signal's frequency components resemble those of the user's spoken audio, the more accurate the recognition is likely to be.
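To make the idea of a flat frequency response more concrete, here is a minimal Python sketch that compares the audio sent into a channel with the audio received from it at a few test frequencies. Everything in it is an assumption for illustration only: the 8000 Hz sample rate, the three test tones, and the helper names (tone_mix, magnitude_at, gain_db, flatness_db) are hypothetical and do not come from any particular speech recognition product. A perfectly flat channel should show the same gain at every frequency, so the spread between the largest and smallest gain should be close to 0 dB.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed here as a typical narrowband telephony rate


def tone_mix(amplitudes, n_samples=800, sample_rate=SAMPLE_RATE):
    """Build a test signal as a sum of sine tones; `amplitudes` maps Hz -> amplitude."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / sample_rate)
            for f, a in amplitudes.items())
        for n in range(n_samples)
    ]


def magnitude_at(signal, freq_hz, sample_rate=SAMPLE_RATE):
    """Amplitude of the signal's component at freq_hz (a single-bin DFT)."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
        for n, x in enumerate(signal)
    )
    return 2 * abs(acc) / len(signal)


def gain_db(sent, received, freq_hz, sample_rate=SAMPLE_RATE):
    """Channel gain in dB at one frequency: received level relative to sent level."""
    ratio = (magnitude_at(received, freq_hz, sample_rate)
             / magnitude_at(sent, freq_hz, sample_rate))
    return 20 * math.log10(ratio)


def flatness_db(sent, received, freqs, sample_rate=SAMPLE_RATE):
    """Spread between the highest and lowest gain; 0 dB means a flat response."""
    gains = [gain_db(sent, received, f, sample_rate) for f in freqs]
    return max(gains) - min(gains)


test_freqs = [500, 1000, 2000]
sent = tone_mix({f: 1.0 for f in test_freqs})

# A quieter but flat channel: every frequency is attenuated equally.
flat_received = [0.5 * x for x in sent]

# A channel that rolls off the highs: only the 2000 Hz tone is halved.
rolled_off_received = tone_mix({500: 1.0, 1000: 1.0, 2000: 0.5})

print("flat channel spread (dB):", round(flatness_db(sent, flat_received, test_freqs), 2))
print("rolled-off channel spread (dB):", round(flatness_db(sent, rolled_off_received, test_freqs), 2))
```

In this sketch the first channel lowers the overall volume yet keeps the balance between frequencies, so its spread is essentially zero. The second channel halves only the highest tone, which shows up as a spread of about 6 dB, and that kind of uneven response is what changes the spectrum the recognizer sees.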
