The problem of controlling a machine by a distant-talking speaker, without the need for handheld or body-worn equipment, is considered. A laboratory setup is introduced to examine the performance of the developed automatic speech recognition system when fed with direct speech and with distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). Mel-Frequency Cepstral Coefficients (MFCC) are used for feature extraction from the voice signal. The experiments employ the HTK engine (Hidden Markov Model Toolkit) for the Automatic Speech Recognition (ASR) task. A dictionary of 184 words was employed, and WER (Word Error Rate), correctness, and accuracy measures were calculated to verify and compare the speech recognition results.
Authors: Bratoszewski, Piotr; Szykulski, Marcin; Czyzewski, Andrzej
Affiliation: Gdansk University of Technology, Gdansk, Poland
AES Convention: 138 (May 2015); eBrief: 194
Publication Date: May 5, 2015
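
The abstract names three evaluation measures: WER, correctness, and accuracy. As a rough illustration only, the sketch below computes them from a reference and a recognized word sequence, using the definitions commonly associated with HTK-style scoring: correctness = H/N, accuracy = (H - I)/N, and WER = (S + D + I)/N, where N is the number of reference words, H the hits, S the substitutions, D the deletions, and I the insertions. The function names, the unit edit costs, and the example sentences are assumptions made for this sketch. They are not taken from the paper, and HTK's own alignment may weight errors differently.

```python
def align_counts(ref, hyp):
    """Count hits, substitutions, deletions, insertions from a
    minimum edit-distance alignment (unit costs: an assumption)."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the end of the table to classify each step.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            if ref[i - 1] == hyp[j - 1]:
                hits += 1
            else:
                subs += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1          # reference word missing from the output
            i -= 1
        else:
            ins += 1           # extra word in the recognizer output
            j -= 1
    return hits, subs, dels, ins


def recognition_scores(ref, hyp):
    """Return correctness, accuracy and WER as percentages."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    hits, subs, dels, ins = align_counts(ref, hyp)
    n = len(ref)
    return {
        "correctness": 100.0 * hits / n,
        "accuracy": 100.0 * (hits - ins) / n,
        "wer": 100.0 * (subs + dels + ins) / n,
    }


# Hypothetical command and recognizer output, for illustration only.
reference = "turn the machine on now".split()
recognized = "turn machine off now please".split()
scores = recognition_scores(reference, recognized)
```

Under these definitions, accuracy and WER always sum to 100%, while correctness ignores insertions and so can never be lower than accuracy.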