The problem of controlling a machine by the distant-talking speaker without a necessity of handheld or body-worn equipment usage is considered. A laboratory setup is introduced for examination of performance of the developed automatic speech recognition system fed by direct and by distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). For feature extraction from the voice signal the Mel- Frequency Cepstral Coefficients (MFCC) are used. The experiments are conducted employing the HTK engine (Hidden Markov Toolkit) for the Automatic Speech Recognition (ASR) task. The dictionary of 184 words was employed and WER (Word Error Rate), correctness and accuracy measures were calculated in order to verify and to compare obtained results of speech recognition.
Authors:
Bratoszewski, Piotr; Szykulski, Marcin; Czyzewski, Andrzej
Affiliation:
Gdansk University of Technology, Gdansk, Poland
AES Convention:
138 (May 2015)
eBrief:194
Publication Date:
May 5, 2015
Click to purchase paper as a non-member or you can login as an AES member to see more options.
No AES members have commented on this paper yet.
To be notified of new comments on this paper you can subscribe to this RSS feed. Forum users should login to see additional options.
If you are not yet an AES member and have something important to say about this paper then we urge you to join the AES today and make your voice heard. You can join online today by clicking here.