A noise-robust end point detection (EPD) algorithm is proposed for speech recognition in real environments. Inaccurate end point detection not only degrades speech recognition performance but also tires users. EPD algorithms based on energy level changes or speech presence probability are vulnerable to high-energy noise. After an auditory filter removes much of the noise, the syllabic rate, a characteristic of human speech production, is used to check whether a speech component is still present. The proposed algorithm shows much better performance in real environments with noise such as TV sound and café noise.
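The abstract gives no implementation details, so the following is only a rough sketch of how a syllabic-rate check might look, not the authors' method. It assumes the signal has already been noise-reduced (the auditory filter stage is omitted) and measures how much of the energy envelope's fluctuation falls in a syllabic band. The function names, the 10 ms frame length, the 2 to 8 Hz band, and the 0.5 decision threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def syllabic_band_ratio(signal, sample_rate, frame_ms=10.0, band=(2.0, 8.0)):
    """Fraction of envelope-modulation energy inside an assumed syllabic band.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    samples = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000.0)
    n_frames = len(samples) // frame_len if frame_len > 0 else 0
    if n_frames < 2:
        return 0.0

    # Short-time energy envelope: one mean-square value per frame.
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    envelope = np.mean(frames ** 2, axis=1)

    # Remove the mean so only fluctuations of the envelope remain.
    envelope = envelope - envelope.mean()

    # Modulation spectrum of the envelope (the envelope is sampled once per frame).
    spectrum = np.abs(np.fft.rfft(envelope)) ** 2
    freqs = np.fft.rfftfreq(n_frames, d=frame_ms / 1000.0)

    total = spectrum[1:].sum()  # skip the DC bin
    if total == 0.0:
        return 0.0
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(spectrum[in_band].sum() / total)


def speech_still_present(signal, sample_rate, threshold=0.5):
    """Hypothetical decision rule: speech is assumed present when the
    syllabic-band ratio exceeds an illustrative threshold."""
    return syllabic_band_ratio(signal, sample_rate) > threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sr = 16000
    t = np.arange(2 * sr) / sr
    # Stationary noise versus noise amplitude-modulated at 4 Hz,
    # a crude stand-in for a syllable-like rhythm.
    stationary = rng.standard_normal(t.size)
    rhythmic = (1.0 + 0.9 * np.sin(2 * np.pi * 4.0 * t)) * rng.standard_normal(t.size)
    print("stationary:", speech_still_present(stationary, sr))
    print("rhythmic:  ", speech_still_present(rhythmic, sr))
```

In this sketch a stationary noise segment spreads its envelope fluctuations across all modulation frequencies, while a segment with syllable-like rhythm concentrates them in the low band, which is the kind of distinction an energy-level detector alone would miss when the noise is loud.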
Authors:
Jeong, Jae-hoon; Kwon, Min Seok; Lee, Seungyeol; Lee, Young Woo; Mori, Haruyuki; Cho, Namgook; Lee, Jae Won
Affiliation:
DMC R&D Center, Samsung Electronics Co., Suwon, Gyeonggi-do, Korea
AES Convention:
139 (October 2015)
eBrief:
233
Publication Date:
October 23, 2015