Speaker recognition systems can typically attain high performance in ideal conditions. However, accuracy degrades significantly in channel-mismatched scenarios. Non-stationary environmental noises and their variations are among the foremost challenges in speaker recognition. The Gammatone frequency cepstral coefficient (GFCC) method has been developed to improve the robustness of speaker recognition. This paper presents systematic comparisons between the performance of GFCC-based and conventional MFCC-based speaker verification systems, using a purposely collected noisy speech data set. Furthermore, the current work extends the experiments to include investigations into language independence in the recognition phase. The results show that GFCC has better verification performance than MFCC in noisy environments. However, GFCC shows a higher sensitivity to language mismatch between the enrollment and recognition phases.
Authors: Al-Noori, Ahmed; Li, Francis F.; Duncan, Philip J.
Affiliation: University of Salford, Salford, Greater Manchester, UK
AES Convention: 140 (May 2016)
Paper Number: 9577
Publication Date: May 26, 2016
Subject: Human Factors and Interfaces