The goal of speech enhancement is to make speech more pleasant and understandable by improving one or more of its perceptual aspects, such as quality or intelligibility. This paper addresses single-channel speech enhancement. The authors explore an improved multiband spectral subtraction method based on the equivalent rectangular bandwidth (ERB) scale. In the proposed algorithm, the full speech spectrum is divided into nonuniform frequency bands, and spectral subtraction is performed separately in each band. Moreover, subband spectral entropy is used directly for noise estimation instead of speech endpoint detection. The ERB scale is adopted in the subband spectral entropy in place of the traditional linear scale or the Bark scale. Subband spectral entropy based on the ERB scale yields a more accurate noise estimate, which in turn gives better single-channel speech enhancement. Speech spectrograms, objective measures, and informal subjective listening tests show that the proposed algorithm suppresses remnant noise more than Upadhyay’s algorithm does.
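As a rough illustration of the ERB-scale subband spectral entropy mentioned above, the following minimal sketch splits a power spectrum into bands that are equally wide on an ERB-rate scale and computes the entropy of the normalized band energies. It is not the authors' implementation. The abstract does not give the formulas, so the Glasberg and Moore ERB-rate approximation, the default of 8 bands, and all function names here are assumptions made for illustration only. How the entropy value drives the noise estimate is likewise not specified in the abstract and is left out.

```python
import math


def hz_to_erb_rate(f_hz):
    """ERB-rate (ERB number) for a frequency in Hz.

    Assumption: the Glasberg-Moore approximation; the paper may use another form.
    """
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437


def erb_band_edges(f_max_hz, n_bands):
    """Band edges in Hz, equally spaced on the ERB-rate scale from 0 to f_max_hz."""
    top = hz_to_erb_rate(f_max_hz)
    return [erb_rate_to_hz(top * i / n_bands) for i in range(n_bands + 1)]


def subband_spectral_entropy(power_spectrum, sample_rate, n_bands=8):
    """Entropy (in nats) of the energy distribution over ERB-spaced subbands.

    power_spectrum: non-negative powers for bins spanning 0 Hz to sample_rate / 2.
    n_bands: hypothetical default, not taken from the paper.
    """
    n_bins = len(power_spectrum)
    if n_bins < 2:
        raise ValueError("need at least two frequency bins")
    nyquist = sample_rate / 2.0
    edges = erb_band_edges(nyquist, n_bands)

    # Accumulate the power of each bin into the subband that contains it.
    energies = [0.0] * n_bands
    for k, power in enumerate(power_spectrum):
        f = k * nyquist / (n_bins - 1)
        band = n_bands - 1  # the Nyquist bin falls into the last band
        for b in range(n_bands):
            if f < edges[b + 1]:
                band = b
                break
        energies[band] += power

    total = sum(energies)
    if total <= 0.0:
        return 0.0

    # Normalize band energies to a probability distribution and take its entropy.
    probs = [e / total for e in energies]
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

In this sketch, a frame whose energy is concentrated in one subband gives an entropy near zero, while a frame whose energy is spread across the subbands gives a higher value. An entropy-based noise estimator would rely on that contrast between frames.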
Wei, Yi; Zeng, Yumin; Li, Chen
Affiliations: School of Physics and Technology, Nanjing Normal University, Nanjing, China; Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, China; State Key Laboratory Cultivation Base of Geographical Environment Evolution (Jiangsu Province), Nanjing, China; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing, China (See document for exact affiliation information.)
JAES Volume 66 Issue 3 pp. 100-113; March 2018
Publication Date: March 19, 2018