AES Convention Papers Forum

A Generalized Subspace Approach for Multichannel Speech Enhancement Using Machine Learning-Based Speech Presence Probability Estimation

A generalized subspace-based multichannel speech enhancement method in the frequency domain is proposed, in which the multichannel speech presence probability is estimated using machine learning methods. An efficient, low-latency neural network (NN) model is introduced to discriminatively learn a gain mask that separates the speech and noise components in noisy scenarios. In addition, a generalized subspace-based approach in the frequency domain is proposed, in which the speech power spectral density (PSD) matrix and the noise PSD matrix are estimated using short-term and long-term averaging periods, respectively. Experimental results show that the proposed method outperforms existing NN-based beamforming methods in terms of the perceptual evaluation of speech quality (PESQ) score and the segmental signal-to-noise ratio improvement.
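
The abstract gives no equations, so the following is only a rough sketch of how mask-weighted PSD estimation and a generalized subspace filter could fit together for a single frequency bin. It is not the authors' implementation. The function names recursive_psd and generalized_subspace_filter, the recursive-averaging form, the smoothing factor alpha, and the trade-off parameter mu are all assumptions made for illustration. The speech presence probability is taken as given, since the NN model itself is not described here. A smaller alpha would correspond to short-term averaging (speech) and a larger alpha to long-term averaging (noise).

```python
import numpy as np

def recursive_psd(frames, weights, alpha):
    """Recursively averaged PSD matrix for one frequency bin (assumed form).

    frames:  (T, M) complex STFT coefficients, T frames and M microphones
    weights: (T,) values in [0, 1], e.g. the speech presence probability p
             for the speech PSD, or 1 - p for the noise PSD
    alpha:   smoothing factor in [0, 1); larger means longer averaging
    """
    num_mics = frames.shape[1]
    psd = np.zeros((num_mics, num_mics), dtype=complex)
    for y, w in zip(frames, weights):
        psd = alpha * psd + (1.0 - alpha) * w * np.outer(y, y.conj())
    return psd

def generalized_subspace_filter(phi_x, phi_v, mu=1.0, loading=0.0):
    """Filter matrix H built from the joint diagonalization of phi_x and phi_v.

    Finds B with B^H phi_v B = I and B^H phi_x B = diag(lam), then applies
    the gains lam / (lam + mu) in that basis. mu is a hypothetical
    noise-reduction trade-off; mu = 1 gives the multichannel Wiener filter.
    """
    num_mics = phi_v.shape[0]
    # Optional diagonal loading keeps the noise PSD positive definite.
    chol = np.linalg.cholesky(phi_v + loading * np.eye(num_mics))
    chol_inv = np.linalg.inv(chol)
    # Whitened speech PSD; enforce Hermitian symmetry against round-off.
    whitened = chol_inv @ phi_x @ chol_inv.conj().T
    whitened = 0.5 * (whitened + whitened.conj().T)
    lam, vecs = np.linalg.eigh(whitened)
    lam = np.maximum(lam, 0.0)  # clip small negative eigenvalues
    basis = chol_inv.conj().T @ vecs
    gains = lam / (lam + mu)
    # Enhanced multichannel speech estimate would be H @ y for a frame y.
    return np.linalg.inv(basis.conj().T) @ np.diag(gains) @ basis.conj().T
```

In this sketch the speech PSD could be obtained with recursive_psd(frames, p, alpha_short) and the noise PSD with recursive_psd(frames, 1 - p, alpha_long), where alpha_short < alpha_long are tuning values that would have to be chosen experimentally.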

Authors:
Affiliations:
AES Convention: Paper Number:
Publication Date:
Subject:

AES - Audio Engineering Society