Single-channel speech source separation (SCSSS) is a research field with applications that include hearing aids and security. This research presents a hybrid method for SCSSS that selects between two approaches according to the voicing state; the algorithm can be used for both speech source separation and speech enhancement. The hybrid method applies Soft-CASA (Computational Auditory Scene Analysis) to voiced speech and subspace decomposition to unvoiced speech. The voiced speech separation process is an improved version of the conventional CASA system, optimized by the use of a soft mask. The unvoiced speech separation process relies on an optimized approximation of the speech signal by subspace decomposition in the spectral domain. The new system is evaluated for both its speech separation performance and its voicing decisions. Despite the challenging acoustic environments used for testing, the proposed speech separation approach yields average improvements of 58.91% in signal-to-interference ratio, 12.67% in signal-to-artifact ratio, 38.91% in signal-to-distortion ratio, and 45% in perceived speech quality.
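The voicing-based dispatch described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`soft_mask`, `low_rank_approx`, `hybrid_separate`), the ratio form of the soft mask, and the use of a truncated SVD as a stand-in for subspace decomposition are all assumptions made for illustration. The per-frame voicing flags and the target and interference magnitude estimates are taken as given inputs; in a CASA system they would come from pitch-based analysis, which is not shown here.

```python
import numpy as np

def soft_mask(target_est, interf_est, eps=1e-12):
    """Ratio (soft) mask in [0, 1] built from magnitude estimates.

    Assumed form: target energy / (target energy + interference energy).
    A conventional binary mask would instead be (target_est > interf_est).
    """
    t2 = np.square(target_est)
    i2 = np.square(interf_est)
    return t2 / (t2 + i2 + eps)

def low_rank_approx(mag, rank):
    """Rank-`rank` approximation of a magnitude spectrogram via truncated SVD.

    Used here only as a simple stand-in for subspace decomposition.
    """
    u, s, vt = np.linalg.svd(mag, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return np.maximum(approx, 0.0)  # keep magnitudes non-negative

def hybrid_separate(mix_mag, target_est, interf_est, voiced, rank=2):
    """Frame-wise hybrid: soft mask on voiced frames, low-rank model on unvoiced.

    mix_mag, target_est, interf_est: arrays of shape (freq_bins, frames).
    voiced: boolean array of shape (frames,), True where a frame is voiced.
    """
    voiced = np.asarray(voiced, dtype=bool)
    out = np.zeros(mix_mag.shape, dtype=float)

    # Voiced frames: weight the mixture by the soft mask.
    mask = soft_mask(target_est[:, voiced], interf_est[:, voiced])
    out[:, voiced] = mask * mix_mag[:, voiced]

    # Unvoiced frames: approximate the mixture in a low-dimensional subspace.
    if np.any(~voiced):
        unvoiced_mag = mix_mag[:, ~voiced]
        k = min(rank, *unvoiced_mag.shape)
        out[:, ~voiced] = low_rank_approx(unvoiced_mag, k)
    return out
```

The soft mask assigns each time-frequency cell a weight between 0 and 1 rather than a hard keep-or-discard decision, which is the sense in which a soft mask differs from the binary mask of a conventional CASA system.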
Authors: Wiem, Belhedi; Anouar, Ben Messaoud Mohamed; Aicha, Bouzid
Affiliation: Université de Tunis El Manar, National School of Engineers, Electric Department. Le Belvédère, Tunis, Tunisia
JAES Volume 66 Issue 12 pp. 1041-1050; December 2018
Publication Date: December 20, 2018