In this paper we present a scheme for unsupervised extraction of sound objects, or sources, from a single recording containing a mixture of sounds. Separation is performed by orthogonal projection of the mixed sound onto subspaces derived by clustering transform coefficients, such as those obtained by PCA or ICA. The clustering step reveals residual non-linear grouping structure in the signal that the linear transform does not capture. To achieve independence, we search for a partitioning that maximizes the mutual information between a component and the set to which it belongs. This information is obtained from a pairwise distance measure among all coefficients. Source separation experiments are reported in the paper.
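The pipeline outlined in the abstract (linear transform, pairwise distances among coefficients, clustering, orthogonal projection onto each cluster's subspace) can be illustrated with a minimal sketch. It is not the paper's method: the distance used here (one minus the absolute correlation of coefficient magnitudes) and the average-linkage clustering are simple stand-ins assumed for illustration in place of the mutual-information criterion, and the function names, frame length, and synthetic test signal are all hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len):
    """Cut a 1-D signal into non-overlapping frames (rows)."""
    n_frames = len(x) // frame_len
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)


def pca_basis(frames):
    """Return an orthonormal basis (rows) from the SVD of the frame matrix."""
    _, _, vt = np.linalg.svd(frames, full_matrices=False)
    return vt


def pairwise_distance(coeffs):
    """Assumed stand-in distance: 1 - |correlation| of coefficient magnitudes.

    PCA coefficients are uncorrelated, but their magnitudes may still
    co-vary; that residual dependence is what gets clustered here.
    """
    corr = np.nan_to_num(np.corrcoef(np.abs(coeffs), rowvar=False))
    return 1.0 - np.abs(corr)


def cluster_components(dist, n_groups):
    """Plain average-linkage agglomerative clustering on a distance matrix."""
    clusters = [[i] for i in range(dist.shape[0])]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def project(frames, vt, group):
    """Orthogonally project every frame onto the span of the chosen basis rows."""
    basis = vt[group, :]
    return frames @ basis.T @ basis


# Synthetic single-channel mixture (hypothetical test signal):
# a sinusoid in the first half, noise starting a quarter of the way in.
rng = np.random.default_rng(0)
t = np.arange(4096)
mixture = (np.sin(2 * np.pi * 0.05 * t) * (t < 2048)
           + 0.5 * rng.standard_normal(t.size) * (t >= 1024))

frames = frame_signal(mixture, 16)
vt = pca_basis(frames)
coeffs = frames @ vt.T                      # transform coefficients per frame
groups = cluster_components(pairwise_distance(coeffs), 2)
parts = [project(frames, vt, g) for g in groups]
```

Because the basis rows are orthonormal and the groups partition them, the projected parts are mutually orthogonal and add back up to the original frames. Whether each part corresponds to one perceptual source depends on the distance measure and clustering criterion, which is the question the paper addresses.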
Author: Dubnov, Shlomo
Affiliation: Department of Communication Systems Engineering, Ben Gurion University, Beer Sheva, Israel
AES Conference: 22nd International Conference: Virtual, Synthetic, and Entertainment Audio (June 2002)
Paper Number: 000252
Publication Date: June 1, 2002
Subject: Virtual, Synthetic and Entertainment Audio