Separating the singing voice from accompanying instruments is important in music information retrieval systems because it enables applications such as melody extraction, lyrics recognition, and singer identification. The authors investigate an unsupervised method for singing-voice separation called H-Semantics (Hybrid Singing Extraction through Multiband Amplitude Enhanced Thresholding and Independent Component Subtraction). The proposed method adds time-domain separation to previous work that was based on frequency-domain cepstral methods. The results indicate an improvement of approximately 8.5 dB in signal-to-distortion ratio over the baseline.
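For readers unfamiliar with the signal-to-distortion ratio quoted in the abstract, the following is a minimal sketch of how a gain in that ratio over a baseline can be computed. It assumes the simplest energy-ratio definition, 10·log10(‖s‖² / ‖s − ŝ‖²), which may differ from the evaluation procedure used in the paper. The function name `sdr_db` and the toy signals are hypothetical and chosen only for illustration.

```python
import math


def sdr_db(reference, estimate):
    """Simplified signal-to-distortion ratio in dB.

    Assumption: plain energy ratio between the reference signal and the
    estimation error, not the paper's own evaluation procedure.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0:
        return math.inf   # perfect reconstruction
    if signal_energy == 0:
        return -math.inf  # silent reference, nonzero error
    return 10.0 * math.log10(signal_energy / error_energy)


# Hypothetical toy data: a "voice" tone plus residual "accompaniment" leakage.
n_samples = 100
voice = [math.sin(2 * math.pi * 5 * n / n_samples) for n in range(n_samples)]
leak = [math.sin(2 * math.pi * 13 * n / n_samples) for n in range(n_samples)]

# The baseline estimate leaks more accompaniment than the improved estimate.
baseline_est = [v + 0.5 * x for v, x in zip(voice, leak)]
improved_est = [v + 0.1 * x for v, x in zip(voice, leak)]

# Gain over the baseline, in dB (the kind of figure the abstract reports).
gain_db = sdr_db(voice, improved_est) - sdr_db(voice, baseline_est)
print(f"SDR gain over baseline: {gain_db:.2f} dB")
```

In this toy case the gain depends only on how much the residual leakage is reduced, since the reference signal is the same for both estimates.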
Authors:
Sofianos, Stratis; Ariyaeeinia, Aladdin; Polfreman, Richard; Sotudeh, Reza
Affiliations:
University of Hertfordshire, Hatfield, Hertfordshire, UK; University of Southampton, Southampton, UK (See document for exact affiliation information.)
JAES Volume 60 Issue 10 pp. 831-841; October 2012
Publication Date:
November 26, 2012