Most recent studies on vocal detection in audio recordings focus either on developing new features or on classification methods. The impact of training and test data is largely neglected, leading to weaknesses in database design: existing databases do not cover differences in vocal technique across music genres. In this paper, we compare approaches to singing voice detection on individual genres. For the two best-performing methods, we further investigate the impact of genre-disjoint distributions of training and test tracks. In particular, tracks of electronic genres, which are barely represented in public databases for vocal recognition, contribute to better classification performance when identifying vocals in tracks of other genres.
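The genre-disjoint evaluation described above can be sketched as a simple partitioning step: tracks whose genre belongs to a held-out set go to the test set, all others to the training set, so no genre appears on both sides. This is an illustrative sketch only; the function and data names are assumptions, not taken from the paper.

```python
def genre_disjoint_split(tracks, test_genres):
    """Split (track_id, genre) pairs so training and test sets
    share no genre. `test_genres` is the set of genres reserved
    for testing. Names here are hypothetical, for illustration."""
    train, test = [], []
    for track_id, genre in tracks:
        # Route each track by genre membership, never by track identity.
        (test if genre in test_genres else train).append(track_id)
    return train, test

# Example: hold out electronic tracks, as in the paper's scenario of
# testing whether electronic genres help classify vocals elsewhere
# (here inverted for brevity: electronic tracks form the held-out set).
tracks = [("t1", "pop"), ("t2", "electronic"),
          ("t3", "rock"), ("t4", "electronic")]
train, test = genre_disjoint_split(tracks, {"electronic"})
```

In this toy example, `train` contains `t1` and `t3` while `test` contains the two electronic tracks, guaranteeing no genre overlap between the two sets.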
Authors:
Scholz, Florian; Vatolkin, Igor; Rudolph, Günter
Affiliation:
Technische Universität Dortmund, Dortmund, Germany
AES Conference:
2017 AES International Conference on Semantic Audio (June 2017)
Paper Number:
P2-1
Publication Date:
June 13, 2017
Subject:
Semantic Audio