This work presents a spectrogram factorisation method applied to automatic music transcription of a cappella performances with multiple singers. A variable-Q transform representation of the audio spectrogram is factorised with the help of a 6-dimensional sparse dictionary which contains spectral templates of vowel vocalisations. A post-processing step is proposed to remove false positive pitch detections through a binary classifier that takes overtone-based features as input. Preliminary experiments have shown promising multi-pitch detection results when applied to audio recordings of Bach chorales and barbershop music. Comparisons with alternative methods have shown that our approach increases the number of true positive pitch detections, while the post-processing step keeps the number of false positives lower than that measured for the compared approaches.
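To make the idea of factorising a spectrogram against a fixed template dictionary concrete, the following is a minimal sketch and not the authors' model: it uses plain non-negative matrix factorisation with multiplicative updates for the Kullback-Leibler divergence, with a 2-dimensional dictionary instead of the 6-dimensional one described above. The function name `estimate_activations` and its parameters are hypothetical and chosen for illustration only.

```python
import numpy as np


def estimate_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate non-negative activations H such that V is approximately W @ H.

    Illustrative assumption: generic KL-divergence NMF with a FIXED dictionary,
    not the paper's multi-dimensional factorisation model.

    V : (F, T) non-negative magnitude spectrogram (e.g. a variable-Q transform).
    W : (F, K) non-negative spectral templates, one column per pitch template.
    Returns H : (K, T) activations of each template over time.
    """
    V = np.asarray(V, dtype=float)
    W = np.asarray(W, dtype=float)
    n_templates = W.shape[1]
    n_frames = V.shape[1]

    # Start from a flat, strictly positive activation matrix.
    H = np.full((n_templates, n_frames), 1.0 / n_templates)

    # Column sums of W normalise the multiplicative update.
    col_sums = W.sum(axis=0)[:, None] + eps

    for _ in range(n_iter):
        V_hat = W @ H + eps                  # current reconstruction
        H *= (W.T @ (V / V_hat)) / col_sums  # KL multiplicative update
    return H
```

In such a sketch, each row of `H` would be thresholded to obtain candidate pitch detections per frame; a separate classifier, such as the one the abstract describes, would then be needed to prune false positives.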
Authors:
Schramm, Rodrigo; Benetos, Emmanouil
Affiliations:
Federal University of Rio Grande do Sul (UFRGS), Brazil; Queen Mary University of London, London, UK (See document for exact affiliation information.)
AES Conference:
2017 AES International Conference on Semantic Audio (June 2017)
Paper Number:
3-2
Publication Date:
June 13, 2017
Subject:
Pitch Tracking