This paper describes a novel, low-cost method for combining time-frequency representations into a sparser one. To this end, a new local quality measure based on an amplitude-weighted version of the so-called Hoyer sparsity is proposed. The time-frequency resolution attained is assessed through a detailed evaluation procedure that employs a dataset of melodic signals with nearly perfect f0 annotations and a set of white-noise pulses. The proposed method is shown to produce state-of-the-art results among existing combination methods in terms of energy concentration at frequency contours, onsets, and offsets, while meeting the most desirable requirements: high time-frequency resolution, low computational cost, and the capability of combining representations with nonlinear frequency scales.
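For readers unfamiliar with the sparsity measure named in the abstract, the following is a minimal sketch of the standard (unweighted, global) Hoyer sparsity of a vector. It is not the amplitude-weighted local measure proposed in the paper, whose definition the abstract does not give. The function name `hoyer_sparsity` and the convention of returning 0.0 for empty, single-element, or all-zero inputs are assumptions made for this illustration.

```python
import math

def hoyer_sparsity(values):
    """Standard Hoyer sparsity: (sqrt(n) - L1/L2) / (sqrt(n) - 1).

    Returns 1.0 when all energy sits in a single element and 0.0 when
    all elements have equal magnitude. Degenerate inputs (fewer than two
    elements, or all zeros) return 0.0 by an assumed convention.
    """
    magnitudes = [abs(v) for v in values]
    n = len(magnitudes)
    l2 = math.sqrt(sum(m * m for m in magnitudes))
    if n < 2 or l2 == 0.0:
        return 0.0
    l1 = sum(magnitudes)
    root_n = math.sqrt(n)
    return (root_n - l1 / l2) / (root_n - 1.0)
```

Applied to one time-frequency column, a value near 1 would indicate energy concentrated in few bins, while a value near 0 would indicate energy smeared across many bins.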
Authors:
da Costa, Maurício do V. M.; Biscainho, Luiz W. P.
Affiliations:
Music Technology and Digital Musicology Lab, Institute for Musicology and Music Pedagogy, Osnabrück University, Osnabrück, Germany; Signal Multimedia and Telecommunications Laboratory, DEL/Poli & PEE/COPPE, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil (See document for exact affiliation information.)
JAES Volume 70 Issue 9 pp. 698-707; September 2022
Publication Date:
September 13, 2022