Perceptually enhanced chroma feature extraction during the HE-AAC audio encoding process is proposed. The extraction of chroma features from the encoder's MDCT-domain spectra, and their further enhancement using the encoder's perceptual model, are investigated. The main advantage of such a scheme is reduced computational complexity when both chroma feature extraction and encoding are desired. Specifically, the system is designed to produce reliable chroma features irrespective of the encoder's block-switching decision. Three methods are discussed to circumvent the poor frequency resolution of short blocks. All proposed enhancements are evaluated systematically within a well-known, state-of-the-art chord recognition framework.
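To make the basic idea concrete, the following is a minimal sketch of folding one frame of MDCT coefficients into a 12-bin chroma vector. It is not the authors' method: the function names, the frequency limits `f_min` and `f_max`, the 440 Hz tuning reference, and the nearest-semitone mapping are all assumptions made for illustration, and the perceptual enhancement and short-block handling described in the abstract are not modelled.

```python
import math


def mdct_bin_to_pitch_class(k, n_bins, sample_rate, ref_hz=440.0):
    """Map MDCT bin k to a pitch class 0..11 (0 = C, 9 = A).

    Assumes bin k of an n_bins-coefficient MDCT is centred at
    (k + 0.5) * sample_rate / (2 * n_bins) Hz.
    """
    freq = (k + 0.5) * sample_rate / (2.0 * n_bins)
    midi = 69.0 + 12.0 * math.log2(freq / ref_hz)  # MIDI note 69 = A4
    return int(round(midi)) % 12


def chroma_from_mdct(coeffs, sample_rate, f_min=55.0, f_max=5000.0):
    """Accumulate MDCT energy per pitch class and normalise to unit sum.

    f_min and f_max are hypothetical analysis limits, not values
    taken from the paper.
    """
    n_bins = len(coeffs)
    chroma = [0.0] * 12
    for k, c in enumerate(coeffs):
        freq = (k + 0.5) * sample_rate / (2.0 * n_bins)
        if f_min <= freq <= f_max:
            chroma[mdct_bin_to_pitch_class(k, n_bins, sample_rate)] += c * c
    total = sum(chroma)
    # An all-zero frame stays all-zero rather than dividing by zero.
    return [v / total for v in chroma] if total > 0 else chroma


# Usage: a single non-zero coefficient near 440 Hz in a 1024-coefficient
# long block at an assumed 44.1 kHz sample rate.
frame = [0.0] * 1024
frame[20] = 1.0
chroma = chroma_from_mdct(frame, 44100)
```

In this sketch a 1024-coefficient long block at 44.1 kHz has a bin spacing of about 21.5 Hz, whereas a 128-coefficient short block has a spacing of about 172 Hz, which is far wider than a semitone in the low and mid registers. That gap is the frequency-resolution problem the abstract refers to for short blocks.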
Authors:
Fink, Marco; Biswas, Arijit; Kellermann, Walter
Affiliations:
Dolby Germany GmbH, Nuremberg, Germany; University of Erlangen-Nuremberg; University of Erlangen-Nuremberg, Erlangen, Germany (See document for exact affiliation information.)
AES Convention:
132 (April 2012)
Paper Number:
8637
Publication Date:
April 26, 2012
Subject:
Analysis and Synthesis and Content Management