This paper proposes a novel dictionary learning approach that uses Mel-scale frequency warping to detect overlapped acoustic events. The study explores several dictionary learning schemes for improving the detection of overlapping acoustic events. The structure of nonnegative matrix factorization (NMF) is used to calculate the gain of each event contained in an overlapped signal because of its low computational load. We propose frequency warping for better sound representation and apply dictionary learning with a holistic representation, namely nonnegative K-SVD (NK-SVD), in order to resolve the basis-sharing problem raised by part-based representations. By using the Mel scale in dictionary learning, we show that low-frequency components carry more information than high-frequency components and that the signal can be handled with low complexity. The proposed holistic representation also avoids the permutation problem between different acoustic events. Based on these benefits, we confirm that the proposed combination of the Mel scale with NK-SVD delivers significantly better results than conventional methods.
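As a rough illustration of two building blocks named in the abstract, the sketch below warps a linear-frequency spectrum onto the Mel scale with triangular filters and then estimates per-event gains against a fixed dictionary using NMF-style multiplicative updates. It is not the authors' implementation: the function names, the common 2595·log10(1 + f/700) Mel formula, the triangular filter shape, the Euclidean update rule, and the toy dictionary are all assumptions made here, and NK-SVD dictionary learning itself is not shown.

```python
import math

def hz_to_mel(f_hz):
    # One common Mel formula (an assumption; the paper may use another variant).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_bins, sample_rate):
    """Return n_mels rows of n_bins triangular weights, evenly spaced in Mel."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_freqs = [nyquist * k / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_freqs:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def warp(bank, spectrum):
    """Project a linear-frequency magnitude spectrum onto the Mel bands."""
    return [sum(w * s for w, s in zip(row, spectrum)) for row in bank]

def estimate_gains(dictionary, spectrum, n_iter=200, eps=1e-12):
    """Nonnegative gains h minimising ||v - W h||^2 with W held fixed.

    dictionary: list of rows, one per frequency band, each with one entry per atom.
    """
    n_bands, n_atoms = len(dictionary), len(dictionary[0])
    gains = [1.0] * n_atoms
    for _ in range(n_iter):
        # Reconstruction from the current gains, used by every atom's update.
        approx = [sum(dictionary[f][k] * gains[k] for k in range(n_atoms))
                  for f in range(n_bands)]
        for k in range(n_atoms):
            num = sum(dictionary[f][k] * spectrum[f] for f in range(n_bands))
            den = sum(dictionary[f][k] * approx[f] for f in range(n_bands))
            gains[k] *= num / (den + eps)
    return gains

# Toy usage: two hypothetical event atoms mixed with gains 2 and 3.
toy_dictionary = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
toy_mixture = [2.0, 3.0, 2.0]
toy_gains = estimate_gains(toy_dictionary, toy_mixture)
```

Because the filters are evenly spaced in Mel rather than in Hz, the low-frequency bands are narrow and the high-frequency bands are wide, so the warped spectrum has far fewer rows than the original while keeping finer resolution at low frequencies. In an overlapped-event setting, each column of the dictionary would stand for one event class, and a gain well above zero would suggest that the event is present in the frame.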
Authors:
Choi, Hyeonsik; Lee, Keunsang; Keum, Minseok; Han, David; Ko, Hanseok
Affiliations:
LG Electronics, South Korea; SELVAS AI, South Korea; Army Research Laboratory, Adelphi, MD, USA; Korea University, Seoul, South Korea (See document for exact affiliation information.)
AES Convention:
149 (October 2020)
Paper Number:
10395
Publication Date:
October 22, 2020
Subject:
Audio Processing
This paper is Open Access, which means you can download it for free.