In this paper we describe an online system for broadcast content that detects sound scene changes with high accuracy. The system is unsupervised and requires no prior information about the segment classes. A scene change probability score is computed for each frame of the signal using a hybrid approach that combines a model-based segmentation method (Gaussian mixture model) with a distance-based one (Hotelling’s T²-statistic). The mixture model parameters are adapted online using the previous frames of the signal. Experiments on real recordings show that the system achieves more than 85% correct segment change detection with only 16% false detections.
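To illustrate only the distance-based half of the approach, the following is a minimal sketch of a two-sample Hotelling's T² score computed between the feature frames just before and just after each candidate frame. It is not the authors' implementation: the function names `hotelling_t2` and `t2_curve`, the window length `win`, the small ridge term added to the pooled covariance, and the synthetic features are all assumptions made for this example. The Gaussian mixture model component and the mapping from scores to a scene change probability are not shown.

```python
import numpy as np


def hotelling_t2(x, y, ridge=1e-6):
    """Two-sample Hotelling's T^2 between windows x and y (rows are frames)."""
    n1, n2 = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled covariance estimate of the two windows.
    pooled = ((n1 - 1) * np.cov(x, rowvar=False)
              + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    pooled = np.atleast_2d(pooled)
    # Assumption: a small ridge keeps the linear solve numerically stable.
    pooled = pooled + ridge * np.eye(pooled.shape[0])
    return (n1 * n2) / (n1 + n2) * float(diff @ np.linalg.solve(pooled, diff))


def t2_curve(features, win=50):
    """T^2 score per frame, comparing `win` frames before and after it."""
    scores = np.zeros(len(features))
    for t in range(win, len(features) - win):
        scores[t] = hotelling_t2(features[t - win:t], features[t:t + win])
    return scores


# Synthetic stand-in for per-frame audio features: the mean shifts at frame 200.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 4)),
    rng.normal(3.0, 1.0, size=(200, 4)),
])
scores = t2_curve(features, win=50)
peak = int(np.argmax(scores))  # expected to fall close to frame 200
```

In this sketch a large score marks a frame where the mean of the features differs strongly between the two adjacent windows. A real system would compute the features from audio and would need a decision rule, such as a threshold or peak picking, to turn the scores into detections.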
Authors: Sevkin, Gökhan; Craciun, Alexandra; Bäckström, Tom
Affiliations: International Audio Laboratories, Erlangen, Friedrich-Alexander-Universität (FAU), Erlangen, Germany; Aalto University, Aalto, Finland (See document for exact affiliation information.)
AES Conference: 2017 AES International Conference on Semantic Audio (June 2017)
Paper Number: P1-2
Publication Date: June 13, 2017
Subject: Semantic Audio