Binaural recording technology offers an inexpensive, portable solution for spatial audio capture. This paper proposes a full-sphere 2D localization method that uses the Model-Based Expectation-Maximization Source Separation and Localization system (MESSL). The localization model is trained on a full-sphere head-related transfer function (HRTF) dataset and produces localization estimates by maximum-likelihood estimation from frequency-dependent interaural parameters. The model’s robustness is assessed with environmental sounds and speech, using HRTF datasets that are either matched or mismatched between training and test data. Results show that the majority of sounds are localized correctly in the matched condition at low noise levels; in the mismatched condition, a “cone of confusion” arises, although lateral angles are still estimated effectively. The results also show a relationship between the spectral content of the test data and the performance of the proposed method.
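As a rough illustration of maximum-likelihood selection over frequency-dependent interaural parameters, the sketch below scores an observation against one template per candidate direction and returns the best match. It is not the paper's implementation and does not reproduce MESSL's expectation-maximization procedure. The independent Gaussian model per frequency band, the fixed standard deviations, the function names, and all template values are assumptions made for this example only.

```python
import math

def wrap_phase(x):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi

def log_likelihood(obs_ild, obs_ipd, tmpl_ild, tmpl_ipd,
                   sigma_ild=2.0, sigma_ipd=0.5):
    """Sum per-band Gaussian log-likelihoods (constant terms dropped).

    ILD values are in dB, IPD values in radians. The sigmas are
    illustrative assumptions, not values taken from the paper.
    """
    total = 0.0
    for o_ild, o_ipd, t_ild, t_ipd in zip(obs_ild, obs_ipd, tmpl_ild, tmpl_ipd):
        total -= 0.5 * ((o_ild - t_ild) / sigma_ild) ** 2
        # Phase differences are circular, so compare them after wrapping.
        total -= 0.5 * (wrap_phase(o_ipd - t_ipd) / sigma_ipd) ** 2
    return total

def localize(obs_ild, obs_ipd, templates):
    """Return the direction whose template maximizes the log-likelihood."""
    return max(
        templates,
        key=lambda d: log_likelihood(obs_ild, obs_ipd, *templates[d]),
    )

# Hypothetical templates: (azimuth, elevation) in degrees mapped to
# per-band (ILD, IPD) values. In practice these would be derived from
# an HRTF dataset; the numbers here are invented.
templates = {
    (0, 0): ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]),
    (90, 0): ([6.0, 10.0, 15.0], [1.0, 2.0, 3.0]),
    (-90, 0): ([-6.0, -10.0, -15.0], [-1.0, -2.0, -3.0]),
}

# A noisy observation near the (90, 0) template; the last IPD value
# has wrapped around from just above pi to just below -pi.
obs_ild = [5.5, 9.0, 14.0]
obs_ipd = [0.9, 2.1, -3.1]

estimate = localize(obs_ild, obs_ipd, templates)
print(estimate)
```

Summing log-likelihoods across bands treats each frequency as independent evidence, which is one simple way the spectral content of a test signal could influence the estimate: bands with little energy would contribute unreliable observations.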
Authors:
Hammond, Benjamin; Jackson, Philip J. B.
Affiliation:
University of Surrey, Guildford, Surrey, UK
AES Convention:
142 (May 2017)
Paper Number:
9771
Publication Date:
May 11, 2017
Subject:
Posters: Spatial Audio