AES Convention Papers Forum

A Machine Learning Approach to Detecting Sound-Source Elevation in Adverse Environments

Recent studies have shown that deep neural networks (DNNs) can detect sound-source azimuth in adverse environments with high accuracy. This paper expands on these findings by exploring the use of DNNs to determine sound-source elevation. A simple machine-hearing system is presented that predicts source elevation to a relatively high degree of accuracy in both anechoic and reverberant environments. Speech signals spatialized across the front hemifield of the head are used to train a feedforward neural network. The effectiveness of Gammatone Filter Energies (GFEs) and the Cross-Correlation Function (CCF) in estimating elevation is investigated, along with binaural cues such as Interaural Time Difference (ITD) and Interaural Level Difference (ILD). Using a combination of these cues, elevation could be predicted to within 10 degrees with an accuracy of over 80%.
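
As an illustration of the binaural cues named in the abstract, the following is a minimal sketch of how a CCF, an ITD, and an ILD might be computed for one frame of a two-channel signal. It is not the paper's implementation: the function name `binaural_cues`, the 1 ms lag window, the broadband (unfiltered) processing, and the synthetic test signal are all assumptions made for this example. A system like the one described would more plausibly compute such cues per gammatone band before passing them to the network.

```python
import numpy as np


def binaural_cues(left, right, fs, max_lag_ms=1.0):
    """Return (ccf, itd_seconds, ild_db) for one frame of a binaural signal.

    Hypothetical helper for illustration only. Sign convention (an assumption):
    a positive ITD means the left channel lags the right channel.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    eps = 1e-12  # guards against division by zero and log of zero

    # Restrict the search to physically plausible lags (assumed +/- 1 ms).
    max_lag = int(round(fs * max_lag_ms / 1000.0))

    # Full cross-correlation; zero lag sits at index len(right) - 1.
    full = np.correlate(left, right, mode="full")
    centre = len(right) - 1
    ccf = full[centre - max_lag:centre + max_lag + 1]

    # Normalise so the CCF is bounded by 1 in magnitude.
    ccf = ccf / (np.sqrt(np.sum(left ** 2) * np.sum(right ** 2)) + eps)

    # ITD: the lag (in seconds) at which the CCF peaks.
    lags = np.arange(-max_lag, max_lag + 1)
    itd = lags[np.argmax(ccf)] / fs

    # ILD: energy ratio between the two channels in decibels.
    ild = 10.0 * np.log10((np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps))
    return ccf, itd, ild


# Synthetic check: the left channel is the right channel delayed by
# 5 samples and attenuated by a factor of 0.5 (about -6 dB).
fs = 16000
rng = np.random.default_rng(0)
right = rng.standard_normal(1024)
left = 0.5 * np.concatenate([np.zeros(5), right[:-5]])

ccf, itd, ild = binaural_cues(left, right, fs)
print(f"ITD = {itd * 1e6:.1f} microseconds, ILD = {ild:.2f} dB")
```

In a classifier of this kind, the CCF vector and the ILD (together with band energies) would typically be concatenated into one feature vector per frame; how the paper combines them is not stated in the abstract.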

