Head-related transfer functions (HRTFs) are used to create the perception of a virtual sound source at a given horizontal (azimuth) angle and vertical (elevation) angle. Publicly available databases sample only a subset of the full grid of angular directions, owing to the time and complexity required to acquire and deconvolve the responses. In this paper we build on our prior research [5] by extending the technique to HRTF synthesis using the IRCAM dataset, while reducing the computational complexity of the autoencoder (AE) + fully connected neural network (FCNN) architecture by ~60% using Bayesian optimization. We also present listening test results from a pilot study designed to assess the directional cues of the proposed architecture, demonstrating the performance of the presented approach.
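The AE + FCNN pipeline described above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: all layer sizes, the angle normalization, and the function names are hypothetical, and the weights are random stand-ins for trained parameters, so it demonstrates the data flow (angles → latent code → decoded HRTF magnitude) rather than a working model.

```python
# Sketch: an autoencoder compresses HRTF magnitude responses to a
# low-dimensional latent code; a fully connected network maps
# (azimuth, elevation) to that code, and the AE decoder reconstructs
# the magnitude response for an unmeasured direction.
# All dimensions below are hypothetical; weights are untrained.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def dense(x, w, b):
    return x @ w + b

n_freq = 256    # HRTF magnitude bins (hypothetical)
n_latent = 16   # AE bottleneck size (hypothetical)

# Autoencoder decoder weights (random stand-ins for trained parameters)
W_dec = rng.standard_normal((n_latent, n_freq)) * 0.05
b_dec = np.zeros(n_freq)

# FCNN weights: (azimuth, elevation) -> latent code
W1 = rng.standard_normal((2, 64)) * 0.1
b1 = np.zeros(64)
W2 = rng.standard_normal((64, n_latent)) * 0.1
b2 = np.zeros(n_latent)

def synthesize_hrtf(azimuth_deg, elevation_deg):
    """Predict an HRTF magnitude response for a given direction."""
    angles = np.array([azimuth_deg / 360.0, elevation_deg / 90.0])
    latent = dense(relu(dense(angles, W1, b1)), W2, b2)
    return dense(latent, W_dec, b_dec)  # decoder output: magnitude bins

hrtf_mag = synthesize_hrtf(30.0, 0.0)
print(hrtf_mag.shape)  # (256,)
```

At inference time only the FCNN and the AE decoder are needed, which is one reason a compact bottleneck (found here via Bayesian optimization over layer sizes, per the abstract) reduces computational cost.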
Authors:
Bharitkar, Sunil G.; Mauer, Timothy; Wells, Teresa; Berfanger, David
Affiliations:
HP Labs, Inc., San Francisco, CA, USA; Prism Lab, HP, Inc., Vancouver, WA, USA
AES Convention:
146 (March 2019)
Paper Number:
10162
Publication Date:
March 10, 2019
Subject:
Machine Learning: Part 1