Ambisonics is a promising spatial sound technique for augmented and virtual reality. In our previous study, we modeled individual head-related transfer functions (HRTFs) using deep neural networks based on spatial principal component analysis. This paper proposes an individualized HRTF-based binaural renderer for higher-order Ambisonics. The binaural renderer is implemented by filtering the virtual loudspeaker signals with individualized HRTFs. We perform subjective experiments to evaluate generic and individualized binaural renderers. Results show that the individualized binaural renderer has significantly lower front-back confusion rates than the generic binaural renderer. We therefore validate that convolving the virtual loudspeaker signals with individualized HRTFs to generate virtual sound at an arbitrary spatial direction performs better than convolving them with generic HRTFs. In addition, by measuring or modeling an individual’s HRTFs in a small set of directions, the proposed binaural renderer system effectively predicts that individual’s HRTFs in arbitrary spatial directions.
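The filtering step named in the abstract can be illustrated with a minimal sketch. It assumes the Ambisonic signal has already been decoded to virtual loudspeaker feeds and that each loudspeaker direction has a time-domain HRTF pair (a head-related impulse response, HRIR) for the left and right ears. The function names `convolve` and `render_binaural` and the data layout are hypothetical and are not taken from the paper; the decoding stage and the HRTF individualization model are not shown.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += x * h
    return out


def render_binaural(speaker_signals, hrirs):
    """Mix virtual loudspeaker feeds down to two ear signals.

    speaker_signals: list of equal-length sample lists, one per loudspeaker
                     (assumed layout, hypothetical).
    hrirs: list of (left_hrir, right_hrir) pairs, one per loudspeaker,
           all of the same length (assumed layout, hypothetical).
    Returns (left, right), each of length num_samples + hrir_length - 1.
    """
    if len(speaker_signals) != len(hrirs):
        raise ValueError("need one HRIR pair per virtual loudspeaker")
    out_len = len(speaker_signals[0]) + len(hrirs[0][0]) - 1
    left = [0.0] * out_len
    right = [0.0] * out_len
    for feed, (h_left, h_right) in zip(speaker_signals, hrirs):
        # Each ear signal is the sum over loudspeakers of feed * HRIR.
        for idx, value in enumerate(convolve(feed, h_left)):
            left[idx] += value
        for idx, value in enumerate(convolve(feed, h_right)):
            right[idx] += value
    return left, right


# Toy data: two virtual loudspeakers with two-tap HRIRs.
feeds = [[1.0, 2.0, 3.0], [1.0, 0.0, 0.0]]
hrir_pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.5, 0.0], [0.5, 0.0])]
left_ear, right_ear = render_binaural(feeds, hrir_pairs)
```

In this sketch, swapping a generic HRIR set for an individualized one changes only the `hrirs` argument; the rendering loop stays the same. A practical renderer would likely use FFT-based convolution for long filters.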
Authors:
Zhang, Mengfan; Guan, Tianyi; Chen, Lianwu; Fu, Tianxiao; Su, Dan; Qu, Tianshu
Affiliations:
Key Laboratory on Machine Perception (Ministry of Education), Speech and Hearing Research Center, Peking University, China; Tencent AI Lab, Shenzhen, China (See document for exact affiliation information.)
AES Convention:
150 (May 2021)
Paper Number:
10454
Publication Date:
May 24, 2021
Subject:
HRTF