AR/VR applications commonly face difficulties binaurally spatializing many sound sources at once because of computational constraints. Existing techniques for efficient binaural rendering, such as Ambisonics, Vector-Based Amplitude Panning, or Principal Component Analysis, alleviate this issue by approximating Head-Related Transfer Function (HRTF) datasets with a linear combination of basis filters. This paper proposes a novel binaural renderer that convolves each basis filter with a layer of low-order finite impulse response filters applied in the time domain and derives both the spatial functions and filter coefficients through the minimization of a perceptually motivated error function. In a MUSHRA test, expert listeners had more difficulty differentiating the proposed method from the HRTF dataset it approximates than they did with existing methods configured with an equivalent number of Fast Fourier Transforms and identical HRTF preprocessing. This was consistent across both an internal Microsoft HRTF dataset and an individual head from the SADIE database.
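The efficiency argument behind basis-filter rendering can be illustrated with a minimal sketch. It assumes a single ear, time-domain filters, and made-up gains and basis filters; the function names (`convolve`, `render_per_source`, `render_via_basis`) are hypothetical and not taken from the paper. It shows only the shared linear-combination idea (mixing N sources into K channels and convolving once per basis filter), not the proposed renderer's additional FIR layer or its perceptually motivated optimization.

```python
import random

def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def render_per_source(sources, gains, basis):
    """Reference path: one convolution per source.

    Each source's filter is the linear combination sum_k gains[n][k] * basis[k].
    """
    taps = len(basis[0])
    out = [0.0] * (len(sources[0]) + taps - 1)
    for x, g in zip(sources, gains):
        h = [sum(gk * bk[t] for gk, bk in zip(g, basis)) for t in range(taps)]
        for t, v in enumerate(convolve(x, h)):
            out[t] += v
    return out

def render_via_basis(sources, gains, basis):
    """Efficient path: mix sources into K channels, then one convolution per basis filter."""
    n_in = len(sources[0])
    out = [0.0] * (n_in + len(basis[0]) - 1)
    for k, bk in enumerate(basis):
        mix = [sum(g[k] * x[t] for x, g in zip(sources, gains)) for t in range(n_in)]
        for t, v in enumerate(convolve(mix, bk)):
            out[t] += v
    return out

# Illustrative random data (assumed values, not derived from any HRTF dataset).
rng = random.Random(0)
N, K, LENGTH, TAPS = 8, 3, 16, 4
sources = [[rng.uniform(-1, 1) for _ in range(LENGTH)] for _ in range(N)]
gains = [[rng.uniform(-1, 1) for _ in range(K)] for _ in range(N)]
basis = [[rng.uniform(-1, 1) for _ in range(TAPS)] for _ in range(K)]

reference = render_per_source(sources, gains, basis)
efficient = render_via_basis(sources, gains, basis)
print(max(abs(a - b) for a, b in zip(reference, efficient)))
```

Because convolution is linear, both paths produce the same output up to floating-point rounding, but the second needs K convolutions regardless of the number of sources, which is why the cost of such renderers scales with the number of basis filters.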
Authors:
Marchan, Mick; Allen, Andrew
Affiliation:
Microsoft, Redmond, WA
JAES Volume 71 Issue 6 pp. 338-348; June 2023
Publication Date:
June 3, 2023