AR/VR applications commonly face difficulties binaurally spatializing many sound sources at once because of computational constraints. Existing techniques for efficient binaural rendering, such as Ambisonics, Vector-Based Amplitude Panning, or Principal Component Analysis, alleviate this issue by approximating Head-Related Transfer Function (HRTF) datasets with a linear combination of basis filters. This paper proposes a novel binaural renderer that convolves each basis filter with a layer of low-order finite impulse response filters applied in the time domain and derives both the spatial functions and filter coefficients through the minimization of a perceptually motivated error function. In a MUSHRA test, expert listeners had more difficulty differentiating the proposed method from the HRTF dataset it approximates than they did differentiating existing methods configured with an equivalent number of Fast Fourier Transforms and identical HRTF preprocessing. This result was consistent across both an internal Microsoft HRTF dataset and an individual head from the SADIE database.
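The linear-combination idea shared by the existing techniques named above can be illustrated with a minimal NumPy sketch. It is not the renderer proposed in the paper: it omits the low-order FIR layer and the perceptually motivated optimization, and it convolves in the time domain for simplicity. The names `render_with_basis`, `render_per_source`, `gains`, and `basis` are hypothetical and chosen only for illustration. The sketch shows why the approach scales: the N sources are first mixed into K basis channels, so the number of convolutions depends on K rather than on N.

```python
import numpy as np

def render_with_basis(sources, gains, basis):
    """Binaural mix using K shared basis filters (illustrative assumption).

    sources: (N, T) mono source signals
    gains:   (N, K) spatial weights, one row per source direction
    basis:   (K, 2, L) basis filters for the left and right ears
    returns: (2, T + L - 1) binaural output
    """
    # Mix the N sources down to K basis channels: shape (K, T).
    mixed = gains.T @ sources
    out = np.zeros((2, sources.shape[1] + basis.shape[2] - 1))
    # Only 2 * K convolutions, independent of the number of sources.
    for k in range(basis.shape[0]):
        for ear in range(2):
            out[ear] += np.convolve(mixed[k], basis[k, ear])
    return out

def render_per_source(sources, gains, basis):
    """Reference: build each source's approximated filter pair, then
    convolve source by source (2 * N convolutions)."""
    # Approximated impulse responses per source: shape (N, 2, L).
    hrirs = np.einsum("nk,kel->nel", gains, basis)
    out = np.zeros((2, sources.shape[1] + basis.shape[2] - 1))
    for n in range(sources.shape[0]):
        for ear in range(2):
            out[ear] += np.convolve(sources[n], hrirs[n, ear])
    return out

# Random placeholder data, not a real HRTF dataset.
rng = np.random.default_rng(0)
sources = rng.standard_normal((50, 256))   # N = 50 sources
gains = rng.standard_normal((50, 4))       # K = 4 basis channels
basis = rng.standard_normal((4, 2, 32))    # L = 32 taps per filter

fast = render_with_basis(sources, gains, basis)
slow = render_per_source(sources, gains, basis)
```

Because convolution is linear, both functions produce the same output up to floating-point rounding; the basis version simply moves the per-source work into an inexpensive matrix multiply.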
Marchan, Mick; Allen, Andrew
Affiliation: Microsoft, Redmond, WA
JAES Volume 71 Issue 6 pp. 338-348; June 2023
Publication Date: June 3, 2023