Camera zooming would be more compelling if the audio were subjected to a corresponding zoom matching the video. Psychophysical and neuroimaging results suggest that such a cross-modal approach to zooming facilitates multisensory integration. Because auditory distance perception is primarily determined by sound intensity, an audiovisual zoom effect can be obtained by matching the levels of the different sources in a sound scene to their visually perceived distances. The authors propose a general theory for independent sound-source level control that can be used to attain an acoustic zoom effect. The theory does not require sound source separation, which reduces the computational load. An efficient implementation using fixed and adaptive spatial and spectral noise-reduction algorithms is proposed and evaluated. Experimental results with an array of a small number of low-cost microphones confirm that the proposed approach is particularly well suited to consumer audiovisual capture applications.
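The level-matching principle in the abstract can be illustrated with a minimal sketch. Assuming the free-field inverse-distance law (sound pressure level falls roughly 6 dB per doubling of distance), the gain needed to make a source sound as close as it appears after a zoom depends only on the ratio of its physical distance to its visually perceived (zoomed) distance. The function name and interface below are hypothetical illustrations, not the paper's actual algorithm, which achieves level control without source separation via spatial and spectral noise-reduction filters.

```python
import math

def zoom_gain_db(d_physical: float, d_zoomed: float) -> float:
    """Gain (in dB) to apply to a single source so that its level
    matches its visually perceived distance after a camera zoom.

    Assumes the free-field inverse-distance law: intensity scales
    as 1/d^2, so level scales as 20*log10 of the distance ratio.
    """
    if d_physical <= 0 or d_zoomed <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(d_physical / d_zoomed)

# Example: a talker 8 m away framed at a perceived 2 m after zooming
# would need roughly a +12 dB boost under this simple model.
print(round(zoom_gain_db(8.0, 2.0), 2))
```

In practice, applying such per-source gains inside a real recording is the hard part; the paper's contribution is doing so with low-cost microphone arrays and without explicitly separating the sources.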
van Waterschoot, Toon; Tirry, Wouter Joos; Moonen, Marc
Affiliations: KU Leuven, Department of Electrical Engineering, ESAT-SCD, Leuven, Belgium; NXP Software, Leuven, Belgium
JAES Volume 61 Issue 7/8 pp. 489-507; July 2013
Publication Date: August 22, 2013