This paper proposes a sound source separation method that combines image signal processing with a microphone array. First, a spatio-temporal sound pressure distribution (STSPD) image is formed from the microphone outputs. A two-dimensional fast Fourier transform (2D FFT) converts this image into a spectrum in which sounds arriving from different directions naturally fall on different lines. To separate the sound sources, each line in the spectrum is extracted and a 2D inverse FFT is applied to it. A method for restoring a fine STSPD image from a sparse microphone array is also proposed. Although the basic performance of the proposed method is comparable to that of a conventional delay-and-sum array, more sophisticated methods can be applied to improve performance.
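The line structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes an idealised uniform linear array, plane waves whose inter-microphone delay is a whole number of samples, periodic signals (modelled with circular shifts), zero-mean sources, and as many microphones as time samples so that each source falls exactly on one spectral line. The function names `stspd_image`, `line_mask`, and `separate` are hypothetical, and the sparse-array restoration step is not covered.

```python
import numpy as np

def stspd_image(signal, delay_per_mic, n_mics):
    """Build a toy STSPD image: one row per microphone, one column per sample.
    Microphone m receives the signal circularly delayed by m * delay_per_mic samples."""
    return np.stack([np.roll(signal, m * delay_per_mic) for m in range(n_mics)])

def line_mask(shape, delay_per_mic):
    """Boolean mask of the 2D-FFT bins occupied by a plane wave with the given delay.
    Energy lies where kx / n_mics + delay_per_mic * kf / n_samples is an integer."""
    n_mics, n_samples = shape
    kx = np.arange(n_mics)[:, None]      # spatial-frequency index
    kf = np.arange(n_samples)[None, :]   # temporal-frequency index
    numerator = kx * n_samples + delay_per_mic * kf * n_mics
    return numerator % (n_mics * n_samples) == 0

def separate(image, delay_per_mic):
    """Keep only the spectral line for one direction and transform back."""
    spectrum = np.fft.fft2(image)
    masked = spectrum * line_mask(image.shape, delay_per_mic)
    # The mask is symmetric under (kx, kf) -> (-kx, -kf), so the result is real
    # up to rounding error.
    return np.fft.ifft2(masked).real

rng = np.random.default_rng(0)
n = 64  # assumed: number of microphones equals number of samples

# Two zero-mean sources; removing the mean avoids overlap at the DC bin,
# which is the only bin shared by the two lines.
src_a = rng.standard_normal(n)
src_a -= src_a.mean()
src_b = rng.standard_normal(n)
src_b -= src_b.mean()

# Source A arrives broadside (no delay); source B is delayed one sample per microphone.
mixture = stspd_image(src_a, 0, n) + stspd_image(src_b, 1, n)

# Row 0 of each separated image is the source as seen by the first microphone.
est_a = separate(mixture, 0)[0]
est_b = separate(mixture, 1)[0]
```

In this idealised setting the two lines share only the DC bin, so each source is recovered almost exactly. With fewer microphones than samples, non-integer delays, or non-periodic signals, energy leaks off the lines and the separation becomes approximate.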
Authors:
Ozawa, Kenji; Ito, Masaaki; Shimizu, Genya; Morise, Masanori; Sakamoto, Shuichi
Affiliations:
University of Yamanashi, Kofu, Yamanashi, Japan; Tohoku University, Sendai, Japan (See document for exact affiliation information.)
AES Conference:
2018 AES International Conference on Spatial Reproduction - Aesthetics and Science (July 2018)
Paper Number:
PP-6
Publication Date:
July 30, 2018
Session Subject:
Spatio-temporal sound pressure distribution image; Image signal processing; Two-dimensional fast Fourier transform; Sparse modeling using L1 regularization (Lasso)