AES Convention Papers Forum

Extension of Monaural to Stereophonic Sound Based on Deep Neural Networks


In this paper, we propose a method for extending monaural sound to stereophonic sound based on deep neural networks (DNNs). First, the monaural signal is assumed to be the mid signal of the extended stereo signal. Residual signals are then obtained by linear prediction (LP) analysis, and the LP coefficients of the monaural signal are converted into line spectral frequency (LSF) coefficients. The LSF coefficients serve as the DNN features, and the features of the side signal are estimated from those of the mid signal. The performance of the proposed method is evaluated using a log spectral distortion (LSD) measure and a multiple stimuli with hidden reference and anchor (MUSHRA) test. The comparison shows that the proposed method achieves lower LSD and a higher MUSHRA score than a conventional method based on a hidden Markov model (HMM).
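As a rough illustration of the signal-processing front end the abstract describes, the sketch below shows a mid/side decomposition and an LP analysis that yields predictor coefficients and a residual. It is not the authors' implementation. The mid/side scaling (M = (L + R) / 2, S = (L - R) / 2), the use of the autocorrelation method with the Levinson-Durbin recursion, and all function names are assumptions made for this example. The LSF conversion and the DNN mapping from mid features to side features are not shown.

```python
# Sketch only: mid/side conversion and LP analysis under assumed conventions.
# All function names here are hypothetical and not taken from the paper.


def to_mid_side(left, right):
    """Split a stereo pair into mid and side (assumed scaling: divide by 2)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Inverse of to_mid_side: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing applied)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns predictor coefficients c such that
    x_hat[n] = c[0] * x[n-1] + c[1] * x[n-2] + ...
    """
    coeffs = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # silent or degenerate frame: keep remaining coefficients at 0
        acc = r[i + 1] - sum(coeffs[j] * r[i - j] for j in range(i))
        k = acc / err  # reflection coefficient for this order
        updated = coeffs[:]
        updated[i] = k
        for j in range(i):
            updated[j] = coeffs[j] - k * coeffs[i - 1 - j]
        coeffs = updated
        err *= 1.0 - k * k
    return coeffs


def lp_residual(x, coeffs):
    """Prediction error e[n] = x[n] - x_hat[n], treating x[n] as 0 for n < 0."""
    residual = []
    for n in range(len(x)):
        pred = sum(c * x[n - 1 - k]
                   for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        residual.append(x[n] - pred)
    return residual
```

In a pipeline of the kind outlined in the abstract, the monaural input would take the role of `mid`, `levinson_durbin(autocorrelation(frame, p), p)` would give the LP coefficients of each frame for an assumed order `p`, and an estimated side signal would be combined with the mid signal through `to_left_right` to produce the stereo output.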




AES - Audio Engineering Society