Deep Learning Based Voice Extraction and Primary-Ambience Decomposition for Stereo to Surround Upmixing

Surround systems have gained popularity in home entertainment even though most cinematic content is delivered in two-channel stereo format. Although several upmixing options exist, it has proven challenging to deliver an upmixed signal that approximates the original directionality and timbre intended by the mixing artist. The aim of this work is to design a two-to-five-channel upmixer using a novel upmixing strategy that combines voice extraction and primary-ambience decomposition. Results from a modified MUSHRA test show that the proposed upmixer outperforms established alternatives for cinematic upmixing in perceived spatial and timbral quality.

Open Access

Express Paper 62; AES Convention 154; May 2023
This paper is Open Access, which means you can download it for free.

AES - Audio Engineering Society