Manually setting the level of each track of a multitrack recording is often the first step in the mixing process. To automate this step, loudness features are computed for each track and gains are algorithmically adjusted to achieve target loudness values. In this paper we first examine human mixes from a multitrack dataset to determine instrument-dependent target loudness templates. We then use these templates to develop three automatic level-based mixing algorithms. The first is based on a simple energy-based loudness model, the second uses a more sophisticated psychoacoustic model, and the third incorporates masking effects into the psychoacoustic model. The three automatic mixing approaches are compared to human mixes in a subjective listening test. Results show that subjects preferred the automatic mixes created from the simple energy-based model, indicating that the complex psychoacoustic model may not be necessary in an automated level-setting application.
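The level-setting idea described in the abstract (measure a loudness feature per track, then adjust gain toward a target) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain RMS level in dB as a stand-in for the energy-based loudness model, and the function names, track names, and target values are all hypothetical rather than taken from the paper's templates.

```python
import math


def rms_db(samples):
    """Energy-based level of a track: RMS in dB relative to full scale.

    Assumption: plain RMS stands in for the paper's energy-based loudness model.
    """
    mean_square = sum(x * x for x in samples) / len(samples)
    # Small floor avoids log10(0) on silent tracks.
    return 10.0 * math.log10(max(mean_square, 1e-12))


def gains_for_targets(tracks, targets_db):
    """Return a linear gain per track that moves its RMS level to its target."""
    gains = {}
    for name, samples in tracks.items():
        gain_db = targets_db[name] - rms_db(samples)
        gains[name] = 10.0 ** (gain_db / 20.0)
    return gains


def mix(tracks, gains):
    """Sum the gain-adjusted tracks into a single mono mix."""
    length = max(len(samples) for samples in tracks.values())
    out = [0.0] * length
    for name, samples in tracks.items():
        g = gains[name]
        for i, x in enumerate(samples):
            out[i] += g * x
    return out


# Hypothetical input: two synthetic tones and made-up target levels.
tracks = {
    "vocals": [0.5 * math.sin(2 * math.pi * i / 50) for i in range(1000)],
    "bass": [0.1 * math.sin(2 * math.pi * i / 200) for i in range(1000)],
}
targets_db = {"vocals": -20.0, "bass": -26.0}

gains = gains_for_targets(tracks, targets_db)
mixed = mix(tracks, gains)
```

A psychoacoustic or masking-aware variant would replace `rms_db` with a perceptual loudness estimate and would likely need to iterate, since one track's gain then affects the perceived loudness of the others.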
Wichern, Gordon; Wishnick, Aaron; Lukin, Alexey; Robertson, Hannah
Affiliation: iZotope, Inc., Cambridge, MA, USA
AES Convention: 139 (October 2015) Paper Number: 9370
Publication Date: October 23, 2015
Subject: Recording and Production