In the field of intelligent audio production, neural networks have been trained to automatically mix a multitrack to a stereo mixdown. Although these algorithms contain latent models of mix engineering, there is still a lack of approaches that explicitly model the decisions a mix engineer makes while mixing. In this work, a method is presented to retrieve the parameters used to create a multitrack mix using only the raw tracks and the stereo mixdown. The method models a multitrack mix using gain, panning, equalization, dynamic range compression, distortion, delay, and reverb with the aid of greybox differentiable digital signal processing modules. This yields a fully interpretable representation of the mixing signal chain by explicitly modeling the audio effects one would expect in a typical engineer's mixing chain. The modeling capacities of several different mixing chains are measured using both objective and subjective measures on a dataset of student mixes. Results show that the full signal chain performs best on objective measures and that there is no statistically significant difference between participants' perception of the full mixing chain and the reference mixes.
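The core idea of parameter retrieval with differentiable processors can be illustrated with a minimal sketch: because each effect module is differentiable, its parameters can be estimated by gradient descent on the error between the predicted mix and the target stereo mixdown. The sketch below uses only the simplest module (per-track left/right gains, i.e., gain and pan) on synthetic signals; the array shapes, learning rate, and loss are illustrative assumptions, not the paper's actual implementation, which uses a larger chain of effects.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tracks, n_samples = 4, 1000
tracks = rng.standard_normal((n_tracks, n_samples))  # synthetic raw tracks

# Hypothetical ground-truth per-track left/right gains (encode gain + pan)
true_lr = rng.uniform(0.2, 1.0, (n_tracks, 2))
mix = true_lr.T @ tracks  # (2, n_samples) target stereo mixdown

# Retrieve the gains by gradient descent on the mean squared error,
# treating the mixer as a differentiable module
lr = np.full((n_tracks, 2), 0.5)  # initial parameter guess
step = 0.05
for _ in range(500):
    pred = lr.T @ tracks          # predicted stereo mix
    err = pred - mix              # (2, n_samples) residual
    grad = (tracks @ err.T) / n_samples  # gradient of MSE w.r.t. lr
    lr -= step * grad

# lr now closely matches true_lr
print(np.max(np.abs(lr - true_lr)))
```

In the paper's setting, the same gradient-based fitting extends to nonlinear modules (compression, distortion, reverb) because each is implemented as a differentiable DSP block, and the recovered parameters remain directly interpretable as mixing-console settings.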
Colonel, Joseph T.; Reiss, Joshua
Affiliations: Centre for Digital Music, Queen Mary University of London, London, UK
JAES Volume 71 Issue 9 pp. 586-595; September 2023
Publication Date: September 13, 2023