Blind upmix denotes the process of converting audio content to a higher number of output channels without the aid of any prior spatial information. It is often needed when upmixing legacy monophonic recordings to modern multichannel audio formats. Applause plays a vital role, especially in live recordings, yet creating a convincing blind upmix of applause signals is a demanding task. Applause can be interpreted as a superposition of distinct, individually perceivable foreground claps and a more noise-like background. While the background signal can be upmixed by applying decorrelation and distributing it across channels, the foreground claps must be spatially distributed in a perceptually meaningful and plausible manner. This paper investigates the effect of the spatial, temporal, and timbral structure of foreground claps on the perceived plausibility of applause signals. The assessment was carried out by means of two listening tests. The results show that, especially for sparse applause, plausibility is significantly reduced if the natural timbral and temporal structure of the applause is corrupted.
Adami, Alexander; Brand, Lukas; Herre, Jürgen
Affiliation: International Audio Laboratories Erlangen, Erlangen, Germany
AES Convention: 142 (May 2017) Paper Number: 9750
Publication Date: May 11, 2017
Subject: Audio Processing and Effects