Object-based audio (OBA) program material is challenging to distribute over low bandwidth channels and costly to render for thin clients. This research proposes a dynamic object-grouping solution that can represent a complex object-based scene as an equivalent reduced set of object groups while maintaining perceptually transparent rendering quality. This solution is a type of spatial coding. This paper introduces a real-time greedy simplification technique that addresses limitations of previous approaches by modeling spatial release from masking and distributing input objects into to multiple output groups. The core algorithm is extended to preserve other types of artistic metadata beyond object position. Results of perceptual tests show that this solution can achieve a 10:1 reduction in object count while maintaining high-quality audio playback and rendering flexibility at the endpoint. Spatial coding does not require perceptual coding of the objects’ audio essence but can be further combined with audio coding tools to deliver OBA content at low bit rates. This makes spatial coding a key component of an OBA production and distribution workflow. Object-based content creation, distribution, and rendering workflows require novel methods to process, combine, encode, and simplify complex auditory scenes to allow end-point rendering flexibility, efficiency, and adaptability as well as the means to cater for personalized experiences.
Breebaart, Jeroen; Cengarle, Giulio; Lu, Lie; Mateos, Toni; Purnhagen, Heiko; Tsingos, Nicolas
Affiliation: Dolby Laboratories
JAES Volume 67 Issue 7/8 pp. 486-497; July 2019
Publication Date: August 14, 2019
No AES members have commented on this paper yet.
If you are not yet an AES member and have something important to say about this paper then we urge you to join the AES today and make your voice heard. You can join online today by clicking here.