This work is licensed under a Creative Commons Attribution 4.0 International License.
Because object-based audio is becoming an important framework for the representation of complex sound scenes, this research describes a series of experiments to determine a categorization framework for broadcast audio objects. Categorization is a fundamental human strategy for reducing cognitive load, and knowledge of these categories should be beneficial for the development of perceptually based representations and rendering strategies for object-based audio. In this study, 21 expert and non-expert listeners took part in a free card sorting task using audio objects from a variety of different types of program material. Hierarchical agglomerative clustering suggests that there are 7 general categories, which relate to sounds indicating actions and movement, continuous background sound, transient background sound, clear speech, non-diegetic music and effects, sounds indicating the presence of people, and prominent attention-grabbing transient sounds. A three-dimensional perceptual space calculated via multidimensional scaling suggests that these categories vary along the dimensions of semantic content, continuous-transient, and presence-absence of people. The position of an audio object along the dimensions of the perceptual space relates to its perceived importance.
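As an illustration of the analysis pipeline named in the abstract, the following is a minimal sketch of how free card-sort data can be turned into a pairwise dissimilarity and then grouped by hierarchical agglomerative clustering. The abstract does not state the dissimilarity measure or the linkage criterion used in the study, so the co-occurrence distance and average linkage here are assumptions, and the audio objects, participants, and function names are hypothetical.

```python
from itertools import combinations


def cooccurrence_distance(sorts, items):
    """Distance between two items = 1 - fraction of participants
    who placed both items in the same pile (an assumed measure)."""
    n = len(sorts)
    dist = {}
    for a, b in combinations(items, 2):
        together = sum(
            any(a in pile and b in pile for pile in piles) for piles in sorts
        )
        dist[frozenset((a, b))] = 1.0 - together / n
    return dist


def average_linkage(dist, items, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest mean pairwise distance until n_clusters remain."""
    clusters = [frozenset([i]) for i in items]

    def d(c1, c2):
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: d(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical toy data: three participants each sort six audio objects into piles.
items = ["dialogue", "narration", "footsteps", "door slam", "rain", "traffic"]
sorts = [
    [{"dialogue", "narration"}, {"footsteps", "door slam"}, {"rain", "traffic"}],
    [{"dialogue", "narration"}, {"footsteps", "door slam", "rain"}, {"traffic"}],
    [{"dialogue", "narration", "footsteps"}, {"door slam"}, {"rain", "traffic"}],
]

dist = cooccurrence_distance(sorts, items)
clusters = average_linkage(dist, items, n_clusters=3)
for c in clusters:
    print(sorted(c))
```

The same dissimilarity values could also be passed to a multidimensional scaling routine to obtain a low-dimensional perceptual space; that step is omitted here to keep the sketch dependency-free.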
Woodcock, James; Davies, William J.; Cox, Trevor J.; Melchior, Frank
Affiliations: Acoustics Research Centre, University of Salford, Salford, United Kingdom; BBC R&D, Dock House, MediaCityUK, Salford, United Kingdom (See document for exact affiliation information.)
JAES Volume 64 Issue 6 pp. 380-394; June 2016
Publication Date: June 27, 2016