AES Journal Forum

Clean Audio for TV broadcast: An Object-Based Approach for Hearing-Impaired Viewers



As the percentage of the population with hearing loss increases, broadcasters are receiving more complaints about the difficulty of understanding dialog in the presence of background sound and music. This article explores these issues, reviews previously proposed solutions, and presents an object-based approach that can be implemented within MPEG-H to give listeners control of their audio mix. An object-based approach to clean audio, combined with methods to isolate sounds that are important to the narrative and meaning of a broadcast, has the potential to give users complete control of the relative levels of all elements of TV broadcast audio. This approach was demonstrated at the University of Salford campus in 2013.
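To make the idea concrete, here is a minimal sketch (not the paper's implementation, and not the MPEG-H API) of what "object-based" level control means: each sound object is delivered separately with metadata, and the receiver applies a user-chosen gain to each object before summing, so a viewer can boost dialog and attenuate music and effects. All names and gain values below are illustrative assumptions.

```python
# Illustrative sketch of object-based audio mixing: each object
# (dialog, music, effects) arrives as its own stream, and the
# receiver scales each by a user-set linear gain before summing.
# This is NOT the paper's code or the MPEG-H API, just the concept.

def mix_objects(objects, user_gains):
    """Sum the per-sample audio of all objects, scaling each by its gain.

    objects: dict of object name -> list of samples
    user_gains: dict of object name -> linear gain (1.0 = broadcast default)
    Objects missing from user_gains keep the default gain of 1.0.
    """
    length = max(len(samples) for samples in objects.values())
    out = [0.0] * length
    for name, samples in objects.items():
        gain = user_gains.get(name, 1.0)
        for i, sample in enumerate(samples):
            out[i] += gain * sample
    return out

# Hypothetical three-object scene (a few samples each).
scene = {
    "dialog":  [0.5, 0.5, 0.5],
    "music":   [0.4, 0.4, 0.4],
    "effects": [0.2, 0.0, 0.2],
}

# Default broadcast mix vs. a "clean audio" mix: dialog boosted
# 6 dB (x2.0), background cut 12 dB (x0.25).
default_mix = mix_objects(scene, {})
clean_mix = mix_objects(scene, {"dialog": 2.0, "music": 0.25, "effects": 0.25})
print([round(x, 3) for x in default_mix])
print([round(x, 3) for x in clean_mix])
```

Because the objects are kept separate all the way to the receiver, this is a true remix rather than a post-hoc attempt to separate dialog from an already-summed channel mix, which is why the approach sidesteps the source-separation problem.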

Open Access


JAES Volume 63 Issue 4 pp. 245-256; April 2015

Download Now (855 KB)

This paper is Open Access, which means you can download it for free.


Comments on this paper

James Wood

Comment posted July 1, 2015 @ 15:42:05 UTC (Comment permalink)


     I have yet to read this paper, but it's about time that someone investigated this... not only for the hearing-impaired, but for those of us with age-appropriate hearing who still use the DVR 'back' button (or even Closed Captioning™) to try to catch the dialog.  It's not a matter of technical issues, but of actors who MUMBLE!  Today's audio-capture technology ought to be far-and-away better than what was available from the mid-'30s through the '50s, yet there's NEVER a problem understanding EVERY WORD in motion pictures, archived recordings, or transcriptions from that era.  Where are the 'dialog coaches' and speech clinicians?


Robert Orban

Comment posted July 6, 2015 @ 01:48:44 UTC (Comment permalink)

 I agree with Jim’s comment. I find that I can understand older entertainment programming and local newscasts perfectly, but often have to turn on closed captions just to make sure that I catch all of the dialog in contemporary entertainment programming. I suspect that this problem is caused partly by almost ideal acoustics in sound mixing rooms (often with near-field monitoring) and familiarity with the script, which together cause post-production mixers to be overly optimistic about the intelligibility of their mixes, and partly by the extremely common use of single-ended dynamic noise reduction applied to production dialog recordings to make them usable without looping. One of the side effects of such DNR can be the suppression of low-energy consonants in speech, which can be important to intelligibility.

And of course, as Jim said, there is also the problem of mumbled dialog. A director, who is familiar with the script, may not perceive it as a problem.

I had an interesting conversation at NAB with an engineer from Norway (I believe), who said that their cinemas now often screen Norwegian language films with subtitles because there is no expectation that the dialog will actually be intelligible to the audiences in their native language!


Subscribe to this discussion

RSS Feed: To be notified of new comments on this paper you can subscribe to this RSS feed. Forum users should log in to see additional options.

Join this discussion!

If you would like to contribute to the discussion about this paper and are an AES member, then you can log in here:

If you are not yet an AES member and have something important to say about this paper then we urge you to join the AES today and make your voice heard. You can join online today by clicking here.

AES - Audio Engineering Society