Each month an industry expert highlights a topic of importance to the AES community. Listen, Learn, and Connect with advances in technology and best practices in audio.
I started working on perceptual audio coding in the late 1980s, when the field was still in its infancy. At the time, several developments both created the need for, and enabled us to revolutionize, audio signal representation.
First, the CD format was becoming the de facto standard for high-quality audio. Transmitting a mere four minutes of CD-quality audio would have taken almost ten hours at the then-typical modem speed of 9.6 kb/s, so transmitting (or storing) a library of high-quality audio was nearly impossible. Second, digital signal processing (DSP) was advancing by leaps and bounds, CPUs were rapidly becoming faster, memory was becoming affordable, and powerful portable devices were starting to show up in the marketplace. Third, a body of psychoacoustics research data was becoming more accessible. It showed us that, due to the limits of human hearing, there was a lot of irrelevant information in the CD format and thus the potential for large data rate savings.
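The "almost ten hours" figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes the standard CD parameters (44.1 kHz sampling, 16 bits per sample, two channels), which are not stated in the text above, and ignores any modem protocol overhead; the function name is purely illustrative.

```python
# Assumed CD parameters (standard values, not given in the text above).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2
MODEM_BITS_PER_SECOND = 9_600  # 9.6 kb/s, as quoted in the text


def transfer_hours(duration_seconds: float) -> float:
    """Hours needed to send uncompressed CD audio over the modem link."""
    cd_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS  # 1,411,200 b/s
    total_bits = cd_bits_per_second * duration_seconds
    return total_bits / MODEM_BITS_PER_SECOND / 3600


if __name__ == "__main__":
    # Four minutes of audio, as in the example above.
    print(f"{transfer_hours(4 * 60):.1f} hours")  # prints "9.8 hours"
```

Under these assumptions the CD stream runs at roughly 1.4 Mb/s, about 147 times faster than the modem could carry it, which is where the scale of the problem comes from.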
Audio coding is a field at the intersection of many disciplines that has flourished in the past 30 years by leveraging advances in research and technology. By exploiting the advances in DSP to represent audio signals in ever more compact and efficient ways, applying heuristic models to identify irrelevant components, and optimizing distortion-rate trade-offs, audio coding made transmission and storage of high-quality audio a reality and also radically changed our approach to audio. Few of us would have dared imagine the revolutionary impact that audio coding would have on the general consumption of digital media.
Fast-forward to today: fast broadband connections and large cloud storage capacity are widely available, we are starting to watch ultra-high-definition television, and we will shortly be communicating through 5G telephone networks. Do we still need to worry about compressing audio? I believe the answer is “yes!” Although bandwidth and storage are becoming abundant, we are demanding more audio channels, spatial control, customizability, immersive technology in smaller formats, and more ubiquity and availability from the audio we consume. Responding to these needs, work continues in many directions, including 3D sound; immersive six-degrees-of-freedom (6DoF) audio; and more device-neutral, personalized approaches to how we represent and render audio.
For this month's Inside Track I've compiled a set of links to papers and other resources to introduce you to some of the fundamental issues and challenges arising in audio coding. You don't need to be an audio codec designer to understand the basic principles behind this powerful technology and to ultimately apply it correctly.
Curator: Marina Bosi
Marina Bosi, a pioneer in the field of digital audio coding, has enjoyed a distinguished career as a researcher, leader, and educator in the fields of digital media technology, rights management, and licensing. A member of the research team that created Dolby Digital and the leader of the development of MPEG-2 AAC (the core coding technology used in Apple's iTunes, etc.), for which she received the ISO/IEC 1997 award (Project Editor in the development of International Standard), Marina launched the first North American university course on perceptual audio coding and MP3 technology at Stanford University.
Marina was CTO of MPEG LA and, together with Leonardo Chiariglione, head of MPEG, cofounded the Digital Media Project, an organization promoting the successful development and deployment of rights management and the use of digital media. Dr. Bosi holds several patents, has authored a number of publications, and is the author of the acclaimed textbook “Introduction to Digital Audio Coding and Standards” (Kluwer/Springer, December 2003), which has been translated into Chinese and Korean. A past President and a Fellow of the Audio Engineering Society, Dr. Bosi co-chaired the first International Conference on High Quality Audio Coding and has received a number of awards, including the AES Board of Governors Award on two occasions.