Each month an industry expert highlights a topic of importance to the AES community.
Listen, Learn, and Connect with advances in technology and best practices in audio.
We're trying a slightly different format for this month's Inside Track. As the Editor of the series, I'm going to introduce selected papers from the E-Library with a short commentary on each, as a set of pointers to get you started on some of the key issues in the field. There are also a few external links to useful resources.
While a lot has been said in recent years about sophisticated 3D audio involving lots of loudspeakers, it pays to revisit some of the basics of two-channel stereophony. It's remarkable just how much you can do with only two channels, including, perhaps surprisingly, creating a sense of immersion or envelopment. It's also relatively simple! There are so many papers and resources on the topic that it would be impossible to include them all here, so my collection below highlights some interesting historical controversies on the subject of capture and reproduction. These seem to come down to which perceptual attributes are most valued in the resulting sound image, considering that two-channel stereophony using loudspeakers is almost certainly incapable of delivering fully immersive three-dimensional imaging of spaces and the sources within them. (Here I exclude so-called transaural stereo, which involves binaural reproduction over loudspeakers, coupled with crosstalk cancelling.) Alongside this there's the issue of what actually causes "spaciousness." Finally there is the question of how we believe the brain interprets what's coming out of only two loudspeakers in a room with at least some reflections of its own thrown in.
I cannot help but note in relation to all of this, though, that most learned papers on stereophony seem to start from the basic assumption that people will make stereo recordings with a simple microphone pair that captures the entire scene. In commercial practice this is rarely the case, although the "stereo pair" can be a good starting point for acoustic recordings.
Blumlein Stereo Microphone Technique — and Author's Reply. (Michael Gerzon and John Woram, February 1976)
In this fascinating exchange, Michael Gerzon takes issue with John Woram about the merits of "Blumlein" stereo microphone techniques.
Stereo Microphone Techniques — Are the Purists Wrong? (Stanley Lipshitz, September 1986)
Stanley Lipshitz threw the cat among the pigeons in 1986 with this AES Journal feature article based on a presentation he'd given at the 78th Convention in Anaheim the previous year. He argues in approachable terms that spaced microphone techniques (which mainly deliver a spatial effect as a result of the time differences between microphones) are inherently flawed and that coincident pairs (which rely on level differences) have greater merit. Some people like to use spaced microphones because of attributes other than precise source imaging, though, so read the paper and make up your own mind.
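To make the distinction between time-difference and level-difference stereo concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not a calculation taken from Lipshitz's paper: the source is treated as a distant plane wave, the microphones as ideal first-order patterns of the form a + (1 - a)·cos(angle), and the speed of sound as 343 m/s. The function names and the angle convention (source angle measured from centre front, positive to the right) are hypothetical choices made for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def spaced_pair_time_difference(spacing_m, source_angle_deg):
    """Arrival-time difference in seconds between two spaced microphones.

    Assumes a distant (plane-wave) source. A positive result means the
    sound reaches the right-hand microphone first.
    """
    return spacing_m * math.sin(math.radians(source_angle_deg)) / SPEED_OF_SOUND


def first_order_gain(pattern_a, angle_deg):
    """Ideal first-order polar pattern: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-eight."""
    return pattern_a + (1.0 - pattern_a) * math.cos(math.radians(angle_deg))


def coincident_pair_level_difference(half_angle_deg, source_angle_deg, pattern_a=0.5):
    """Right-minus-left level difference in dB for a coincident pair.

    The microphones are assumed to point at -half_angle_deg (left) and
    +half_angle_deg (right). Polarity is ignored; only magnitudes are compared.
    """
    g_left = abs(first_order_gain(pattern_a, source_angle_deg + half_angle_deg))
    g_right = abs(first_order_gain(pattern_a, source_angle_deg - half_angle_deg))
    if g_left == 0.0 or g_right == 0.0:
        # One microphone is exactly in its null: the difference is unbounded.
        return math.copysign(math.inf, g_right - g_left)
    return 20.0 * math.log10(g_right / g_left)


if __name__ == "__main__":
    dt = spaced_pair_time_difference(0.5, 30.0)
    dl = coincident_pair_level_difference(45.0, 45.0)
    print(f"Spaced pair, 0.5 m, source at 30 degrees: {dt * 1000:.2f} ms")
    print(f"Coincident cardioids at +/-45 degrees, source at 45 degrees: {dl:.1f} dB")
```

With these assumptions, a 0.5 m spaced pair gives roughly 0.73 ms of time difference for a source 30 degrees off centre, while crossed cardioids at 90 degrees give about 6 dB of level difference for a source on the axis of one microphone and no time difference at all.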
Spaciousness and Localization in Listening Rooms and their Effects on the Recording Technique. (David Griesinger, April 1986)
David Griesinger argues here that loudspeakers and their positions make big differences to imaging and spaciousness in listening rooms, suggesting that some form of "spatial equalization" may be desirable to enhance the spaciousness of coincident recordings, while retaining good imaging. You could try his simple spatial equalizer (a form of stereo "shuffler" of sum and difference components) for yourself.
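As a rough idea of what a sum-and-difference shuffler does, the following Python sketch converts left and right into sum (mid) and difference (side) signals, applies a first-order low-frequency boost to the difference signal only, and converts back. This is a generic illustration, not Griesinger's circuit: the corner frequency, the amount of boost, the one-pole filter, and the function name `shuffle` are all assumptions chosen for the example.

```python
import math


def shuffle(left, right, fs=48000.0, corner_hz=600.0, side_gain_db=3.0):
    """Boost the low-frequency difference (L - R) component of a stereo signal.

    left, right  : sequences of samples of equal length
    corner_hz    : assumed corner frequency of the low shelf (illustrative)
    side_gain_db : assumed low-frequency boost of the difference signal (illustrative)
    Returns two lists: the processed left and right channels.
    """
    # One-pole low-pass coefficient for the chosen corner frequency.
    a = 1.0 - math.exp(-2.0 * math.pi * corner_hz / fs)
    # Extra low-passed signal to add so that the gain at DC equals side_gain_db.
    extra = 10.0 ** (side_gain_db / 20.0) - 1.0

    lp_state = 0.0
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # sum component, left untouched
        side = 0.5 * (l - r)   # difference component
        lp_state += a * (side - lp_state)      # low-passed difference signal
        side_eq = side + extra * lp_state      # low shelf: boosted lows, unity highs
        out_left.append(mid + side_eq)
        out_right.append(mid - side_eq)
    return out_left, out_right


if __name__ == "__main__":
    # A centred (mono) source has no difference component, so it passes unchanged.
    mono = [math.sin(2.0 * math.pi * 100.0 * n / 48000.0) for n in range(480)]
    out_l, out_r = shuffle(mono, mono)
    print(max(abs(x - y) for x, y in zip(mono, out_l)))
```

Because only the difference signal is equalized, centred sources are unaffected, while low-frequency content that differs between the channels is emphasized. Whether that is heard as added spaciousness or as something less welcome is exactly the kind of question the papers in this collection argue about.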
Comments on "Spaciousness and Localization in Listening Rooms and Their Effects on the Recording Technique." (Stanley Lipshitz, David Griesinger and Michael Gerzon, December 1987)
In a series of short letters, there's an interchange between Lipshitz, Griesinger and Gerzon about whether such stereo shuffling equalizers affect the image width and introduce phasiness, which may be mistaken for spaciousness. Do they affect the image localization as well as the spatial impression, and is the approach universally useful or only to be applied selectively?
Unified Theory of Microphone Systems for Stereophonic Sound Recording. (Michael Williams, March 1987)
Here Michael Williams argues that none of the microphone pair arrangements for stereo recording can be considered either universal or optimal. You have to choose the most appropriate tool for the job, so to speak, taking into account the recording angle, the amount of reverberant sound, and the image localization.
On the Naturalness of Two-Channel Stereo Sound. (Gunther Theile, February 1991)
Gunther Theile bucked the prevailing trend over a period of many years by arguing that depth and space were lacking in a stereo image when coincident microphones and panpot techniques were used. He argued for head-related stereo techniques intended for loudspeakers, which had time and level differences between the channels similar to those experienced in natural listening or binaural recording, a criterion not met using either pure intensity or time-based stereophony. These ideas relied on an "association model" of stereo that he had worked out earlier in his PhD thesis, which for many years was available only in German but which was eventually translated into English by one of my students, Tobias Neher. If you really want to understand Theile's "take" on stereo, you need to read his thesis first.
Prediction of Perceived Width of Stereo Microphone Setups. (Hans Riekehof-Bohmer and Helmut Wittek, May 2011)
Here the authors propose a way of predicting the decorrelation of diffuse-field information in a stereo image, in order to find out how different stereo microphone arrangements deal with the reverberant elements that give a scene its spaciousness.
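For a feel of what diffuse-field correlation means in the simplest case, the sketch below computes the correlation coefficient between two coincident, ideal first-order microphones in a perfectly diffuse three-dimensional sound field. This is a textbook-style closed form and not the prediction method of the paper, which also has to deal with spaced arrays. It assumes polar patterns of the form a + (1 - a)·cos(angle) and sound arriving with equal probability from every direction; the function name is hypothetical.

```python
import math


def coincident_diffuse_correlation(pattern_a, angle_between_axes_deg):
    """Diffuse-field correlation coefficient of two coincident first-order microphones.

    pattern_a : 1.0 = omni, 0.5 = cardioid, 0.0 = figure-eight
    angle_between_axes_deg : angle between the two microphone axes

    Averaging the product of the two patterns over all directions on a sphere
    gives a^2 + (1 - a)^2 * cos(angle) / 3; dividing by the mean-square response
    of one microphone, a^2 + (1 - a)^2 / 3, normalizes the result.
    """
    b = 1.0 - pattern_a
    cos_angle = math.cos(math.radians(angle_between_axes_deg))
    cross_power = pattern_a ** 2 + (b ** 2) * cos_angle / 3.0
    self_power = pattern_a ** 2 + (b ** 2) / 3.0
    return cross_power / self_power


if __name__ == "__main__":
    for name, a, angle in [
        ("cardioids at 90 degrees", 0.5, 90.0),
        ("cardioids at 180 degrees", 0.5, 180.0),
        ("figure-eights at 90 degrees", 0.0, 90.0),
    ]:
        print(f"{name}: {coincident_diffuse_correlation(a, angle):.2f}")
```

Under these idealized assumptions, crossed figure-eights at 90 degrees come out fully decorrelated in the diffuse field, whereas crossed cardioids at 90 degrees remain fairly highly correlated, which is one way of seeing why different arrangements can sound more or less spacious.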
My Search for the Ideal Stereo Loudspeaker. (Siegfried Linkwitz, August 2013)
Siegfried Linkwitz has proposed in the past that ideally we'd like the two loudspeakers used for stereo reproduction to "disappear," so that we can hear only the stereophonic scene presented. "Stereo sound reproduction relies upon the creation of an illusion," he has rightly said. In this paper, presented at the AES Loudspeakers and Headphones Conference in 2013, he recounts his practical search for the optimum radiation pattern for loudspeakers used in stereo reproduction. He concludes that one with a frequency-independent polar response comes out on top, because of the way that room reflections integrate with direct sound.
Image Assistant (Helmut Wittek)
Helmut Wittek's Image Assistant is a useful tool for working out the localization of sources picked up by different 2- and 3-channel stereo microphone techniques. It also predicts diffuse field correlation and can do some auralization of results.
MARRS — Microphone Array Recording and Reproduction Simulator. (Hyunkook Lee)
Hyunkook Lee's web-based app simulates stereo microphone arrays and offers some useful tools for predicting their results. It includes a means for virtual rendering of the stereo scene so that you can hear the likely results.
MMAD — Multichannel Microphone Array Design. (Michael Williams)
The two-channel version of Michael Williams' web-based tool allows one to see the proposed angles and spacings between different microphone pairs, for various directivity patterns and recording angles.
SEPTEMBER 2019