[Feature] The growth in computing power over the past decade has enabled remarkable possibilities for the automatic interpretation of audio signals. As human listeners we make all sorts of conscious and unconscious interpretations of what we hear, from the recognition of instruments and voices within a complex texture, through the extraction of melodic and chordal progressions, to the inference of emotional mood or cultural associations. All of this is based on listening to a single mixed stream of sound, which reaches us as nothing more than a messy waveform. If we are lucky there may be some spatial information, involving the reception of more than one related stream from different directions, but at best we have only two ears no matter how many sources there are. Enabling machines to make sense of mixed audio streams was close to the realm of science fiction not so long ago, but the latest research in semantic audio analysis brings it within our grasp.
JAES Volume 59 Issue 11 pp. 882-887; November 2011
Publication Date: December 21, 2011