AES Journal Forum

Semantic Audio: Machines Get Clever with Music


[Feature] The growth in computing power over the past decade has enabled remarkable possibilities for the automatic interpretation of audio signals. As human listeners we make all sorts of conscious and unconscious interpretations of what we hear: recognizing instruments and voices within a complex texture, extracting melodic and chordal progressions, and inferring emotional mood or cultural associations. All of this is based on listening to a single mixed stream of sound that is, at bottom, just a messy waveform. If we are lucky there may be some spatial information, received as more than one related stream arriving from different directions, but at best we have only two ears no matter how many sources there are. Not long ago, enabling machines to make sense of mixed audio streams was close to the realm of science fiction; the latest research in semantic audio analysis brings it within our grasp.
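To make the idea of machine interpretation concrete, a common low-level first step toward the chord and melody extraction mentioned above is a chroma (pitch-class) feature: the magnitude spectrum of a frame of audio is folded into 12 bins, one per semitone class. The sketch below is an illustrative assumption, not a method from the article; the function name `chroma_vector` and all parameters are hypothetical.

```python
import numpy as np

def chroma_vector(signal, sr):
    """Fold an FFT magnitude spectrum into a 12-bin pitch-class profile.

    A toy illustration (hypothetical, not from the article) of one
    low-level step that chord/melody extraction often builds on.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    chroma = np.zeros(12)
    for f, mag in zip(freqs, spectrum):
        if f < 20.0:  # skip DC and sub-audio bins
            continue
        # MIDI-style pitch number: A440 maps to 69, i.e. pitch class 9 (A)
        pitch = 69 + 12 * np.log2(f / 440.0)
        chroma[int(round(pitch)) % 12] += mag
    total = chroma.sum()
    return chroma / total if total > 0 else chroma

# A pure 440 Hz tone should concentrate its energy in pitch class 9 (A).
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
strongest = int(np.argmax(chroma_vector(tone, sr)))
```

Real systems refine this considerably (short overlapping frames, tuning estimation, harmonic suppression), but even this crude profile hints at how a "messy waveform" starts to become musically meaningful data.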

JAES Volume 59, Issue 11, pp. 882-887
Publication Date: November 2011


AES - Audio Engineering Society