AES Member Portal

AES Inside Track

Each month an industry expert highlights a topic of importance to the AES community.
Listen, Learn, and Connect with advances in technology and best practices in audio.

Semantic Analysis and Deep Learning

With the omnipresence of digital multimedia data, the processing, analysis, and understanding of such data by means of automated methods has become a central issue in engineering and computer science.

Semantic audio is concerned with:

  • analysing audio signals in order to infer semantically meaningful information that can be understood by humans
  • decomposing audio signals into semantic entities so that these audio objects can be handled, modified and interacted with intuitively
  • enabling a machine to process audio signals as human experts would (at least for the simple and boring tasks)

Such methods are relevant for the following applications:

  • analysing music for automated recommendation services

  • automatic transcription, score following and source separation for personalised sound and interactive music education

  • managing large amounts of data in audio editing and production
  • new consumer applications including DJ, karaoke, and dialog enhancement software

Deep Learning is also omnipresent. It is a branch of machine learning that in recent years has produced methods that outperform their predecessors by large margins. This happened first in computer vision and natural language processing, and then in digital speech and audio signal processing, e.g. in speech recognition, speech synthesis, speech enhancement, dereverberation and blind source separation.

Curator: Christian Uhle

Christian Uhle is chief scientist in the Audio division of the Fraunhofer Institute for Integrated Circuits IIS. He received the Dipl.-Ing. and PhD degrees from the Technical University of Ilmenau, Germany, in 1997 and 2008, respectively. His research activities comprise automotive sound reproduction, semantic audio processing, blind source separation, dialog enhancement, digital audio effects and natural language understanding with neural networks. He is a member of the AES and chairs the AES Technical Committee on Semantic Audio Analysis.

Semantic Analysis and Deep Learning Resources




AES Journal Spotlight Interview

In this month’s Journal Spotlight video, Francis Rumsey interviews Jonathan Moore about his paper in the November AES Journal. Jonathan discusses diffuse signal processing for sound reinforcement.

