State-of-the-art audio encoders are based on transform-domain coding algorithms. Because of time-frequency uncertainty, transform-domain coders suffer from "pre-echo" and "diffusion" artifacts during transient portions of the signal. These artifacts arise from the long transform lengths used to achieve higher coding gains. Audio encoders employ tools such as adaptive transform lengths and temporal noise shaping (TNS) to code transient portions of the audio signal efficiently. Audio signals typically contain time-domain transients (e.g., castanets), frequency-domain transients (e.g., flute, clarinet), and transients observed in speech during consonant-to-vowel transitions. Identifying these transients is vital to achieving good perceptual quality at low bit rates. This paper discusses the transient classes present in audio signals and describes a transient detector designed to model all of these classes efficiently. The proposed transient detector has been incorporated into an MPEG-4 AAC encoder and is independent of the psychoacoustic analysis method used. Listening tests and OPERA scores indicate a substantial improvement in audio quality over the baseline encoder.
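To make the idea of transient-driven adaptive transform lengths concrete, here is a minimal Python sketch of a generic time-domain detector. It is not the detector proposed in the paper, which the abstract does not specify. It only flags a frame when the energy of one sub-block jumps sharply relative to the previous sub-block, and then picks a short or long transform in the style of AAC block switching. The function names, the 1024-sample frame, the eight sub-blocks, and the threshold of 8 are all assumptions chosen for illustration, and this simple rule would not catch the frequency-domain or speech transients the abstract mentions.

```python
import math


def detect_transient_frames(samples, frame_len=1024, sub_blocks=8, ratio_threshold=8.0):
    """Return one bool per full frame: True if a sharp energy rise is seen.

    Hypothetical illustration only: compares the mean energy of each
    sub-block with that of the preceding sub-block (carried across frames).
    """
    sub_len = frame_len // sub_blocks
    flags = []
    prev_energy = None
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        is_transient = False
        for b in range(sub_blocks):
            seg = frame[b * sub_len:(b + 1) * sub_len]
            # Small constant avoids division by zero on digital silence.
            energy = sum(x * x for x in seg) / sub_len + 1e-12
            if prev_energy is not None and energy / prev_energy > ratio_threshold:
                is_transient = True
            prev_energy = energy
        flags.append(is_transient)
    return flags


def choose_transform(flags):
    """Map transient flags to an assumed 'short' or 'long' transform choice."""
    return ["short" if f else "long" for f in flags]


def make_test_signal(n_frames=4, frame_len=1024, onset=2 * 1024 + 512):
    """Quiet sine that jumps to full scale at sample index `onset`."""
    signal = []
    for n in range(n_frames * frame_len):
        amplitude = 0.01 if n < onset else 1.0
        # Period of 16 samples, so each 128-sample sub-block holds whole cycles.
        signal.append(amplitude * math.sin(2.0 * math.pi * n / 16.0))
    return signal


flags = detect_transient_frames(make_test_signal())
print(flags)                    # [False, False, True, False]
print(choose_transform(flags))  # ['long', 'long', 'short', 'long']
```

Only the frame containing the onset is flagged, so only that frame would be coded with short transforms to limit pre-echo, while the stationary frames keep the long transform and its higher coding gain.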
Authors:
Suresh Babu, Venkata; Malot, Ashish Kumar; Vijayachandran, V.M.; Vinay, M.K.
Affiliation:
Multimedia Technologies division, Emuzed India Pvt. Limited, Bangalore, India
AES Convention:
116 (May 2004)
Paper Number:
6175
Publication Date:
May 1, 2004
Subject:
Low Bit-Rate Audio Coding