This paper presents ideas for managing every information source present in a generic speech or audio coder. This task becomes increasingly necessary as coding structures grow more complex: organizing and processing this information appropriately is key to an efficient implementation, in terms of both complexity and quality. First, a data structure inspired by classic comprehension theories is proposed, which sorts the information into three hierarchical levels. Based on this structure, a block diagram of a global sound encoder is described. The model follows the blackboard architecture commonly applied in speech recognition. Finally, it is shown how an MPEG-2/4 AAC-LC coder can be considered a particular case of the proposed model.
Authors:
Alexandre, Enrique; Pena, Antonio
Affiliation:
E.T.S.E. de Telecomunicacion, Universidad de Vigo, Vigo, Spain.
AES Convention:
116 (May 2004)
Paper Number:
6069
Publication Date:
May 1, 2004
Session Subject:
Spatial Perception; Processing and Analysis and Synthesis of Sound