AES Journal Forum

Source—Filter Modeling in the Sinusoid Domain

A theory for modeling sound sources as a combination of source and filter parts is useful in timbre analysis and synthesis. In this model the sound is represented as a sparse sum of sine waves, each characterized by a slowly varying amplitude and frequency. Two methods are considered for deriving the parameters of the model: slow variation (SV) and filter bank (FB). The FB method better follows the global spectral envelope, while the SV method better follows local amplitude–frequency dependencies. Real and synthetic sounds are analyzed in detail.
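As a rough illustration of the representation described in the abstract, the sketch below builds a harmonic sum of sine waves whose amplitudes come from a per-partial source gain multiplied by a frequency-dependent filter gain. It is not the authors' SV or FB estimation method. The function names, the multiplicative source/filter split, and the example filter curve are all assumptions made for this example.

```python
import math


def partial_amplitudes(f0_track, source_gains, filter_gain):
    """Amplitude track for each partial (hypothetical helper).

    Assumes amplitude = source gain of the partial * filter gain at the
    partial's instantaneous frequency. This split is an illustrative
    assumption, not the parameterization used in the paper.
    """
    return [
        [gain * filter_gain((m + 1) * f0) for f0 in f0_track]
        for m, gain in enumerate(source_gains)
    ]


def synthesize_harmonic(f0_track, amp_tracks, sample_rate):
    """Sum of harmonic sinusoids with slowly varying amplitude and frequency."""
    phases = [0.0] * len(amp_tracks)
    out = []
    for t, f0 in enumerate(f0_track):
        sample = 0.0
        for m, amps in enumerate(amp_tracks):
            # Partial m+1 runs at (m+1) times the fundamental. Accumulating
            # phase from the instantaneous frequency keeps the waveform
            # continuous while the frequency drifts.
            phases[m] += 2.0 * math.pi * (m + 1) * f0 / sample_rate
            sample += amps[t] * math.sin(phases[m])
        out.append(sample)
    return out


# Example: 0.1 s of a 220 Hz tone with slight 5 Hz vibrato (made-up values).
sample_rate = 8000
f0_track = [
    220.0 * (1.0 + 0.01 * math.sin(2.0 * math.pi * 5.0 * t / sample_rate))
    for t in range(sample_rate // 10)
]
source_gains = [1.0, 0.5, 0.25]  # assumed per-partial source strengths


def example_filter(freq_hz):
    # Assumed smooth low-pass style envelope, for illustration only.
    return 1.0 / (1.0 + (freq_hz / 1000.0) ** 2)


amp_tracks = partial_amplitudes(f0_track, source_gains, example_filter)
signal = synthesize_harmonic(f0_track, amp_tracks, sample_rate)
```

Because the filter gain is evaluated at each partial's instantaneous frequency, the vibrato in `f0_track` makes every partial's amplitude move along the filter curve, which is the kind of amplitude–frequency dependency the abstract refers to.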

Authors: Wen, Xue; Sandler, Mark
Affiliation: School of Electronic Engineering and Computer Science, Centre for Digital Music, Queen Mary University of London, London, UK
JAES Volume 58 Issue 10 pp. 795-808; October 2010
Publication Date: November 15, 2010

Comments on this paper

Krishnamoorthy P, comment posted November 24, 2010 @ 20:25:19 UTC: Dear Authors: The article is really good. Can this source-filter modeling based on sinusoidal analysis be used for some kind of audio classification work, such as speech/music classification? Generally, the nature of the sinusoidal components will differ for these two signals.

Author Response, Xue Wen, comment posted December 12, 2010 @ 16:25:36 UTC: Unlike the usual audio features, which are directly computable from the waveform or spectrum and represent the whole audio in some way, the harmonic sinusoidal model only captures the features of ONE pitched component in the sound, so it is not advisable to use any variable computed from this model as a feature of the whole audio. Take speech/music classification as an example: a feature computed from the harmonic sinusoidal model will be useful to tell whether a harmonic component within the sound is more likely to be voiced speech or a musical instrument note, but it will not tell you whether the sound, as a mixture of various components, is speech or music, unless the sound is monophonic. At the Munich convention last year we demonstrated how we transform one spoken phoneme into another, or transform speech to a musical instrument and back. This indicates that the model holds enough information for phoneme or instrument classification. By the way, sound examples used in this paper can be found by clicking here.
