AES Journal Forum

Source-Filter Modeling in the Sinusoid Domain

A theory for modeling sound sources as a combination of source and filter parts is useful in timbre analysis and synthesis. In this model the sound is represented as a sparse sum of sine waves, each of which is characterized by a slowly varying amplitude and frequency. Two methods are considered for deriving the parameters of the model: slow variation (SV) and filter bank (FB). The FB method better follows the global spectral envelope, while the SV method better follows local amplitude–frequency dependencies. Real and synthetic sounds are analyzed in detail.
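
As a rough illustration of the representation described in the abstract, the sketch below synthesizes a short sound as a sum of harmonic sine waves whose frequencies vary slowly and whose amplitudes are the product of a per-partial source gain and a frequency-dependent filter gain. This is only a sketch under stated assumptions, not the paper's formulation: the functions `source_gain`, `filter_gain`, and `synthesize`, the 1/m source decay, the single 1 kHz resonance, and the linear pitch glide are all hypothetical choices made for the example. It does not implement the SV or FB estimation methods.

```python
import math


def source_gain(partial_index):
    """Hypothetical source part: partial amplitudes decay as 1/m."""
    return 1.0 / partial_index


def filter_gain(freq_hz):
    """Hypothetical filter part: one smooth resonance centred near 1 kHz."""
    centre_hz, width_hz = 1000.0, 400.0
    return 0.2 + math.exp(-((freq_hz - centre_hz) / width_hz) ** 2)


def synthesize(f0_start, f0_end, n_partials, duration_s, sample_rate):
    """Sum of harmonic sinusoids with slowly varying frequency and amplitude.

    The amplitude of partial m at each sample is
    source_gain(m) * filter_gain(m * f0), so the amplitude follows the
    filter curve as the frequency moves (an assumed decomposition).
    """
    n_samples = int(round(duration_s * sample_rate))
    nyquist = sample_rate / 2.0
    phases = [0.0] * n_partials
    signal = []
    for n in range(n_samples):
        # Slowly varying fundamental: a linear glide from f0_start to f0_end.
        f0 = f0_start + (f0_end - f0_start) * n / max(n_samples - 1, 1)
        sample = 0.0
        for m in range(1, n_partials + 1):
            freq = m * f0
            if freq < nyquist:  # skip partials that would alias
                sample += source_gain(m) * filter_gain(freq) * math.cos(phases[m - 1])
            # Integrate the instantaneous frequency to obtain the phase.
            phases[m - 1] += 2.0 * math.pi * freq / sample_rate
        signal.append(sample)
    return signal


# Five partials gliding from 200 Hz to 220 Hz over 50 ms at 8 kHz.
signal = synthesize(200.0, 220.0, 5, 0.05, 8000)
```

Because the source gain is indexed by partial number and the filter gain by frequency, a pitch glide moves each partial along the fixed filter curve, which is the kind of amplitude–frequency dependency the abstract refers to.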

JAES Volume 58 Issue 10 pp. 795-808; October 2010

Comments on this paper

Krishnamoorthy P


Comment posted November 24, 2010 @ 20:25:19 UTC

Dear Authors:

The article is really good. Can this source-filter modelling based on sinusoidal analysis be used for some kind of audio classification work, such as speech-music classification? Generally, the nature of the sinusoidal components will differ for these two signals.


Author Response
Xue Wen


Comment posted December 12, 2010 @ 16:25:36 UTC

Unlike usual audio features, which are directly computable from the waveform or spectrum and represent the whole audio in some way, the harmonic sinusoidal model captures only the features of ONE pitched component in the sound, so it is not advisable to use any variable computed from this model as a feature of the whole audio. Take speech/music classification as an example: a feature computed from the harmonic sinusoidal model will be useful for telling whether a harmonic component within the sound is more likely to be voiced speech or a musical instrument note, but it will not tell you whether the sound, as a mixture of various components, is speech or music, unless the sound is monophonic. At the Munich convention last year we demonstrated how to transform one spoken phoneme into another, or to transform speech into a musical instrument and back. This indicates that the model holds enough information for phoneme or instrument classification.

By the way, sound examples used in this paper can be found by clicking here.

