AES Convention Papers Forum

A Zero-Pole Vocal Tract Model Estimation Method Accurately Reproducing Spectral Zeros

Model-based speech coding models the vocal tract as a linear time-variant system. The all-pole model produced by the computationally efficient Linear Predictive Coding (LPC) method represents the majority of speech sounds well. However, nasal and fricative sounds, as well as stop consonants, contain spectral zeros, which require a zero-pole model. Roughly speaking, a zero-pole model estimation method typically makes a non-parametric estimate of the vocal tract impulse response and then tunes the zero-pole model to fit this estimate in a least-squares sense. In this paper we propose an alternative strategy: we tune the zero-pole model to fit the power spectrum of the speech signal directly on a logarithmic scale, consistent with the way the human ear perceives sound. This avoids the error introduced by estimating the vocal tract impulse response and yields a model that reproduces spectral zeros more accurately on a logarithmic scale. A drawback of the proposed method, however, is its computational complexity.
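To make the idea of a log-scale spectral fit concrete, here is a minimal sketch. It is not the paper's algorithm, which the abstract does not specify. It assumes a synthetic target spectrum (one conjugate zero pair and one real pole) instead of a real speech power spectrum, a mean squared error in dB as the cost, and a coarse search over candidate zero radii in place of a real optimizer. All function names (`zero_pair`, `power_spectrum_db`, `log_spectral_error`, `pick_zero_radius`) are hypothetical.

```python
import cmath
import math


def zero_pair(r, theta):
    """Numerator coefficients for a conjugate zero pair at r * exp(+/- j*theta)."""
    return [1.0, -2.0 * r * math.cos(theta), r * r]


def power_spectrum_db(b, a, n_freqs=256):
    """Power spectrum in dB of H(z) = B(z)/A(z) on a uniform grid over [0, pi)."""
    out = []
    for k in range(n_freqs):
        z_inv = cmath.exp(-1j * math.pi * k / n_freqs)
        num = sum(c * z_inv ** i for i, c in enumerate(b))
        den = sum(c * z_inv ** i for i, c in enumerate(a))
        power = abs(num / den) ** 2
        out.append(10.0 * math.log10(max(power, 1e-12)))  # floor avoids log(0)
    return out


def log_spectral_error(b, a, target_db):
    """Mean squared difference, in dB^2, between model and target spectra."""
    model_db = power_spectrum_db(b, a, len(target_db))
    return sum((m - t) ** 2 for m, t in zip(model_db, target_db)) / len(target_db)


def pick_zero_radius(candidates, theta, a, target_db):
    """Coarse search (an assumption, not a real optimizer) over zero radii."""
    best = None
    for r in candidates:
        err = log_spectral_error(zero_pair(r, theta), a, target_db)
        if best is None or err < best[1]:
            best = (r, err)
    return best


# Synthetic target: a spectral zero near the unit circle plus one real pole.
theta = math.pi / 3
a_true = [1.0, -0.9]
target_db = power_spectrum_db(zero_pair(0.95, theta), a_true)

# An all-pole model with the same denominator cannot reproduce the notch.
err_all_pole = log_spectral_error([1.0], a_true, target_db)
err_zero_pole = log_spectral_error(zero_pair(0.95, theta), a_true, target_db)
best_r, best_err = pick_zero_radius([0.5, 0.7, 0.9, 0.95, 0.99], theta, a_true, target_db)

print("all-pole error (dB^2):", err_all_pole)
print("zero-pole error (dB^2):", err_zero_pole)
print("selected zero radius:", best_r)
```

Measuring the error in dB weights the deep notch around a spectral zero much more heavily than a linear-scale error would, which is the motivation the abstract gives for fitting on a logarithmic scale. A practical implementation would replace the coarse radius search with a nonlinear optimizer over all zero and pole parameters, which is consistent with the computational cost the abstract mentions.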


AES - Audio Engineering Society