AES Conference Papers Forum

Phonetic-Oriented Identification of Twin Speakers Using 4-Second Vowel Sounds and a Combination of a Shift-Invariant Phase Feature (NRD), MFCCs, and F0 Information

Automatic speaker identification typically relies on sophisticated statistical modeling and classification which requires large amounts of data for good performance. However, in actual audio forensics casework, frequently only a few seconds of speech material are available. In this paper, we favor diversity in feature extraction, simple modeling and classification, and constructive combination of congruent classification scores. We use phase, spectral magnitude and F0-related features in speaker identification experiments on a database of 35 speakers most of whom are twins. Using only 4.4 sec. of vowel-like sounds per speaker, we characterize the performance that is reached with individual features and we characterize simple and yet effective ways of classification score fusion. Insights for further research are also presented.
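The abstract does not say which fusion rule the authors use. As a rough illustration of what "classification score fusion" can look like, the sketch below applies one common generic approach: normalize each feature's scores to zero mean and unit variance, then average them (sum-rule fusion). The function names `znorm` and `fuse_scores` and all score values are hypothetical and are not taken from the paper.

```python
import numpy as np

def znorm(scores):
    """Scale one classifier's scores to zero mean and unit variance."""
    scores = np.asarray(scores, dtype=float)
    std = scores.std()
    if std == 0:
        return np.zeros_like(scores)
    return (scores - scores.mean()) / std

def fuse_scores(score_lists):
    """Sum-rule fusion: normalize each feature's scores, then average them."""
    return np.mean([znorm(s) for s in score_lists], axis=0)

# Hypothetical similarity scores of one test sample against 4 enrolled
# speakers, one list per feature stream (toy values, not from the paper).
phase_scores = [0.2, 0.9, 0.4, 0.1]      # phase-based feature
mfcc_scores = [10.0, 14.0, 15.0, 9.0]    # spectral-magnitude feature
f0_scores = [0.5, 0.8, 0.6, 0.3]         # F0-related feature

fused = fuse_scores([phase_scores, mfcc_scores, f0_scores])
best_speaker = int(np.argmax(fused))     # index of the top-scoring speaker
```

In this toy case the MFCC scores alone would favor a different speaker than the phase and F0 scores do. Normalizing first keeps the stream with the largest numeric range from dominating the fused decision.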

