This study examines listeners’ natural ability to identify an anonymous speaker’s emotions from speech samples spanning a broad range of emotional intensity. It compares emotion ratings between posed and spontaneous speech samples and analyzes how basic acoustic parameters are utilized. The spontaneous samples were extracted from the Korean Spontaneous Speech corpus, which consists of casual conversations. The posed samples with emotions (happiness, neutrality, anger, sadness) were obtained from the Emotion Classification dataset. Non-native listeners evaluated seven opposing pairs of affective attributes perceived from the speech samples. Listeners perceived fewer spontaneous samples as having negative valence. The posed samples received higher mean rating scores than the spontaneous ones, but only for negative valence. Listeners reacted more sensitively to posed than to spontaneous speech in negative valence and had difficulty detecting happiness in the posed samples. Spontaneous samples perceived as positive showed higher pitch variance and higher maximum pitch than those perceived as negative. In contrast, posed samples perceived as negative correlated positively with higher values of the pitch parameters. These results can be used to assign specific vocal affects to artificial-intelligence voice agents or virtual humans, rendering more human-like voices.
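The pitch measures the abstract refers to (variance and maximum of the fundamental frequency) can be summarized once a pitch contour has been extracted from a sample. A minimal sketch, assuming the per-frame F0 contour is already available; the function name and the example values below are illustrative, not taken from the paper:

```python
import numpy as np

def pitch_stats(f0_hz):
    """Summarize a fundamental-frequency (F0) contour.

    f0_hz: 1-D array of per-frame pitch estimates in Hz,
    with unvoiced frames marked as NaN (a common convention
    in pitch trackers).
    """
    voiced = f0_hz[~np.isnan(f0_hz)]  # keep voiced frames only
    return {
        "mean_f0": float(np.mean(voiced)),
        "max_f0": float(np.max(voiced)),
        "f0_variance": float(np.var(voiced)),
    }

# Hypothetical rising contour with two unvoiced gaps.
contour = np.array([180.0, 185.0, np.nan, 200.0, 220.0, np.nan, 240.0])
stats = pitch_stats(contour)
```

Under the paper's findings, a higher `f0_variance` and `max_f0` would be expected for spontaneous samples rated as positive, and for posed samples rated as negative.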
Authors:
Oh, Eunmi; Suhr, Jinsun
Affiliations:
Yonsei University; Yonsei University
AES Convention:
155 (October 2023)
Paper Number:
10671
Publication Date:
October 25, 2023
Subject:
Signal Processing
This paper is Open Access.