AES Convention Papers Forum

The Influence of Cross-Modal Interaction on Audio-Visual Speech Quality Perception

Combined audio-visual services are expected to become a feature of the next generation of telecommunication systems. Current systems apply perceptually motivated data reduction to the audio and video streams individually. A system that applies perceptually motivated data reduction to the combined audio-visual content may achieve greater data reduction whilst maintaining a high-quality service over low bit-rate transmission media. To design such bimodal codecs and optimize overall performance through the use of low bit-rate codecs, it is important to understand the trade-offs between the quality of the audio and the quality of the video. This paper compares the results of a complementary pair of subjective experiments designed to investigate the differences between visual-speech and non-visual-speech quality perception and quality mismatch. Conclusions are drawn about the variation in cross-modal interaction, and particularly speech quality perception, for different audio-visual content. The results obtained will be used in the design of a bimodal perceptual model.
