Many recent publications in audio research present subjective evaluations of audio quality based on the Recommendation ITU-R BS.1534-1 (MUSHRA, MUltiple Stimuli with Hidden Reference and Anchor). This is a very welcome trend because it enables researchers to assess the implications of their developments. The evaluation of listening tests, however, sometimes suffers from an incomplete understanding of the underlying statistics. The present paper aims at identifying the causes for the pitfalls and misconceptions in MUSHRA evaluations. It exemplifies the impact of falsely used or even misused statistics. Subsequently, schemes for evaluating the listeners' judgments that are well-grounded on statistical considerations comprising an understanding of the concepts of statistical power and effect size are proposed.
Authors:
Nagel, Frederik; Sporer, Thomas; Sedlmeier, Peter
Affiliations:
Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany; Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany; Technical University of Dresden, Dresden, Germany (See document for exact affiliation information.)
AES Convention:
128 (May 2010)
Paper Number:
8146
Publication Date:
May 1, 2010
Subject:
Listening Tests and Evaluation Psychoacoustics