AES Journal Forum

An Objective Audio Quality Measure Based on Power and Envelope Power Cues

The generalized power spectrum model (GPSM), which has been shown to account for a large number of psychoacoustic and speech intelligibility (SI) experiments, was extended to assess audio quality. Like the GPSM, the suggested audio quality model, GPSMq, combines features from the power spectrum model (PSM) and the envelope power spectrum model (EPSM). GPSMq uses signal-to-noise ratios (SNRs) in the power and envelope power domains to model the addition or removal of energy by the signal processing under test. Four audio quality databases that introduce linear and nonlinear distortions to music and speech signals were assessed to cover a wide variety of distortion cases. GPSMq provided better overall prediction performance than other state-of-the-art auditory-model-based objective quality measures. The results demonstrate that the power and envelope power SNR metric is appropriate for predicting audio quality across a variety of signal distortions, in addition to predicting psychoacoustic and SI data. This supports the notion that the auditory system extracts a universal set of auditory features that are then analyzed in a task-dependent decision stage.
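As a rough illustration of what an SNR in the power domain and in the envelope power domain can look like, the following minimal Python sketch compares a reference signal with a processed version of it. It is not the GPSMq implementation: the abstract does not specify the model's stages, so everything here is an assumption made for illustration. In particular, the sketch works on the broadband signal with no auditory or modulation filterbank, treats the difference between processed and reference signal as the "noise", and defines envelope power as the variance of the Hilbert envelope. All function names are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log of zero


def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with an FFT-based Hilbert transform."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Weights that keep DC (and Nyquist), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def power_snr_db(reference, processed):
    """Power-domain SNR in dB; the difference signal is treated as noise (assumption)."""
    noise = processed - reference
    signal_power = np.mean(reference ** 2)
    noise_power = np.mean(noise ** 2)
    return 10.0 * np.log10((signal_power + EPS) / (noise_power + EPS))


def envelope_power(x):
    """AC power of the Hilbert envelope, i.e. its variance (illustrative definition)."""
    return np.var(hilbert_envelope(x))


def envelope_power_snr_db(reference, processed):
    """Envelope-power-domain SNR in dB, using the same difference-signal assumption."""
    noise = processed - reference
    return 10.0 * np.log10(
        (envelope_power(reference) + EPS) / (envelope_power(noise) + EPS)
    )


if __name__ == "__main__":
    fs = 16000  # sample rate in Hz (arbitrary choice)
    t = np.arange(fs) / fs
    # Reference: 1 kHz tone with 4 Hz amplitude modulation.
    reference = (1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.cos(2 * np.pi * 1000 * t)
    # Processed: reference plus a small amount of white noise.
    rng = np.random.default_rng(0)
    processed = reference + 0.05 * rng.standard_normal(len(t))
    print("power SNR (dB):", power_snr_db(reference, processed))
    print("envelope power SNR (dB):", envelope_power_snr_db(reference, processed))
```

A full auditory model would typically compute such ratios per frequency band and per modulation band before combining them; the sketch deliberately omits those stages.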

JAES Volume 66 Issue 7/8 pp. 578-593; July 2018

