September 23rd 04, 05:00 PM posted to uk.rec.audio
Stewart Pinkerton
Older seperates vs new system

On Thu, 23 Sep 2004 14:14:11 +0100, "Keith G"
wrote:


"John Corbett" wrote in message
...
In article , Stewart Pinkerton
wrote:

... level-matched time-proximate ABX (and ABChr) testing
has proven over many decades to be the *most* sensitive test
for audible differences in sound quality.


But typical ABX tests are often not as sensitive as they are thought to
be.
ABX is an elegant scheme for data collection, but data collection is only
part of an experiment. It is the entire experiment's sensitivity that
matters.
Recall that a test is *sensitive* for a difference if the test is likely
to detect that difference when the difference is present; a test is
*specific* (i.e., selective) if it is unlikely to report a false positive
result (when the difference is not present).

Here is an example:

If someone does not ever detect a difference, he will still get correct
answers on 50% of trials in the long run just by random guessing.
If someone always detects a difference, he will of course be able to score
100% correct. Often people take as a threshold the size of difference
where someone would get 75% correct answers in an infinite sequence of
repeated trials.

Consider a difference large enough that a certain listener would get the
correct answer for 90% of all trials (well above the 75% threshold).

If we did an ABX test with just one trial for that subject, the
sensitivity would be .90 but the chance of a false positive would be
.50---way too high.

So, we do a test with 16 trials with a passing score if the subject gets
at least 14 correct responses. Now the type 1 error risk is small (< .01)
but the sensitivity is only .79. In other words, we have made a more
specific test, but it is _less_ sensitive than a single-trial test!
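
The figures for the 14-of-16 test can be checked with a short binomial calculation. This is only a sketch: it assumes the trials are independent and that the listener's chance of a correct answer is the same on every trial, and the function name pass_probability is made up for illustration.

```python
from math import comb

def pass_probability(p_correct, n_trials, min_correct):
    """Chance of at least min_correct right answers in n_trials,
    assuming independent trials with a fixed per-trial hit rate."""
    return sum(
        comb(n_trials, k) * p_correct**k * (1 - p_correct) ** (n_trials - k)
        for k in range(min_correct, n_trials + 1)
    )

# Pure guessing (50% per trial): chance of passing by luck (type 1 error).
type1_risk = pass_probability(0.5, 16, 14)

# Listener who is right on 90% of trials: chance of passing (sensitivity).
sensitivity = pass_probability(0.9, 16, 14)

print(f"type 1 risk = {type1_risk:.4f}, sensitivity = {sensitivity:.2f}")
# prints: type 1 risk = 0.0021, sensitivity = 0.79
```

Under those assumptions the false-positive risk comes out at roughly .002, comfortably under .01, while the sensitivity for the 90% listener is about .79.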

A difference has to be so large that subjects get correct answers on about
95% of individual trials before a 14-of-16 test is as sensitive as a
single-trial test.
(A 12-of-16 test is less specific, but far more sensitive.)
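
The crossover point and the 12-of-16 comparison can be sketched the same way. As before, this assumes independent trials with a fixed per-trial hit rate; the helper name tail and the 0.001 search step are illustrative choices, not part of the original argument.

```python
from math import comb

def tail(p, n, k_min):
    """P(at least k_min successes in n independent trials with hit rate p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# A single-trial test has sensitivity equal to the per-trial hit rate p.
# Scan upward in steps of 0.001 for the first p at which the 14-of-16
# test is at least that sensitive.
crossover = next(
    p / 1000 for p in range(501, 1000) if tail(p / 1000, 16, 14) >= p / 1000
)

# The 12-of-16 criterion: false-positive risk under guessing, and
# sensitivity for the listener who is right on 90% of trials.
type1_12 = tail(0.5, 16, 12)
sens_12 = tail(0.9, 16, 12)

print(f"crossover hit rate: {crossover:.3f}")
print(f"12-of-16: type 1 risk = {type1_12:.3f}, sensitivity = {sens_12:.2f}")
```

With these assumptions the crossover lands between .94 and .95, in line with the "about 95%" figure, and the 12-of-16 test trades a type 1 risk of a little under .04 for a sensitivity of about .98 at the 90% hit rate.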




Didn't understand a word of that myself, but I await the response to it with
eager anticipation!


He's quite right statistically, but of course his point is irrelevant,
since there's no point in high sensitivity without high confidence -
just as in measurement, there's no point in high resolution without
high accuracy, viz the myth that 24-bit digital has *any* higher
resolution than 16-bit - if you're using analogue tape sources.
--

Stewart Pinkerton | Music is Art - Audio is Engineering