January 7th 05, 04:31 PM, posted to uk.rec.audio
Jim Lesurf
Subject: DBT a flawed method for evaluating Hi-Fi ?

In article , Richard Wall
wrote:

Hence I would not share your blanket conclusion, as the 'solution' will
depend on what we are trying to decide, and how we proceed.

Some changes can be detected electronically if the material remains
in the digital domain, but once the sound has left the speakers there
are too many variables. For our Hi-Fi club we still use what is, for
us, a less flawed method: A-B comparison (preferably blind) with
repetition of a range of tracks, followed by a longer-term evaluation
over the next few weeks. I look forward to any proof that DBT for
Hi-Fi has been validated.


I'm afraid that "proof" isn't really something that experimental
science provides. Experiments provide *evidence* in terms of results
which then have to be assessed by understanding the experimental
process actually applied in the specific case. We can then decide what
may have been established as either reliable or unreliable. Science is
not a matter of "proof", but of testing to see if a hypothesis is
supported or confounded by suitable tests.

Sorry, I thought that the above was sufficient to be defined as proof.


Afraid I don't know what "above" you are referring to.

The "Some changes..." sentence is a statment indicating that there are
variables including the effect of the room acoustic.

The "For out Hi-Fi club..." sentence describes briefly a method you use for
comparisons.

I don't see how either of these "defines" what you mean by "proof" that a
DBT has been "validated". You point out there are practical problems and
hence limitations on the reliability of any observational results. But you
then change to requiring "proof" of something. I was pointing out that
"proof" is an inappropriate concept in this context, if I have understood
you correctly.

When you say you use "A-B comparison (preferably blind)" do you mean
ABX? I'd be interested to know what protocol and method you use and
feel is better than what Iain is proposing.

I cannot offer a better test protocol for Iain as I fear that they will
all be affected by the listener.


What do you mean by "affected" here?

The point of the ABX approach is that if the listener can't really tell the
difference between A and B (even if they *think* they can) then their
'identifications' of X as A or B will become randomised. If they can tell
to a limited extent, then they will identify 'correctly' slightly more often
than random chance. If the difference is obvious, they will identify
'correctly' almost every time. In each case, given enough tries, we can then
assess the results in terms of statistical significance.
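
To put a number on that: under the null hypothesis that the listener is
guessing, the number of correct calls out of n trials follows a binomial
distribution, so the chance of any given score arising by luck is easy to
work out. A rough Python sketch (my illustration only - the 12-out-of-16
figures are made up for the example):

from math import comb

def abx_p_value(correct, trials):
    # Probability of scoring at least `correct` out of `trials`
    # when every call is a 50/50 guess (the null hypothesis).
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(abx_p_value(12, 16))  # ~0.038: unlikely to be pure luck at the 5% level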

I agree that the test situation differs from just sitting down and enjoying
the music. But this is a matter of "what hypothesis do we wish to test?"

I remember attending one of the London shows where a supplier had an
amplifier with standard capacitors and "special" capacitors (Black Gates
??). The amp was connected to a pair of headphones and you had a switch
to change between A and B; once you had convinced yourself whether there
was a difference, a flap could be lifted to show which was which. I also
remember thinking that the "special" capacitors sounded slightly clearer,
but at the price premium I was not about to try replacing all the ones
in my amp.


You said "difference" but then went on to say "slightly clearer". These are
not the same issue. The intent of ABX tests is that they provide a way to
test if we can tell a "difference" regardless of which we might prefer, or
what opinions we might form about the audible nature of the difference.

Also, in the test you describe, how do we know that the cap values were
identical, etc.? How many times were you able to repeat the test with the
caps' identities randomised as to which was A and which was B? Unless you
do these sorts of things, all you are saying is that you preferred one switch
setting to the other on that one occasion, in that single set of
circumstances. Once. The problem is that this is not a very useful test for
anything other than what you preferred at that show at that time.

Our evaluation procedure is very rudimentary. We start with the system
(say A) as is and listen for about 40 minutes, then listen to three
specific tracks before changing to component B. We then listen to the
same three tracks. If a difference is significant it can usually be
heard by all attendees within the first few bars; however, the opinion
as to whether this represents an improvement is not always unanimous, and
not always the same for all three of the tracks. If the general perception
has been of a benefit we usually listen for the rest of the evening in
the B configuration before finally returning to A to repeat the three
tracks. In some venues the equipment is in another room or away from
the listening area, allowing some changes to be made, or not made, out of
view of the listeners. Whilst we try to keep the volume setting the
same, this is not always possible, and alcohol is partaken of. I am
sure most of the differences we hear are not due to component changes.


As I think you will be aware, the above is of limited value for various
reasons.

My big problem with the advocates of ABX/DBT is their opinion, based on
their claimed tests, that most components sound the same, and that the
results obtained by these tests prove this.


Well, what you seem to be saying here is that you do not like the
implications of the results, and thus wish to reject the test method.

However, look at this another way. The aim of ABX is simply to see if people
can show - to a given level of statistical confidence, and in a specific
set of circumstances - that they can reliably hear and identify a
difference.

Failure does not "prove" in your own terms that the components "sound the
same" in absolute terms. What it may do is support the view that the tested
items, in the tested situation, produced differences that were too small to
show as having statistical reliability.

Thus there may be a small difference, or one that would show up more
clearly in circumstances that the test did not investigate.

However if you wish to propose such a hypothesis, then you really need to
produce some other reliable test that can be used to gather evidence that
would either support or confound your idea.
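
As a rough illustration of how a small real difference can hide: suppose,
purely for the sake of argument, a listener whose true chance of a correct
ABX call is 60%. With 16 trials and the usual 5% criterion (again a Python
sketch with invented figures, not a claim about any actual test):

from math import comb

def tail(n, k, p):
    # Probability of at least k correct out of n trials when each
    # call is correct with probability p.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 16
# Smallest score that a pure guesser reaches less than 5% of the time:
k_crit = next(k for k in range(n + 1) if tail(n, k, 0.5) < 0.05)
print(k_crit)                # 12
print(tail(n, k_crit, 0.6))  # ~0.17: only about 1 test run in 6 reaches
                             # significance, despite the real difference

So a null result from a short test is weak evidence of "no difference"; a
longer run of trials would be needed to settle the point either way.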


My experience is to the contrary, in that upgrades I have made in CD
player, amplifier and others still, to my hearing, sound like upgrades, and
when the old component is slotted back into the system I can hear the
difference. I am not happy with simple A/B either, as this can easily
create false positives, and I have found that if we spend a lot of time
switching from A to B and C, backwards and forwards, at the end of an
evening I am tired.


I think that ABX is intended to help with this, as it helps discriminate
against false positives. However, if the differences are so small, does it
matter?
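
A toy simulation of the false-positive point (the 4-out-of-5 criterion here
is invented for the example): if a listener who genuinely can't tell the
units apart makes a handful of quick A/B impressions each evening, chance
alone will deliver an apparently 'clear' preference surprisingly often.

import random

random.seed(1)
evenings = 100000
false_alarms = 0
for _ in range(evenings):
    # Five quick calls from a listener who is really guessing; count
    # the evening as 'B clearly better' if at least 4 of 5 favour B.
    favour_b = sum(random.random() < 0.5 for _ in range(5))
    if favour_b >= 4:
        false_alarms += 1
print(false_alarms / evenings)  # ~0.19: about one evening in five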



What has, however, shocked me recently is mains cables, which
I have always felt should make no difference at all. I now, however, have
a load of Kimber cables !!!


Afraid that my main (pun :-) ) requirement for mains cable is that it has
to reach from the distribution board to the unit. :-)

Beyond that, I don't think I've ever heard any differences I could ascribe
to mains cable. If I did, I would be worried about the PSU in the units in
question.

I have heard filters make a difference with kit that allowed RF or clicks
through, but that is quite a different issue.

Slainte,

Jim

--
Electronics http://www.st-and.ac.uk/~www_pa/Scot...o/electron.htm
Audio Misc http://www.st-and.demon.co.uk/AudioMisc/index.html
Armstrong Audio http://www.st-and.demon.co.uk/Audio/armstrong.html
Barbirolli Soc. http://www.st-and.demon.co.uk/JBSoc/JBSoc.html