Older separates vs new system
On Sun, 19 Sep 2004 07:45:08 +0000 (UTC), "Alan Murphy"
wrote: "Stewart Pinkerton" wrote in message .. . On Fri, 17 Sep 2004 07:53:59 +0000 (UTC), "Alan Murphy" wrote: I'll bet you an even £1000 that on my system, playing my music, I can tell the difference between a Technics CD player SL-PG490 alone, and the same player with a Meridian DAC 203 optically connected, in more than 67% instances." ~~~~~~~~~~~ That's simple enough isn't it? What is it that you don't understand? What is confusing you? What I don't understand is why you're so terrified of simply setting the volumes to the same level. Scared to admit that you wasted money on the DAC? Your crude and patronising approach to subtle audio differences, offering a paltry sum of money to take part in an A/B test that masks all but gross differences, is doing the industry a disservice and is as distasteful as snake oil, IMHO. My approach is certainly not crude, it's the same method used by the Subjective Evaluation Group at Harman International (brand names include Mark Levinson, Madrigal and JBL, among others). ABX testing is used because it *reveals* the most subtle differences, which are often concealed by other methods, sighted listening being demonstrably useless for discriminating subtle differences. That's why Harman (and B&W, KEF etc etc) use it every day in product development. Its other great strength is that it demonstrates the nonexistence of *imagined* differences, such as those between cables, so that the development engineer can check if that 'audiophile approved' component really does make an audible difference, or if he just wanted it to. What is 'patronising' is your presumption (with no actual proof) that you really can hear a difference in sound *quality* between your player and your DAC. Since you think that £1,000 is a paltry sum (even though *you* suggested it), let's make it an even £10,000 wager. You've read the small print and are in complete agreement? What 'small print'? Levels are equalised to +/- 0.1dB, the test protocol is double-blind, and you need to score more than 20 correct out of 30 trials. That's it. Unless of course you're thinking of something fundamentally underhand, like turning the volume *way* up and just listening to the noise floors. That would of course be cynical and pointless, and no cigar. Another clown tried that game once, with amplifiers. If you think you can tell a difference in sound quality on music, then that's just fine, you can choose any partnering gear and any music you like, as you stated in your original challemge at the top of this post. All I ask is an *honest* comparison, as always. Let me know when you've remortgaged the house and I can arrange escrow. Cash available whenever you need the only glimpse of it you'll ever get, and when you've ponied up your end. Care to suggest a neutral test proctor? -- Stewart Pinkerton | Music is Art - Audio is Engineering |
Older separates vs new system
On Sun, 19 Sep 2004 11:34:38 +0000 (UTC), Stewart Pinkerton
wrote: [big snip of the wager exchange, quoted in full above] What 'small print'? Levels are equalised to +/- 0.1dB, the test protocol is double-blind, and you need to score more than 20 correct out of 30 trials. That's it.

For such a small statistical population, 20 out of 30 is extraordinarily generous, especially considering that the subject would be claiming something very close to 100%, particularly if they get to choose the listening material. Although in strict chi-squared terms it is a good score, Sod's law dictates that it can happen quite easily by chance. Having achieved the 20, it would be necessary to do it again to show it wasn't a fluke. d Pearce Consulting http://www.pearce.uk.com
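To put Don's point about chance in numbers: under pure guessing each trial is a coin flip, so the odds of clearing the 20-of-30 bar follow directly from the binomial distribution. A minimal sketch in Python, assuming nothing beyond the criterion Stewart states above:

```python
# Chance of passing the 20-of-30 criterion by guessing alone
# (p = 0.5 per trial), and of doing it twice running as Don suggests.
from math import comb

def p_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_pass = p_at_least(20, 30)
print(f"P(>=20/30 by chance)  = {p_pass:.4f}")     # ~0.0494, about 1 in 20
print(f"P(two passing runs)   = {p_pass**2:.6f}")  # ~0.0024, about 1 in 400
```

About one guesser in twenty would pass a single 30-trial run, which is presumably why Don wants the score repeated: the chance of fluking two consecutive runs drops to roughly 1 in 400.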
Older separates vs new system
Don't they still sell the ATC kits?? Yes they do, and they're much better value than anything from the usual brand names etc., IMO. Actually very simple to build. For a paint finish, take them to a car body spray shop to be filled, sanded and painted; candy lacquers add amazing depth to paint jobs. Veneering can be tricky to do yourself without a vacuum press, but a local cabinet maker would do it for you for a small fee. Stew.
Older separates vs new system
In article , Stewart Pinkerton
wrote: ... level-matched time-proximate ABX (and ABChr) testing has proven over many decades to be the *most* sensitive test for audible differences in sound quality.

But typical ABX tests are often not as sensitive as they are thought to be. ABX is an elegant scheme for data collection, but data collection is only part of an experiment. It is the entire experiment's sensitivity that matters. Recall that a test is *sensitive* for a difference if the test is likely to detect that difference when the difference is present; a test is *specific* (i.e., selective) if it is unlikely to report a false positive result (when the difference is not present).

Here is an example: If someone does not ever detect a difference, he will still get correct answers on 50% of trials in the long run just by random guessing. If someone always detects a difference, he will of course be able to score 100% correct. Often people take as a threshold the size of difference where someone would get 75% correct answers in an infinite sequence of repeated trials.

Consider a difference large enough that a certain listener would get the correct answer for 90% of all trials (well above the 75% threshold). If we did an ABX test with just one trial for that subject, the sensitivity would be .90 but the chance of a false positive would be .50 --- way too high. So, we do a test with 16 trials with a passing score if the subject gets at least 14 correct responses. Now the type 1 error risk is small (< .01) but the sensitivity is only .79. In other words, we have made a more specific test, but it is _less_ sensitive than a single-trial test! A difference has to be so large that subjects get correct answers on about 95% of individual trials before a 14-of-16 test is as sensitive as a single-trial test. (A 12-of-16 test is less specific, but far more sensitive.)
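John's figures can be checked directly; the sketch below (the 90%-per-trial listener and the 14-of-16 and 12-of-16 pass marks are all taken from his post) simply evaluates the binomial tail under guessing (p = 0.5) and under detection (p = 0.9):

```python
# Reproducing John Corbett's sensitivity/specificity numbers.
from math import comb

def p_at_least(k, n, p):
    """Probability of k or more correct in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

for passmark in (14, 12):
    alpha = p_at_least(passmark, 16, 0.5)  # false-positive risk under pure guessing
    power = p_at_least(passmark, 16, 0.9)  # sensitivity for the 90%-per-trial listener
    print(f"{passmark}-of-16: type 1 risk = {alpha:.3f}, sensitivity = {power:.2f}")

# Output:
# 14-of-16: type 1 risk = 0.002, sensitivity = 0.79
# 12-of-16: type 1 risk = 0.038, sensitivity = 0.98
```

The numbers show the trade-off he describes: raising the pass mark from 12 to 14 cuts the false-positive risk from about .038 to about .002, but the chance that a genuinely 90%-reliable listener passes falls from .98 to .79.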
Older separates vs new system
"Jim Lesurf" wrote in message
... In article , Alan Murphy wrote: "Don Pearce" wrote in message ... On Thu, 16 Sep 2004 08:48:58 +0000 (UTC), "Alan Murphy" wrote: Because the DAC is much quieter and I can identify it every time :-) What do you mean by quieter - less background noise or less volume?

Both, but I was just trying to make a point really, Don, about the difficulty of establishing proper procedures when testing sensory discrimination. In the visual field, with which I am familiar, very slight alterations in test procedure, such as separating contiguous samples by a few mm or so, can decrease discrimination of colour difference by an order of magnitude.

I can see the above as being a justification for equalising the output levels in order to remove a controllable variable - assuming that the concern is to see if differences of some other kind can be perceived. Assuming that your belief is that you can hear differences which do *not* come from sound level inequalities alone. Please see below as I'd like to clarify your point of view on this...

Presenting the samples, in series, in A/B fashion, further greatly decreases discrimination depending on the time interval between viewings.

The above seems to me to be a generalised assertion - but I am not sure that it can be shown to always be reliable. I'd presume it would depend upon the circumstances and the manner of any 'difference' which the subjects are being tested for perception of.

The differences are still there of course but are masked by the method of testing.

To say "masked" here seems to me to be presupposing that any difference was both due to some effect *other* than the level inequality and was then indeed 'masked' rather than being 'removed' by equalising the levels. Again, I'd like to clarify this below...

Resort to instrumentation is not helpful in judging differences below about 5 - 10 jnds, depending on position in colour space, due to the acuity of the visual system. I suspect the same holds true for auditory differences.

My interest here is not in what you "suspect", but in trying to clarify your views and their implications. To do this I'd like to post a series of questions and I'd be interested in your answers. Apologies if what follows is not clearly expressed...

Firstly, am I correct in understanding that you are saying that you would be willing to do a comparison test if there were no attempt to correct for measurable level differences, but would not be willing to take such a test if the levels were equalised beforehand?

If the answer to the above is essentially "Yes", the next question is: Is this because you are confident you can hear a difference when the levels are different, but not when they have been equalised?

If the answer to the above is essentially "Yes", the next question is: Is it your belief that the anticipated inability to perceive a difference is somehow "masked" (using your term) by level equalisation?

If the answer to the above is essentially "Yes", the next question is: What test do you suggest that can be carried out to distinguish between the hypothesis that the failure to discriminate when the levels are equalised is due to "masking", rather than the alternative hypothesis that the only audible distinction was simply due to a difference in level? i.e. that the difference perceived was solely due to a level difference.
If you cannot suggest a performable test that could falsify one hypothesis and support the other, how can you regard the hypothesis that the level equalisation "masks" rather than simply "removes" the difference as being scientifically or academically supportable?

So far as I am concerned, the above questions assume you could arrange the protocol to choose lengths and patterns of sampling "A"/"B"/"X" as you would feel most reasonable. So - for example - if you'd personally prefer each "A", etc., to be half an hour, or twenty seconds, this can be assumed to be as you prefer.

As a separate point, I would also be curious about your view on how significant a difference may be if it becomes undetectable when the levels are equalised and each player is giving the desired loudness which you might prefer (without knowing which player was in use). As someone who used to design equipment I can appreciate there are times when even tiny effects may be worth pursuing. However, when in normal use, given all the other uncontrolled variables of domestic listening, it does seem to me that many of the 'differences' people debate may be of doubtful concern when simply listening to and enjoying music. Slainte, Jim

I think it might be clearer, Jim, if I just outline my views on the subject and hope that you find this acceptable. I agree that, for the results to be meaningful in AB testing, levels should be equalised, and regret that my devious attempts to wind up Stewart were misinterpreted. I do feel however that AB testing is possibly not a suitable test for revealing differences close to 1 jnd, and is accurate to perhaps 5 jnd. My reasoning is that this is analogous to the case in the visual field where it is possible to view samples both simultaneously and serially, but where discrimination is greatly reduced in the case of serial viewing. As you probably know, the specification of small colour differences is of significant commercial importance, and international standards and an extensive body of research, both published and unpublished, exist. I published an early computer study on equal visual spacing as long ago as 1977 ("A two-dimensional colour diagram based on the sensitivity functions of cone vision", JOCCA, 1977, 60, 307-310).

As for a test to determine whether AB testing is sufficiently sensitive to distinguish small audio differences, I would propose the following: It should be possible to determine minimum audible differences of 1 jnd over a discrete range of frequencies on a test setup, say from 1000 to 15000 Hz at 2000 Hz intervals. A set of digital AB samples to Red Book CD standard at normal listening levels would then be prepared, one of which would be a pure tone at each of the frequencies and the other would be the same tone corrupted at alternate values by positive then negative random increments of digital noise varying from 0 to 5 jnd. My prediction is that an AB test on these samples would not be able to distinguish differences of less than 5 jnd. Over to you.

BTW Europe won the Ryder Cup by a record margin but I am sure that in St Andrews you already knew that :-) Alan.
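For what it's worth, Alan's proposed sample set is straightforward to sketch in code. The version below is only one reading of his description: it takes the "positive then negative random increments" to be level offsets applied to the tone, and assumes a mapping of roughly 1 dB per level jnd; both the interpretation and the mapping are assumptions, not part of his post.

```python
# A sketch of Alan's proposed AB sample set: pure tones 1-15 kHz in
# 2 kHz steps, each paired with a version differing by a random 0-5 jnd,
# quantised to Red Book (44.1 kHz, 16-bit) format.
import numpy as np

FS, DUR = 44100, 2.0   # Red Book sample rate; 2 s per test sample (assumed)
DB_PER_JND = 1.0       # assumed mapping: roughly 1 dB per level jnd

def tone(freq, amp=0.5, offset_db=0.0):
    """Pure sine at freq, optionally shifted in level by offset_db."""
    t = np.arange(int(FS * DUR)) / FS
    return amp * 10 ** (offset_db / 20) * np.sin(2 * np.pi * freq * t)

def to_redbook(x):
    """Quantise to 16-bit PCM, clipping at digital full scale."""
    return np.clip(np.round(x * 32767), -32768, 32767).astype(np.int16)

rng = np.random.default_rng()
pairs = []
for i, freq in enumerate(range(1000, 15001, 2000)):  # 1 kHz to 15 kHz
    jnd = rng.uniform(0.0, 5.0)                      # random size of difference
    sign = 1 if i % 2 == 0 else -1                   # "positive then negative"
    a = to_redbook(tone(freq))                       # uncorrupted reference
    b = to_redbook(tone(freq, offset_db=sign * jnd * DB_PER_JND))
    pairs.append((freq, jnd, a, b))
```

Writing each int16 pair out as 44.1 kHz WAV files would complete the Red Book requirement; the listening test would then present each (a, b) pair under the chosen A/B protocol.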
Older separates vs new system
I took my dog out for a walk.
While it was ****ing on The Milkman's leg, he seemed distracted by: "Triffid" wrote: While it was ****ing on Smeghead's leg, he seemed distracted by:

If you have hands, the ability to use them and a spare moment, you could build a kit loudspeaker from the likes of Wilmslow or Falcon, IPL etc. and have much better bang for the buck. Once upon a time, that was true. Don't they still sell the ATC kits??

They do, but I wouldn't trust my woodworking skills to make a decent bird-table, let alone a cabinet worthy of something of that quality. -- Despite appearances, it is still legal to put sugar on cornflakes.
Older separates vs new system
In article , Alan Murphy
wrote: "Jim Lesurf" wrote in message [big snip to avoid repeating my sequence of questions] I think it might be clearer, Jim, if I just outline my views on the subject and hope that you find this acceptable. I'm afraid that your mode of response does not actually seem 'clearer' to me as you have not dealt with the main issue I was specifically asking you about. (Please see below.) I agree that, for the results to be meaningful in AB testing, levels should be equalised and regret that my devious attempts to wind up Stewart were misinterpreted. OK. I do feel however that AB testing is possibly not a suitable test for revealing differences close to 1 jnd and is accurate to perhaps 5 jnd. [snip] I appreciate that you may "feel" something. I also appreciate that you might be correct. However, since you seem to be arguing on the basis of taking an academic approach founded upon applying the scientific method, my questions were to invite you to apply this to your own statements. As for a test to determine whether AB testing is sufficiently sensitive to distinguish small audio differences I would propose the following: It should be possible to determine minimum audible differences of 1 jnd over a discrete range of frequencies on a test setup, say from 1000 to 15000Hz at 2000 Hz intervals. A set of digital AB samples to Red Book CD standard at normal listening levels would then be prepared, one of which would be a pure tone at each of the frequencies and the other would be the same tone corrupted at alternate values by positive then negative random increments of digital noise varying from 0 to 5 jnd. My prediction is that an AB test on these samples would not be able to distinguish differences of less than 5 jnd. Over to you. My series of questions was partly to establish if I had understood you correctly. Partly to establish what test you had in mind that could be carried out and whose results could distinguish between your hypothesis that the failure was due to 'masking' and the alternative hypothesis that the failure was due to 'removal' of the actual differences. Unfortunately, your reply does not deal with this point. I carefully arranged my series of questions so that all but the last could be answered fairly quickly and simply with a 'yes' or a 'no'. I would have preferred this as it seems clearer to me that your restatement. However the key question was the last one (restated above) so I'd like to know your answer to this. Or do you accept that when you argue that the failure is due to 'masking' this is no more than a personal belief? The outline you give from visual experiments is an analogy. This may or may not be an appropriate analogy. To test this we would require a response to the question which you did not deal with. So. Can you now say what practical test/experiment you can suggest that would be useful to test your hypothesis that the failure is due to 'masking' rather than 'removal' of the audible difference? BTW Europe won the Ryder Cup by a record margin but I am sure that in St Andrews you already knew that :-) Heard about it on the radio. :-) However I tend to leave golf to the tourists. ;- Slainte, Jim -- Electronics http://www.st-and.ac.uk/~www_pa/Scot...o/electron.htm Audio Misc http://www.st-and.demon.co.uk/AudioMisc/index.html Armstrong Audio http://www.st-and.demon.co.uk/Audio/armstrong.html Barbirolli Soc. http://www.st-and.demon.co.uk/JBSoc/JBSoc.html |
Older separates vs new system
"John Corbett" wrote in message ... In article , Stewart Pinkerton wrote: ... level-matched time-proximate ABX (and ABChr) testing has proven over many decades to be the *most* sensitive test for audible differences in sound quality. But typical ABX tests are often not as sensitive as they are thought to be. ABX is an elegant scheme for data collection, but data collection is only part of an experiment. It is the entire experiment's sensitivity that matters. Recall that a test is *sensitive* for a difference if the test is likely to detect that difference when the difference is present; a test is *specific* (i.e., selective) if it is unlikely to report a false positive result (when the difference is not present). Here is an example: If someone does not ever detect a difference, he will still get correct answers on 50% of trials in the long run just by random guessing. If someone always detects a difference, he will of course be able to score 100% correct. Often people take as a threshold the size of difference where someone would get 75% correct answers in an infinite sequence or repeated trials. Consider a difference large enough that a certain listener would get the correct answer for 90% of all trials (well above the 75% threshold). If we did an ABX test with just one trial for that subject, the sensitivity would be .90 but the chance of a false positive would be .50---way too high. So, we do a test with 16 trials with a passing score if the subject gets at least 14 correct responses. Now the type 1 error risk is small ( .01) but the sensitivity is only .79. In other words, we have made a more specific test, but it is _less_ sensitive that a single-trial test! A difference has to so large that subjects get correct answers on about 95% of individual trials before a 14-of-16 test is as sensitive as a single-trial test. (A 12-of-16 test is less specific, but far more sensitive.) Didn't understand a word of that myself, but I await the response to it with eager anticipation! :-) |
Older separates vs new system
On Thu, 23 Sep 2004 14:14:11 +0100, "Keith G"
wrote: "John Corbett" wrote in message ... In article , Stewart Pinkerton wrote: ... level-matched time-proximate ABX (and ABChr) testing has proven over many decades to be the *most* sensitive test for audible differences in sound quality. But typical ABX tests are often not as sensitive as they are thought to be. ABX is an elegant scheme for data collection, but data collection is only part of an experiment. It is the entire experiment's sensitivity that matters. Recall that a test is *sensitive* for a difference if the test is likely to detect that difference when the difference is present; a test is *specific* (i.e., selective) if it is unlikely to report a false positive result (when the difference is not present). Here is an example: If someone does not ever detect a difference, he will still get correct answers on 50% of trials in the long run just by random guessing. If someone always detects a difference, he will of course be able to score 100% correct. Often people take as a threshold the size of difference where someone would get 75% correct answers in an infinite sequence or repeated trials. Consider a difference large enough that a certain listener would get the correct answer for 90% of all trials (well above the 75% threshold). If we did an ABX test with just one trial for that subject, the sensitivity would be .90 but the chance of a false positive would be .50---way too high. So, we do a test with 16 trials with a passing score if the subject gets at least 14 correct responses. Now the type 1 error risk is small ( .01) but the sensitivity is only .79. In other words, we have made a more specific test, but it is _less_ sensitive that a single-trial test! A difference has to so large that subjects get correct answers on about 95% of individual trials before a 14-of-16 test is as sensitive as a single-trial test. (A 12-of-16 test is less specific, but far more sensitive.) Didn't understand a word of that myself, but I await the response to it with eager anticipation! He's quite right statistically, but of course his point is irrelevant, since there's no point in high sensitivity without high confidence - just as in measurement, there's no point in high resolution without high accuracy, viz the myth that 24-bit digital has *any* higher resolution than 16-bit - if you're using analogue tape sources. -- Stewart Pinkerton | Music is Art - Audio is Engineering |
Older separates vs new system
"Jim Lesurf" wrote in message
... In article , Alan Murphy wrote: "Jim Lesurf" wrote in message [big snip of Jim's reply, quoted in full above] So: can you now say what practical test/experiment you can suggest that would be useful to test your hypothesis that the failure is due to 'masking' rather than 'removal' of the audible difference?

When I wrote the original post with the term 'masking' I was not aware that it has a particular meaning and significance in audio science, not being familiar with the literature. Having read some relevant papers I now realise that I should have replaced 'masking' with 'the test is not sufficiently sensitive to reveal differences which may be present when using some other method'. My apologies for not being sufficiently clear.
The test that I proposed above would indeed reveal whether the AB test is insensitive, and to what degree, in accordance with the scientific method. I do not have enough knowledge of the subject to propose a test that would detect low-jnd differences in complex scenarios. Incidentally, during my "googling" I did notice some suggestion that when different signals are presented simultaneously to separate ears, much smaller differences can be detected than when these signals are presented to both ears serially. Any ideas on this? Alan.