April 18, 2007
I am incapable of providing timely commentary on things … this has been stewing for a couple of weeks now.
The MiniBooNE collaboration recently released initial results searching for muon neutrinos turning into electron neutrinos. A very nice and detailed discussion of the physics and experiment is here, and I won’t repeat it; instead I’m going to talk a bit about the kind of analysis they did — a “blind analysis.”
“Blind trials,” famous from medical studies, attempt to eliminate biases by keeping the test subjects from knowing whether they’re getting Coke or Pepsi (less frivolously, a treatment or a placebo). Double-blind trials hide this information from the experimenters as well. If the subjects in a taste test know they’re getting Coke or Pepsi, their reactions may have nothing to do with how the drinks actually taste; if doctors treating a patient know if the patient is actually getting a sugar pill, they could (even just subconsciously) affect the patient by behaving differently. Double-blind experiments are critical for good results when living subjects are involved.
Things are a little different for particle physics. Quantum mechanics doesn’t care what subconscious signals we give out; the first aspect of a double-blind trial – keeping the subject in the dark – is automatically satisfied. However, experimenters can influence how they collect and analyze their data, and so can introduce biases that way: the second aspect is generally not addressed.
Experimenter bias has historically been dealt with by consciously trying to be unbiased, but it’s not hard for good intentions to go wrong. Suppose you’re making a measurement of a well-known physical quantity. You find that you are way off. You root around and you find a problem with what you’ve done, which when fixed brings your result into agreement with the “correct” value. Do you stop and declare victory? Well, you shouldn’t necessarily — things that are less obvious, or which compensate each other, may still be incorrect. However, most of us will, in fact, stop. The net effect is a bias towards previous results, which may themselves be wrong (or at least not known to high enough precision), and it’s hard to avoid this.
A related issue arises in searches for new phenomena. If you see a small discrepancy between what you observe and what you expect at some particular value of the Higgs mass, the temptation is to focus on it, see whether the events are special, and so on; but one doesn’t do the same amount of work when the observation and expectation appear to agree. It’s hard to tell how significant such discrepancies are, because there were lots of places you might have seen bumps but didn’t.
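This “look-elsewhere” problem is easy to see in a toy simulation (everything here is invented, not real data): scan pure noise across many bins and look only at the most discrepant one.

```python
import random

# Toy illustration of the look-elsewhere effect (all numbers invented).
# With no real signal anywhere, the *most* discrepant of many bins
# still tends to look striking.
random.seed(2007)
n_bins = 100

# Per-bin deviations from expectation, in units of sigma -- pure noise:
deviations = [random.gauss(0.0, 1.0) for _ in range(n_bins)]

# The biggest "bump" among 100 background-only bins is often
# in the 2-3 sigma range, purely by chance.
print(max(deviations))
```

A single bin fluctuating to 2.5σ sounds exciting; the largest of a hundred bins doing so is expected.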
To work around these issues, people use “blind analyses.” I first became aware of these when KTeV used one for its analysis of CP violation in kaons. The main idea is to prevent the experimentalist from seeing “the answer,” in whatever form it might take, until the very end. The act of unblinding is supposed to be the last thing you do, and you are stuck with the result you get: if you change it you’ve negated the whole point of doing it blind!
I’ve heard of blinding being done in a couple of ways. If you’re trying to measure a quantity precisely, one way to do it is to arrange for an offset (unknown to you) to be combined with the result you see while you’re preparing your analysis. You can then proceed as you would normally, except that your ability to tweak the real result to agree with previous knowledge is gone.
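As a toy sketch of the offset scheme (the function names and numbers are my invention, not any experiment’s actual machinery), a trusted script could hold the secret offset while the analyst only ever sees blinded values:

```python
import random

# Hypothetical sketch of offset blinding. A trusted party generates the
# secret offset from a seed; the analyst works only with blinded values
# until the agreed-upon unblinding day.

def make_blinding_offset(seed, scale=0.5):
    """Generate a reproducible secret offset, kept away from the analyst."""
    rng = random.Random(seed)
    return rng.uniform(-scale, scale)

def blind(true_value, offset):
    """What the analyst sees while developing the analysis."""
    return true_value + offset

def unblind(blinded_value, offset):
    """Done exactly once, at the very end -- and you keep what you get."""
    return blinded_value - offset

secret = make_blinding_offset(seed=20070418)
measured = 1.234                    # hypothetical raw measurement
safe_to_look_at = blind(measured, secret)
print(safe_to_look_at)              # tweaking cuts against this is harmless
print(unblind(safe_to_look_at, secret))
```

The analyst can tune cuts and chase systematics against the blinded number all they like; since the offset is unknown, there is no way to nudge the result toward the “correct” value.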
Alternatively, you can hide the data of interest from yourself — this was the MiniBooNE approach. You choose a class of events that would contain the signal you seek. These selection criteria create a “box” that you keep “closed”: you arrange not to see any passing events. You calibrate your understanding of the detector with data sharing some commonality with what you’re interested in — the same particles detected in a different configuration, for example. You select these “sideband” regions as well as you can to test for all the effects you can imagine would give you the wrong answer. Once you think you understand all the physics going on “around” the box, you have some confidence that you understand what to expect in the box, and then you can “open” it.
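The box/sideband bookkeeping might look something like the following sketch (the cuts and event fields are entirely invented, not MiniBooNE’s actual selection):

```python
# Hypothetical closed-box selection. Events passing the signal criteria go
# into the box, which stays sealed; everything else feeds the sidebands
# used to calibrate the detector understanding.

def classify(event):
    """Route an event to the closed box or to a sideband region."""
    if event["is_electron_like"] and 200.0 < event["energy_mev"] < 1250.0:
        return "box"               # signal-like: do NOT look until unblinding
    if event["is_electron_like"]:
        return "sideband_energy"   # right particle type, wrong energy range
    return "sideband_muon"         # muon-like events, useful for calibration

events = [
    {"energy_mev": 600.0, "is_electron_like": True},
    {"energy_mev": 50.0,  "is_electron_like": True},
    {"energy_mev": 800.0, "is_electron_like": False},
]

regions = {}
for ev in events:
    regions.setdefault(classify(ev), []).append(ev)

# The analyst studies only the sidebands; even the count of events in
# "box" might be withheld until the box is formally opened.
print({name: len(evs) for name, evs in regions.items()})
```

The protocol lives in the routing: nothing downstream of `classify` ever reads the `"box"` entry until unblinding.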
You may already have seen the danger with the second kind of blinding: aberrations that show up in the box may not be visible anywhere else. For example, one of the first LIGO results searched for short, bursty gravitational waves; when they opened the box, they found an event, which was almost immediately attributed to an airplane flying over a detector — but, following the blinding protocol, they couldn’t remove it from their data sample.
MiniBooNE went through a very involved unblinding procedure to obtain their result. They performed the neutrino oscillation fit, and had the software tell them how consistent the fit was with the observed data (still in the closed box!) without revealing the fit parameters. In short, this told them whether they could give a reasonable description of the box, without actually revealing what the description was. In fact, this revealed a problem, and did so before they had committed themselves to a full box-opening. They were able to tighten their selection before going any further. It turns out there is an excess of low energy events (the origin of which, as far as I know, is still unknown, but doesn’t seem to be oscillations), which would have seriously mucked up their result if they hadn’t been able to remove it from their fit. MiniBooNE illustrates the benefits of closed-box analysis (they might have spent a lot of time trying to get the excess to go away and stopped when it looked like a no-oscillation result), the dangers (they didn’t fully understand the box), and an interesting approach to trying to detect such problems beforehand (a sort of non-invasive box examination).
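One could imagine implementing such a “non-invasive box examination” along these lines (a toy sketch with a brute-force grid scan; the function names and model are my invention, not MiniBooNE’s actual procedure): report only the goodness-of-fit of the best description of the box, never the fitted parameters themselves.

```python
# Hypothetical blind consistency check: the software fits the (hidden) box
# data and reports only the best chi-squared -- the analyst learns whether
# *some* description is acceptable, but not what that description is.

def blind_consistency_check(model, box_data, param_grid):
    """Return the best chi-squared over a parameter grid, discarding
    which parameter value achieved it."""
    best_chi2 = min(
        sum(((obs - model(x, p)) / err) ** 2 for x, obs, err in box_data)
        for p in param_grid
    )
    return best_chi2           # the winning parameter is deliberately dropped

# Toy example: a one-parameter linear model and a coarse grid of slopes.
model = lambda x, slope: slope * x
box_data = [(1.0, 2.1, 0.1), (2.0, 3.9, 0.1), (3.0, 6.2, 0.1)]
slope_grid = [0.5, 1.0, 1.5, 2.0, 2.5]

chi2 = blind_consistency_check(model, box_data, slope_grid)
print(chi2)   # an outlandishly large value here flags trouble
              # before the box is ever opened
```

A bad χ² at this stage — as MiniBooNE found — is a license to go back and tighten the selection, since nothing about the answer itself has been revealed.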
What about me? Strict blind analyses are painfully time- and labor-intensive, and I’ve never actually done one. My current work is sort of blind, in that it is next to impossible to figure out the final answer without running a specific program, and I can avoid doing that (“obscurity through laziness”). However, I don’t have protocols that forbid me from fixing an obvious mistake after the program has been run. (I could certainly implement a “χ² consistency check” before I let myself see the fit values, though. I’ll think about it.)