Seth Brown, a “data scientist” with a Ph.D. in computational genomics, has done several experiments about the best way to make coffee. In one, he compared other people’s burr grinders to his blade grinder. There was no clear difference in taste. In another, an Aeropress apparently produced better-tasting coffee than drip extraction. He hasn’t found other factors that matter. If I drank coffee, I’d be happy to know these things.
If I were teaching how to do experiments, his work would be a good case study. I’d have my students read it and suggest improvements. The contrast between his data analysis (sophisticated) and experimental design (unsophisticated) is striking, maybe because he has no background in experimentation.
Here’s what I would have done differently:
1. Study my reactions, not the reactions of guests. He had house guests rate the coffee he made. Yet he brews coffee for himself much more often than for others, or at least he gives that impression. Since his main customer is himself, it isn’t clear why other people’s opinions are more important than his own. Maybe he read somewhere that blinding is good and thought it would be easier to achieve if other people did the ratings. But he could have rated his own coffee blinded: put stickers on the bottom of identical cups, then shuffle the cups. However, since he will usually make coffee unblinded (he will know how he made it), it isn’t clear that blinding is good.
2. No “control” experiments. In a “control” experiment, he asked guests which of two identically made cups of coffee was better. He doesn’t say what he learned from this; apparently nothing.
3. Simultaneous presentation. He gave guests two cups of coffee made differently and asked which they preferred. Apparently he gave them one cup at a time. Simultaneous presentation, allowing them to go back and forth, would have allowed much better discrimination. Maybe the two types of grinder differed but his experiment was too noisy to detect this.
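As a rough sketch of points 1 and 3, here is how a blinded, side-by-side self-test might be set up and scored in Python. The function names (`blind_pair`, `sign_test_p`) are hypothetical, and the sign test is an assumed simple analysis chosen for illustration, not the analysis Brown used. The method labels "blade" and "burr" come from his grinder comparison.

```python
import random
from math import comb


def blind_pair(methods=("blade", "burr"), rng=random):
    """Randomly assign two brewing methods to cups labeled 'A' and 'B'.

    The returned dict is the hidden key, the software equivalent of
    stickers on the bottom of identical cups. Look at it only after
    you have recorded which cup you prefer.
    """
    shuffled = list(methods)
    rng.shuffle(shuffled)  # in-place shuffle; rng may be a seeded random.Random
    return dict(zip(("A", "B"), shuffled))


def sign_test_p(wins, trials):
    """Exact two-sided sign test p-value.

    Gives the probability, assuming no real preference (each method
    equally likely to be chosen), of a split at least as lopsided as
    `wins` out of `trials`.
    """
    k = max(wins, trials - wins)  # size of the larger side of the split
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)  # double the one-sided tail, capped at 1
```

For each tasting, call `blind_pair()`, taste cups A and B side by side, write down the preferred label, and only then consult the key. With hypothetical counts, preferring one method in 8 of 10 tastings gives `sign_test_p(8, 10)` = 0.109375, while 10 of 10 gives about 0.002.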
In a footnote he wrote:
Ideally, I would have liked to use better control conditions [he appears to realize that there was something wrong with his control experiment — SR], larger sample sizes, more thorough subject randomization [I have no idea what this means; his designs are within-subject. In within-subject experiments, subjects are not randomized — SR], and a more consistent testing environment.
All of these changes would have made his experiments more difficult. Maybe he has internalized the rule that harder is better.
The beginning of wisdom about science is roughly the opposite: do the simplest, easiest thing that will tell you something. We always know less than we think, so make as few assumptions and as little investment as possible. The easier your experiment, the less you will lose if you make a wrong assumption. The smaller your sample size, the more resources (time, money, subjects, energy) you will have left over for other experiments. Brown’s experiments would have been easier if he had studied himself. By studying others, he made an untested assumption that they resembled him.
I’ve done dozens of tea experiments in which I compared tea brewed two different ways. The main things I’ve learned, besides best brew times and best amounts of tea to use, are:
1. Rinse tea before brewing. It eliminates a kind of dirty taste.
2. Combine chocolate tea and black tea. The combination is better than either alone.
3. A little bit of salt helps.