Coffee Experiments: Suggestions for Improvement

Seth Brown, a “data scientist” with a Ph.D. in computational genomics, has done several experiments on the best way to make coffee. In one, he compared other people’s burr grinders to his blade grinder. There was no clear difference in taste. In another, an Aeropress apparently produced better-tasting coffee than drip extraction. He hasn’t found other factors that matter. If I drank coffee, I’d be happy to know these things.

If I were teaching how to do experiments, his work would be a good case study. I’d have my students read it and suggest improvements. The contrast between his data analysis (sophisticated) and experimental design (unsophisticated) is striking, maybe because he has no background in experimentation.

Here’s what I would have done differently:

1. Study my reactions, not the reactions of guests. He had house guests rate the coffee he made. Yet he brews coffee for himself much more often than for others (at least, he gives that impression). Since his main customer is himself, it isn’t clear why other people’s opinions are more important than his own. Maybe he read somewhere that blinding is good and thought it would be easier to achieve if other people did the ratings. But he could have rated his own coffee blind: put stickers on the bottom of identical cups, then shuffle the cups. And since he will usually make coffee unblinded (he will know how he made it), it isn’t clear that blinding is good.

2. No “control” experiments. In a “control” experiment, he asked guests which of two identically-made cups of coffee was better. He doesn’t say what he learned from this — apparently nothing.

3. Simultaneous presentation. He gave guests two cups of coffee made differently and asked which they preferred. Apparently he gave them one cup at a time. Simultaneous presentation, allowing them to go back and forth, would have allowed much better discrimination. Maybe the two types of grinder differed but his experiment was too noisy to detect this.
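Here is a rough sketch of how the sticker-and-shuffle idea from item 1 could be combined with side-by-side tasting and a simple sign test. Everything in it is my own illustration, not Brown's procedure: the function names, the "burr" and "blade" labels, and the 9-out-of-10 result are all made up for the example.

```python
import random
from math import comb

def blinded_pair(rng):
    """Return the two brews in a random left-to-right order.

    The labels stand in for stickers under identical cups; the taster
    sees only positions, and the order is revealed after choosing.
    """
    cups = ["burr", "blade"]  # hypothetical labels
    rng.shuffle(cups)
    return cups

def sign_test_p(wins, trials):
    """Two-sided probability of a split at least this lopsided
    if each cup were equally likely to be preferred."""
    k = max(wins, trials - wins)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)

# Ten side-by-side tastings, with the cup order drawn in advance.
rng = random.Random()
orders = [blinded_pair(rng) for _ in range(10)]

# Suppose (hypothetically) the burr cup won 9 of the 10 tastings.
print(round(sign_test_p(9, 10), 3))  # 0.021
```

With an even 5-5 split the same function returns 1.0, that is, no evidence of a difference.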

In a footnote he wrote:

Ideally, I would have liked to use better control conditions [he appears to realize that there was something wrong with his control experiment — SR], larger sample sizes, more thorough subject randomization [I have no idea what this means; his designs are within-subject. In within-subject experiments, subjects are not randomized — SR], and a more consistent testing environment.

All of these changes would have made his experiments more difficult. Maybe he has internalized the rule “harder is better.”

The beginning of wisdom about science is roughly the opposite: do the simplest, easiest thing that will tell you something. We always know less than we think, so make as few assumptions and as little investment as possible. The easier your experiment, the less you will lose if you make a wrong assumption. The smaller your sample size, the more resources (time, money, subjects, energy) you will have left over for other experiments. Brown’s experiments would have been easier if he had studied himself. By studying others, he made an untested assumption that they resembled him.

I’ve done dozens of tea experiments in which I compared tea brewed two different ways. The main things I’ve learned, besides best brew times and best amounts of tea to use, are: 1. Rinse tea before brewing. It eliminates a kind of dirty taste. 2. Combine chocolate tea and black tea. The combination is better than either alone. 3. A little bit of salt helps.

Progress in Psychiatry and Psychotherapy: The Half-Full Glass

Here is an excellent introduction to cognitive-behavioral therapy (CBT) for depression, centering on a Stanford psychiatrist named David Burns. I was especially interested in this:

[Burns] currently draws from at least 15 schools of therapy, calling his methodology TEAM—for testing, empathy, agenda setting and methods. . . . Testing means requiring that patients complete a short mood survey before and after each therapy session. In Chicago, Burns asks how many of the therapists [in the audience] do this. Only three [out of 100] raise their hands. Then how can they know if their patients are making progress? Burns asks. How would they feel if their own doctors didn’t take their blood pressure during each check-up?

Burns says that in the 1970s at Penn [where he learned about CBT], “They didn’t measure because there was no expectation that there would be a significant change in a single session or even over a course of months.” Forty years later, it’s shocking that so little attention is paid to measuring whether therapy makes a difference. . . “Therapists falsely believe that their impression or gut instinct about what the patient is feeling is accurate,” says May [a Stanford-educated Bay Area psychiatrist], when in fact their accuracy is very low.

When I was a graduate student, I started measuring my acne. One day I told my dermatologist what I’d found. “Why did you do that?” he asked. He really didn’t know. Many years later, an influential psychiatrist — Burns, whose Feeling Good book, a popularization of CBT, has sold millions of copies — tells therapists to give patients a mood survey. That’s progress.

But it is also a testament to the backward thinking of doctors and therapists that Burns didn’t tell his audience:

–have patients fill out a mood survey every day
–graph the results

Even more advanced:

–use the mood scores to measure the effects of different treatments

Three cheap safe things. It is obvious they would help patients. Apparently Burns doesn’t do these things with his own patients, even though his own therapy (TEAM) stresses “testing” and “methods”. It’s 2013. Not only do psychiatrists and therapists not do these things, they don’t even think of doing them. I seem to be the first to suggest them.
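To show how little is involved, here is a minimal sketch of all three steps: a daily log, a crude text graph, and a comparison of average mood under different treatments. The scores, the 0-100 scale, and the "morning walk" treatment are invented for illustration; they are not from Burns or from anyone's data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily log: (day, treatment, mood score on a 0-100 scale).
log = [
    (1, "baseline", 40), (2, "baseline", 45), (3, "baseline", 35),
    (4, "morning walk", 55), (5, "morning walk", 60), (6, "morning walk", 50),
]

def mean_by_treatment(entries):
    """Average mood score for each treatment label."""
    scores = defaultdict(list)
    for _day, treatment, mood in entries:
        scores[treatment].append(mood)
    return {t: mean(v) for t, v in scores.items()}

def text_graph(entries, width=20, top=100):
    """One line per day, with a bar whose length is proportional to mood."""
    lines = []
    for day, treatment, mood in entries:
        bar = "#" * round(mood * width / top)
        lines.append(f"day {day:>2} {bar} {mood} ({treatment})")
    return "\n".join(lines)

print(text_graph(log))
print(mean_by_treatment(log))
```

A real comparison would need more days per treatment and some thought about carry-over from one treatment to the next, but the bookkeeping is no harder than this.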

Thanks to Alex Chernavsky.

“Trying to Confuse You”: Pluses and Minuses of the Professorial Value System

A Chinese friend of mine is a chemistry major. In one of her classes, the textbook was so hard to understand she said the authors are “trying to confuse you.” They use difficult words, for example. A Berkeley art history major told me much the same thing. In her reading assignments, she said, the writers couldn’t write a sentence without a few big words. They were trying to impress readers, she believed.

Yes, professors write badly — in these two cases, the writing seemed actively bad. Thorstein Veblen wrote a whole book about showing off (The Theory of the Leisure Class). One chapter was about professors. They show off, said Veblen, by doing research with no practical application and by writing obscurely. Obscure writing is showing off because, like useless research, it shows you don’t have to care what other people think (“it carries a pointed suggestion of the industrial exemption of the speaker”).

Veblen said little about the costs and benefits of the behavior he described, beyond calling it wasteful. I say the opposite — not wasteful at all. When, long ago, people bought “useless” (“deadweight loss”) gifts or “useless” hood ornaments or decorated buildings with “useless” ornamentation or performed “useless” rituals and ceremonies that require special products (e.g., special clothes), they subsidized skilled artisans. For a long time, that was incredibly important. Research by skilled artisans led to better tools, the creation of metals, and so on. Helping those artisans make a living supported (increased) research in material science. Pushing people toward “useless” research was valuable because it diversified the research being done — there are many ways to be useless, just as you can misspell a word more ways than you can spell it correctly. The most important discoveries, such as electricity, would not have been made if everyone tried to do research with obvious application. Allowing professors to use big words and write badly is a small price to pay for the valuable “useless” research they perform.

There is an unrecognized problem with this, however. If you get one group of people to do “useless” research by turning things upside down so that useless is seen as better than useful (professors value “pure” research over “applied” research), it becomes very hard for them to do useful research. For a long time, practically all important research was material science research — how to control the material world. When something useful was discovered via “useless” research, the knowledge could be transferred to everyone else, who had normal values (useful is better than useless). Everyone else went on to use the knowledge in profitable ways — to make better knives, for example. This system (the results of “useless” research are used by other people to make a profit) gave us the world we live in, a world of wonderful products. The products on offer are staggering in their diversity, low cost, and general excellence. The hard drive on my laptop, the clothes I wear, for example.

Against this brilliant control of materials we can put our amazing lack of control of our bodies. A large fraction of Americans sleep poorly. Nothing (such as street noise) is making them sleep badly; they just don’t know how to sleep well. Depression is a huge problem, obesity is a huge problem (in America), and so on. It isn’t just ordinary people. Sleep experts don’t know how to improve sleep, weight control experts don’t know how to lose weight, psychiatrists don’t know how to prevent depression, and so on. Closely related to this is our health care system. It is dominated by doctors, who often use a peculiar and self-serving reasoning I call doctor logic. When I was a graduate student, my dermatologist was surprised when I measured my acne to see if the treatments he prescribed actually worked. It was a new idea to him. An influential Stanford psychiatrist named David Burns, whose famous book has sold millions of copies, has not yet figured out it would be a good idea to measure daily the mood of his patients. (Other psychiatrists are even worse.)

Why are we so smart about materials and so stupid about health, which is far more important? I think it is because the whole system evolved to push our economy forward via advances in material science. For hundreds of thousands of years, that is where improvement was possible: better stuff, such as better tools. The same “habits of mind” (as Veblen would say) and research system have managed to produce plenty of “useless” knowledge outside of material science. This knowledge can be translated into useful discoveries, as I have done (new ways to sleep better, lose weight, be in a better mood, and so on), but these discoveries don’t lead to products, at least not in obvious ways. Control of our bodies is quite different from making something physical. My first interesting self-experimental discovery was that eating breakfast made my sleep worse. That’s very useful, but not at all profitable: there is no obvious associated product. For professors, a problem with my discovery is that it’s useful. (Another problem is that it’s small.) For everyone else, a problem is that it isn’t profitable. The system that worked so well for material science breaks down when it comes to health science.

Yet the fact that you are reading this suggests, at least to me, that a big change is coming.

Hobbyist Science vs. Professional Science vs. Personal Science

In a TED talk, Paula Scher, a graphic designer, told how a hobby of painting maps turned into something like a job.

I was up in my country house, and for some reason, I began painting these very big, very involved, laborious, complicated maps . . . They would take me about six months initially, but then I started getting faster at it. Here’s the United States. Every single city of the United States is on here. . . . One of my favorites was this painting I did of Florida after the 2000 election that has the election results rolling around in the water. . . . Somebody . . . saw the paintings and recommended them to a gallery, and I had a first show about two-and-a-half years ago, and I showed these paintings that I’m showing you now. . . . They sold quickly, and became rather popular. . . . The gallery wanted me to have another show in two years, which meant that I really had to paint these paintings much faster than I had ever done them. . . . I was no longer at play. I was actually in this solemn landscape of fulfilling an expectation for a show, which is not where I started.

A hobby turned into a job. This has happened countless times — I believe all jobs started as hobbies.

One hobby that turned into a job is science. The first scientists were hobbyists — for example, Darwin and Mendel. The success of hobbyist scientists led to the creation of full-time jobs that included doing science — professors of science at universities. When science became a job, something was gained (professionals had more time per day, money, training, institutional support, collegial support, and prestige than hobbyists) and something was lost (professionals had less freedom than hobbyists). Professionals could do many things hobbyists could not, but the reverse was also true: hobbyists could do many things professionals could not. For example, they could work on a question for ten years without publishing anything (Mendel, Darwin) and entertain highly heretical ideas (Darwin). Professionals needed steady output and dared not offend, for fear of losing their job.

My personal science (personal science = using science to help yourself) is another step in this history. I combined the freedom of hobbyists with the knowledge, skills and resources of professionals. I can do whatever self-experiments I want and test whatever ideas I want. Yet I also have professional levels of training, knowledge, skill, and (to some extent) equipment provided by my job as a psychology professor, Berkeley library access, the Internet, free software, and cheap computers. To these two elements — the freedom of hobbyists, the resources of professionals — my personal science added a third element not found in hobbyist or professional science: the motivation of a person with a problem. I wanted better health. My personal science helped me get it. In the beginning, I wanted to sleep better, lose weight, have less acne, and be in a better mood. Later, I discovered new ways to improve my brain function and blood sugar. Just combining the freedom of hobbyists with the resources of professionals, personal science would probably be a big improvement. Adding better motivation suggests that personal science is even more likely to improve our lives by learning what professional scientists haven’t learned. The combination of professional science and personal science will be far more powerful (= more useful) than professional science alone.

I’ve seen this in my own life, over and over, and I predict it will eventually be true for everyone. Learning how to control one’s own health — how to sleep well, for example — is non-trivial knowledge.

Rewarding Criticism Put Nicely Produced Long-Lasting Change

Eliezer Yudkowsky, I’m told, used to be a not-nice critic. The problem was his delivery: “blunt, harsh, not sufficiently tempered by praise for praiseworthy things” (Alicorn Finley). However, this changed about a year ago, when Anna Salamon and Alicorn Finley decided to try to train him to be nicer. Alicorn describes it like this:

Me, Eliezer, Anna, and Michael Blume were all sitting in my and Michael’s room (where we lived two houses ago) working on, I think it was, a rationality kata [= way of doing things], and we were producing examples and critiquing each other. Eliezer sometimes critiqued in a motivation-draining way, so we started offering him M&Ms when he put things more nicely. (We also claimed M&Ms when we accomplished small increments of what we were working on.)

Eliezer added:

Some updates on that story. M&M’s didn’t work when I tried to reward myself with them later, and I suspect several key points:

1) The smiles/approval from the (highly respected) friends feeding me the M&Ms probably counted for more than the taste sensation.

2) Being overweight, M&Ms on their own would be associated with shame/guilt/horror/wishing I never had to eat again etc.

3) Others have also reported food rewards not working. One person says that food rewards worked for them after they ensured that they were hungry and could only eat via food rewards.

4) I suspect that the basic reinforcement pattern will only work for me if I reward above-average performance or improvement in performance (positive slope) rather than trying to reward constant performance, because only this makes me feel that the reward is really ‘deserved’.

Also:

  • Andrew Critch advises that ‘step zero’ in this process is to make sure that you have good internal agreement on wanting the change before rewarding movements in the direction of the change.
  • The Center for Applied Rationality (CFAR) has some experience learning to teach this.
  • CFAR has excellent workshops but not much published/online material. A good mainstream book is Don’t Shoot the Dog by Karen Pryor.

I like this example because the change was long-lasting and important.

Assorted Links

Thanks to Alex Chernavsky.

Organic Pollutants Associated With Diabetes

Everyone knows that diabetes is associated with obesity, probably because obesity causes diabetes. However, thin people also become diabetic. A clue to why is provided by the correlation between diabetes and what are called “persistent organic pollutants” (POPs). POPs are man-made organic compounds, such as certain pesticides and the polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans produced as industrial byproducts.

A 2006 study using NHANES (National Health and Nutrition Examination Survey 1999–2002) data found very strong associations between levels of these chemicals and diabetes. For example, a risk ratio of 30. These associations persisted even when the data was stratified in all sorts of ways. The scariest result came from people who had BMI < 25. Looking only at such people, those above the 90th percentile for amount of POPs had 16 times the risk of diabetes as those below the 25th percentile. Here is something associated with thin people getting diabetes.
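For readers unsure what “16 times the risk” means arithmetically, here is a minimal sketch of a risk ratio computed from two groups. The counts are invented to give a ratio of 16; they are not the NHANES numbers, and the function name is my own.

```python
def risk_ratio(cases_high, n_high, cases_low, n_low):
    """Risk in the high-exposure group divided by risk in the low-exposure group.

    Written as a cross-product so whole-number counts give an exact result.
    """
    return (cases_high * n_low) / (cases_low * n_high)

# Hypothetical counts: 32 of 200 high-POP people have diabetes (16%),
# versus 2 of 200 low-POP people (1%).
print(risk_ratio(32, 200, 2, 200))  # 16.0
```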

Does the association exist because POPs cause diabetes? You might argue that POP exposure is correlated with poverty (poor people are more exposed), poor people exercise less than rich people, and lack of exercise causes diabetes. However, Agent Orange exposure among soldiers is associated with diabetes. That is unlikely to be due to confounding with poverty or lack of exercise.

Everyone has these chemicals in their body, but almost no one knows how much. I don’t know if I’m in the 10th percentile or the 90th percentile. If I’m in the 90th percentile, what can I do about it? A good place for self-measurement and tracking.

Assorted Links

  • The increasing popularity of kvas. “We ferment with ginger and, I believe, longer than other people – for seven to 10 days.”
  • Giving up wine (and other alcohol) for a month. Before this he drank 2 glasses of wine/day.
  • Wellness Mart (in California) makes it easy to get basic medical tests. “In California, you are required to have an order from a doctor for blood tests, but WellnessMart, MD stores all have medical doctors on staff. Our doctors allow their license to be used for basic screening tests because there are some things that really shouldn’t be that difficult to find out. If you don’t have a doctor’s order and you want to run tests that aren’t a part of our standard screening packages, you will be charged a MD Consultation Fee of $25. Our doctor will help you to put together a panel that will accomplish the goals you are looking to accomplish. If the doctor determines that it is not appropriate for you to run the tests you want to run at WellnessMart, MD there will be no charges.”
  • Riding a bike while learning Polish. It helps.

Thanks to Casey Manion and Adam Clemens.

Was Sisyphus in Hell . . . or Heaven?

In third grade, I learned that Sisyphus was condemned to an eternity of pushing a rock up a hill: the rock rolls back down, he pushes it up again, and so on. Why the Greeks told this story I had no idea, and still don’t.

I am now moving — from an apartment in the basement of a house to an apartment on the top floor of the same house. I’ve discovered that in small amounts this is enjoyable. I enjoy carrying stuff up a bunch of stairs. I could do it an hour per day forever — like Sisyphus, except with time off.

Here is the downside of the occupational specialization that distinguishes humans from other species. I don’t need to haul stuff upstairs one hour per day. People move stuff for a living. Instead I walk uphill on my treadmill, an imitation activity that does nothing for my upper body. I could move heavy stuff around my apartment, but that’s boring. The situation reminds me of the way Japanese schoolchildren clean up their school every day. In small amounts, cleaning is fun. Whoever runs Japanese schools has figured this out and used this fact to everyone’s benefit. Blogging is another example. In small amounts, writing and being read is fun. The communication this enables helps everyone. When writing becomes a job, a lot is lost: much less diversity of points of view. Those who write for a living are afraid of losing their jobs, reducing even further what can be said.

This blog is all about the fact that science is still another example. In small amounts, doing science is fun, especially when it has practical benefit (e.g., sleep better). Professional scientists have their place, just as professional movers, janitors, and writers have their place. But people who do science purely for their own ends — just as I move stuff upstairs purely for myself — have their place too. I am not as strong as a professional mover but I make up for it in dozens of ways. Personal scientists don’t have the resources (e.g., expensive equipment) of professional scientists, but they make up for it in dozens of ways. Without them, the diversity of ideas that are taken seriously (e.g., tested) goes way down.

Assorted Links

  • Open Source Malaria
  • Criticism of Malcolm Gladwell by The Korean, Gladwell’s persuasive rebuttal, more from The Korean, more from Gladwell. I thought the work under discussion (“ethnic theory of plane crashes”) was the best part of Outliers. Gladwell summarizes it: “That chapter in Outliers is about a series of extraordinary steps taken by Korean Air, in which an institution on the brink of collapse and disgrace turned themselves into one of the best airlines in the world. They did so by bravely confronting the fact that a legacy of their cultural heritage was frustrating open communication in the cockpit. That is not a slight on Korean culture, or any other high-power distance culture for that matter.”
  • More praise for the new TV show Naked and Afraid on the Discovery Channel. It really is riveting.
  • Ziploc omelette. Poor man’s sous vide.

Thanks to Nicole Harkin.