{self-experimentation, Internet, . . .}

For the non-set-theorists, I’m using braces to express set membership:

pets = {cats, dogs, . . . }.

A week ago self-experimentation and the Internet struck me as wildly different. Self-experimentation is a tiny method of inquiry. The Internet is a gigantic physical network. Self-experimentation: one person alone. The Internet: everyone together.

But then I read this fresh essay by Rishab Aiyer Ghosh, managing editor of the on-line journal First Monday. Thanks to Ghosh, I now see two similarities between self-experimentation and the Internet:

1. Both are growth media. They encourage things to grow. Self-experimentation helps develop new ideas about health. The Shangri-La Diet is an example; so are my ideas about mood. The most influential example is diabetes self-monitoring, which grew from self-experimentation by Richard Bernstein. The Internet, of course, has helped many things grow, especially new businesses (eBay, Google), new forms of interpersonal communication (blogs, forums, chat rooms, MySpace), and new forms of collaborative work (Wikipedia, open-source software).

2. They encourage the growth of similar things. Self-experimentation doesn’t equally encourage all ideas about health; it especially encourages very low-cost ones. My self-experimentation led me to realize the benefits of skipping breakfast (improves sleep), seeing faces in the morning (improves mood), and standing a lot (improves sleep). The Shangri-La Diet costs almost nothing — less than nothing if you count the money saved on food. Ghosh points out that the Internet has especially encouraged the rise of businesses where the basic transaction does not involve money. Stuff is “given away” (that is, no money changes hands); payment is in terms of reputation. Both self-experimentation and the Internet are focusing intellectual attention on how people lived and thrived many thousands of years ago.

Life is Complicated

Yesterday morning I listened to Ira Glass. Yesterday evening I listened to Bill McKibben. And I reflected:

1. Bill McKibben wrote a whole book, The Age of Missing Information (1992), about the malign influence of TV. He spent a year watching a single day’s output of the 100-odd channels of one cable company. TV makes people self-centered, he decided.

2. Ira Glass said we are living in a Golden Age of Television and listed a handful of current shows — including The Wire, The Daily Show, Colbert, Friday Night Lights, Project Runway, Entourage, House, and “anything with Ricky Gervais” — in support of his claim. He has just spent a year starting a TV version of This American Life.

3. Bill McKibben wrote an article (in The Nation) praising This American Life to the skies.

I think of McKibben and Glass as the two Boy Geniuses of American intellectual life. (Curiously I cannot think of any Girl Geniuses.) Both of them did great work while really young. When McKibben was in his twenties, he wrote a long series of editorials in The New Yorker that were inspiring. (They were unsigned. I found out who wrote them by writing to the magazine.) His first book, The End of Nature (1989), about global warming, was prophetic. I think it was the very first general-audience book on the subject. As for Glass, This American Life was terrific right from the start, twelve years ago. He was 36 when it started.

At the Berkeley Farmers’ Market

Yesterday I went to the Berkeley Farmers’ Market and had a very interesting conversation with one of the vendors.

1. Whole Foods had called her and asked her if she would like to put her product in their stores. No thanks, she said. “Are you kidding?” they said. No, she said. She didn’t want to put her product in their stores because she didn’t want that sort of volume. She was more interested in supporting smaller stores. She told me that Whole Foods’ increased interest in local vendors had come about because of Michael Pollan’s criticism of Whole Foods in The Omnivore’s Dilemma. (Pollan had coined the term Big Organic and wondered which side, the more virtuous or the less virtuous, Whole Foods was on.)

2. The vendor next to her, The Fatted Calf, which sells salami, beef jerky, sausage, duck confit, and other meat products, had been forced to stop selling to stores and restaurants when someone called the USDA to complain that they didn’t have an office for the USDA inspector. That’s right: no matter how small your business, you must have an office for the USDA inspector. It’s an absurd burden to put on a small business. As I have heard others say, big businesses welcome government regulation because they can afford it and their potential future competitors, now tiny, cannot. Supposedly the regulation protects consumers; it may or may not but it certainly protects big businesses. (Does requiring an office for a USDA inspector protect consumers? I think not.) We need organic consumer protection. The current version is like heavy-duty insecticide. It kills small businesses.

Science in Action: Omega-3 (old data re-analysed)

A few months ago I did a little experiment to test my belief that omega-3 was affecting my balance. I replaced fats high in omega-3 (flaxseed oil and walnut oil) with a fat low in omega-3 (sesame oil). Here is a new analysis of the data:

walnut oil and flaxseed oil versus sesame oil

The raw data are the same. The new analysis differs from the earlier analysis in two ways:

1. How the number for each day is computed. The old analysis dropped the first 5 trials and took the mean of the rest. The new analysis fits a regression line to balance as a function of trial, subtracts the estimated effect of trial, and then takes the mean of all the trials.

2. Allowance for improvement. The new analysis, as the graph shows, fits a slope to all the data. The improvement over days is subtracted from each day’s score before the two conditions are compared.
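The two steps of the new analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the graph: the function names are made up, the within-session line is fit separately for each session, and each trial is adjusted to the level of the final trial (the description above does not say which reference point was used).

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def daily_score(trial_values):
    """Step 1: remove the fitted effect of trial, then average ALL trials.

    Assumption: every trial is adjusted to where it would sit at the
    final trial of the session, so no trials need to be dropped.
    """
    trials = list(range(1, len(trial_values) + 1))
    _, slope = fit_line(trials, trial_values)
    last = trials[-1]
    adjusted = [y - slope * (t - last) for t, y in zip(trials, trial_values)]
    return mean(adjusted)


def remove_day_trend(days, scores):
    """Step 2: fit one slope over days and subtract it from each day's score."""
    _, slope = fit_line(days, scores)
    return [s - slope * d for d, s in zip(days, scores)]
```

After these two steps, the detrended daily scores for the two conditions can be compared with an ordinary t test.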

The old analysis gave t = 4.1 (p = very tiny). The new one gives t = 6.3 (p = very very tiny). Big improvement!

Directory of my omega-3 posts.

Science in Action: Omega-3 (what’s the best dose?)

With a better understanding of how to measure balance, I looked again at my data about the effects of flaxseed oil. Here is a new, improved comparison of 2 tablespoons/day and 3 tablespoons/day:

2 vs 3 tablespoons/day

Very clear difference: one-tailed p = .004.

Here is a messy comparison between 3 and 4 tablespoons/day:

3 vs 4 tablespoons/day

I compared 3 tablespoons/day at 2 different times with 4 tablespoons/day divided between those 2 times. I didn’t want to take 4 tablespoons at one time and I wanted to have at least 2 tablespoons in the evening because of the sleep benefits. The graph shows that 4 tablespoons/day has about the same effect as 3.

The big picture: Earlier data convinced me there is probably an effect. Before doing more subtle, convincing, publishable experiments, I have been trying to make the effect as large as possible. For two reasons: 1. To make the effect as clear as possible. 2. To have the most beneficial possible baseline (a baseline to which I will return many times). I foresee doing an experimental design like this: baseline (n days), something else 1 (n days), baseline (n days), something else 2 (n days), baseline (n days), something else 3, and so on. During those many baseline days I want the effect to be as strong as possible.

Science in Action: Omega-3 (measurement improvement)

I’ve learned a few things. As some of you may know, I’ve been measuring my balance by standing on a board that is balanced on a tiny platform (a pipe plug) — pictures here. Now and then the board would slip off the platform. I supposed this was a failure of balance but I wasn’t sure, especially if it happened as soon as I stood on it. So I got another board into which my brother-in-law kindly drilled the perfect-size hole so that the plug will never slip:

New board (with hole for plug)

To see if this made a difference I did an experiment with a design I have never used before but that I really like: ABABABAB… (one day per condition). In other words, Monday I tested my balance with the old board, Tuesday with the new board, Wednesday with the old board, Thursday with the new board, etc. Simple, efficient, well-balanced. Here are the results:

new board vs. old board

The red line is fit to the red points, the blue line to the blue points. The two lines are constrained to have the same slope.
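For readers who want to see what "two lines constrained to have the same slope" means in practice, here is a minimal sketch. It assumes ordinary least squares with one intercept per condition and a single shared slope; the function name and the numbers are made up for illustration and are not my data.

```python
from statistics import mean


def fit_parallel_lines(groups):
    """Fit y = intercept[condition] + slope * x with one shared slope.

    `groups` maps a condition name to a list of (day, score) pairs.
    Returns (slope, {condition: intercept}).
    """
    sxx = 0.0
    sxy = 0.0
    centers = {}
    for name, points in groups.items():
        x_bar = mean([x for x, _ in points])
        y_bar = mean([y for _, y in points])
        centers[name] = (x_bar, y_bar)
        # Pool the within-condition sums of squares and cross-products.
        sxx += sum((x - x_bar) ** 2 for x, _ in points)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercepts = {
        name: y_bar - slope * x_bar for name, (x_bar, y_bar) in centers.items()
    }
    return slope, intercepts


# Hypothetical ABAB... data: old board on odd days, new board on even days.
data = {
    "old": [(1, 10.0), (3, 12.0), (5, 14.0)],
    "new": [(2, 8.0), (4, 10.0), (6, 12.0)],
}
slope, intercepts = fit_parallel_lines(data)
board_effect = intercepts["new"] - intercepts["old"]
```

The vertical gap between the two lines (`board_effect`) is the estimated difference between boards after allowing for improvement over days.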

Well, that’s clear. I expected my balance to be better with the new board, actually.

Speaking of the unexpected, I made another measurement improvement that truly surprises me — the surprise is that I never did it before. When I looked at my early balance data (the first 10 or so days of data) I saw that my balance improved for the first 5 trials and was roughly constant after that. Each session was 20 trials so I dropped (excluded) the first 5 trials from my analyses — considering them “warm-up” trials. I took the mean of the last 15 trials. That seemed very reasonable and I thought nothing of it.

Recently I asked again how performance changes over a session. The answer was a bit different: I found that performance improved for the first 10 trials. Now there are 30 trials in a session, so dropping the first 10 of them seemed okay. And that’s what I did.

But then I looked at how variability changed over a session. I expected the earliest trials to be more variable than the rest but the data didn’t show that. Variability was pretty constant from the first trials to the last. Hmm. Maybe I am losing valuable information by not including those early trials in my averages. It occurred to me: why not allow for the warm-up effect by modelling it, rather than by excluding it? (Modelling it meaning estimating it and then subtracting it.) I did that, and then I looked at the size of the standard errors of the means (standard errors based on the residuals from the fit) for the most recent 40 days. These are, essentially, the error in measurement. Here is what I found. Median standard errors:

First 10 trials (out of 30) excluded: 0.073
First 5 trials excluded: 0.064
First trial excluded: 0.061
No trials excluded: 0.059

My eyes opened wide when I saw these numbers. Oh my god! I was throwing away so much! A reduction in error from 0.073 to 0.059: that’s nearly 20% better.
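A rough sketch of how a standard error "based on the residuals from the fit" might be computed for one session, followed by the arithmetic on the medians listed above. The straight-line warm-up model, the n - 2 degrees of freedom, and the function name are assumptions made for this illustration; the actual warm-up model may have been different.

```python
from math import sqrt
from statistics import mean


def se_of_session_mean(scores, skip=0):
    """Standard error of a session mean after removing a straight-line trend.

    Assumptions: the warm-up is modelled as a straight line in trial
    number, and the residual variance uses n - 2 degrees of freedom.
    `skip` is the number of initial trials to exclude.
    """
    ys = scores[skip:]
    xs = list(range(skip + 1, len(scores) + 1))
    n = len(ys)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    residuals = [y - (y_bar + slope * (x - x_bar)) for x, y in zip(xs, ys)]
    residual_sd = sqrt(sum(r * r for r in residuals) / (n - 2))
    return residual_sd / sqrt(n)


# Arithmetic on the median standard errors reported above.
reduction = (0.073 - 0.059) / 0.073        # fractional drop in standard error
trials_equivalent = (0.073 / 0.059) ** 2   # relative amount of data needed
```

Here `reduction` comes out just over 0.19. If trials are roughly independent, standard error shrinks with the square root of the number of trials, so `trials_equivalent` (about 1.5) suggests that excluding the first 10 trials was costing the equivalent of about half again as many trials.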

The Berkeley School Lunch Program: Correction

After I mentioned that the Berkeley lunch program was in poor shape, Ann Cooper, the chef in charge, invited me to visit — to set the record straight. It was quite an opportunity; the Berkeley lunch program, some hope, will become a model for the whole country. This is why there was a long New Yorker article recently about what Cooper is doing.

Chef Ann Cooper

Spending about $1/day more per student, Cooper has shifted the lunch menu far away from the heavily-processed and factory-made food of most school lunches. Far more of the food is cooked in the district kitchens, albeit days in advance in some cases. I took Cooper’s word for it that the students actually eat the new food. This is a great improvement, in my opinion. The big questions are whether these changes are sustainable and what effect they will have.

The single best thing you can do for your health is to eat healthy food (the exact nature of which has yet to be determined, but you get the idea). Obesity, diabetes, heart disease, cancer, stroke — all the big American health problems are made much worse by the crummy diet of most Americans. Will Cooper’s improved lunches cause her lucky diners to eat better as adults? If so, $1/day is a great bargain compared to health care costs. (She estimated these changes will cost $2/day across the country.) Will Cooper’s improvements reduce obesity and diabetes? That is obviously the hope.

I wouldn’t say the Berkeley school lunch program is in trouble or in poor shape; I would say it is in limbo. Four things are big question marks:

1. Cooper seemed to be working very hard and not quite enjoying it. Even after a year on the job. This is not a good sign. Her salary is being paid by the Chez Panisse Foundation — not a good sign. She struck me as incredibly dedicated but how much failure and frustration can she and the Chez Panisse Foundation bear? This sort of thing is often much harder than anyone imagines in the beginning.

2. Obesity is a big big issue. Whether the new food will help is unknown. Cooper seems to take it on faith that her food will be less fattening. I am less sure. As anyone who has read The Shangri-La Diet knows, I believe that American food became really fattening not because it was processed or “unhealthy” but because of the increasing popularity of foods that tasted exactly the same each time (e.g., microwave entrees). If she cooks the same recipes again and again, the hoped-for weight loss may not happen. If it doesn’t, will the program continue? Or will $1/day be seen as better spent on something that hasn’t yet failed, such as more physical ed?

3. The effects of Cooper’s changes are going to be measured by UC Berkeley School of Public Health researchers. As far as I could make out, the comparison will be between Berkeley students and students in another school district. You have n = 1 (1 school district) in the experimental group and n = 1 (1 school district) in the control group. This is better than nothing but, given the importance of the question — can better school lunches improve health for the rest of a student’s life — and our great ignorance as to its answer, it is scary bad. It will be so easy to reach the wrong answer. Researchers with this sort of design often act as if they have hundreds of subjects in each group — each student is treated as a different and randomly-assigned subject. This isn’t just false, it’s misleading.

4. While I am sure the researchers can measure obesity, I am less sure they will do a decent job of measuring changes in attitudes toward food. It is not a typical public health question.
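To put a number on the worry in point 3, here is a small sketch using the standard design-effect formula for clustered samples, 1 + (m - 1) × ICC, where m is the cluster size. The intraclass correlation (ICC) and the student count are hypothetical values chosen only for illustration; nothing here comes from the actual study.

```python
def effective_sample_size(n_students, n_clusters, icc):
    """Effective sample size under the design effect for equal-sized clusters.

    icc is the intraclass correlation: the share of outcome variance that
    students in the same district have in common. The value used below is
    hypothetical.
    """
    cluster_size = n_students / n_clusters
    design_effect = 1 + (cluster_size - 1) * icc
    return n_students / design_effect


# Hypothetical: 3000 students in one group, all from a single district.
n_eff = effective_sample_size(3000, 1, 0.05)
```

With these made-up numbers, 3000 students in one district carry about as much information as 20 independent subjects. And with only one district per group, the district-to-district variation cannot be estimated from the data at all, which is why treating each student as a separate, randomly assigned subject is so misleading.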

Chef Ann Cooper at work

I am very optimistic about the future of food — and therefore health — in America, but it’s because of (a) the Food Network, (b) the growth of farmers’ markets, and (c) the success of Whole Foods and similar stores. Not to mention Rachael Ray. Americans are becoming food connoisseurs, starting to catch up with a large chunk of the rest of the world, such as China. The American increase in connoisseurship is trickle-down — from rich people to everyone else. Like cell phones, like TVs, like literacy, like many things. Whereas Ann Cooper is working in a school district that has lots of poor people. Not a good place to start this sort of revolution.

Addendum: This article in New York magazine reminded me that Ann Cooper’s previous job was at an expensive private school. So maybe it is another case of trickle-down after all.

Is Sugar Fattening?

In 1987, Dr. Israel Ramirez, a researcher at the Monell Chemical Senses Center, whose research led to the theory behind the Shangri-La Diet, questioned the prevailing assumption that sugar causes obesity in humans. Rat experiments did not support such a simple idea, he pointed out.

The most recent issue of the American Journal of Clinical Nutrition has a review article that agrees with Ramirez (but, alas, does not cite him). Now there is clinical evidence that Ramirez was right. From the abstract:

Numerous clinical studies have shown that sugar-containing liquids, when consumed in place of usual meals, can lead to a significant and sustained weight loss

Maybe the Shangri-La Diet isn’t so crazy.

How To Do Experiments That Generate Ideas

A few days ago a graduate student in economics asked me what I thought of behavioral economics. On the positive side, I said, some of the phenomena are impressive. For example, the endowment effect, which is so strong I would demonstrate it in class. On the negative side, none of the researchers use experiments to generate ideas. They don’t merely not do it; they seem unaware of the possibility of doing it. The graduate student wondered how it can be done. I said there were three main ways:

1. Do something extra. Do a little more than necessary so that your experiment tells you about something that isn’t the focus of interest. For example, vary a factor that you think is not important. This is Saul Sternberg’s idea. I did this in my peak-procedure experiments: measured how long rats held down the bar. This was irrelevant to the purpose of the experiments, which was to understand how rats measured time. These measurements greatly surprised me. For years, I misunderstood them. Eventually they led to a new line of research about the control of variability.

2. Measure a function, not a point. Ask how your treatment changes a whole function, not just this or that numerical measure. This is what I did in my peak procedure experiments: The experiments generated for every condition an entire function showing response rate as a function of time. I saw how treatments changed the entire function. This talk describes some of the new ideas this led to.

3. Make your experiment easy and fast. The easier and faster it is, the more you can do it in lots of variations. Our ignorance of behavior being great, some fraction of these are likely to generate unexpected, and therefore inspiring, results. This is one reason self-experimentation is good for generating ideas: It is easy and fast.

I am not aware of any other written answers to this question, strangely enough.