Methodological Lessons from Self-Experimentation (part 1 of 4)

On Tuesday (January 9) I am giving a talk about my self-experimentation to a group of interface designers who I hope will be interested in the broad methodological conclusions to be drawn from it. An audio file of the talk and the PowerPoint will be available but I think the most interesting stuff will be clearer and more accessible if I write it down. So here it is.

Usually we learn from our mistakes. This is the rare case where I learned from success: I expected my self-experimentation (to improve my sleep, to find effective ways to lose weight) to fail and was surprised and impressed when I was wrong. The seven lessons that follow (divided into four posts) are the broad conclusions I draw from what happened.

1. Do something. I started the long-term self-experimentation that led to my paper because I didn’t want to wake up too early for the rest of my life. I expected my little self-experiments to fail, and they did fail, but I didn’t realize that I would slowly learn from failure. I learned how to record my data, for instance, and how to analyze it. The effect of that learning was that my self-experimentation got better and better and after many years of failure I got somewhere. I think American culture teaches that success is good and failure bad, but the truth for scientists is that failure is good in the sense that you learn from your mistakes.

2. Keep doing something. I learned the value of drudgery. The research took many years. After my initial failures I continued not because I could see I was learning stuff — the learning was too slow to be perceptible — but for the same reason I started: I didn’t want to wake up early for the rest of my life. One of my students had been a classical musician. She said that her job had been athletic, not aesthetic. It involved great repetition of the same movements, like manual labor. Likewise, scientists often see science as something intellectually wonderful. I came to see it differently. Perhaps a question has one answer and there are 100 plausible alternatives. To find the answer you may just need to test each of the 100 possibilities. No way around it. That was roughly the position I was in trying to improve my sleep: There were many possibilities and no alternative to simply testing them one by one. (More complex experimental designs, such as factorial designs, were impractical.) There was nothing intellectually wonderful about it. “One thing nobody tells you about being a postdoc is that stuff that used to be fun for its own sake becomes tedious when you’ve done it hundreds or thousands of times,” blogged a postdoc.

Part 2 is here.

Note: You no longer need to register in order to comment.

Annals of Self-Experimentation: How to Fall Asleep Faster

Evan Dumas, our self-experimenter, does IT support in Portland, Oregon. He is 26 years old. As far back as he can remember, he has had trouble falling asleep. After he went to bed and turned off the light, it took an hour or more to fall asleep.

About a year ago, he tried a new solution: exercise just before bedtime. He had noticed that he fell asleep more quickly when he was tired (and of course exercise was tiring); and it was hard to exercise earlier in the day. He wondered whether the standard advice (don’t exercise close to bedtime) was true. (For example, “finish your exercise at least three hours before bedtime,” says the National Sleep Foundation.)

His exercise consisted of slow push-ups, crunch-style sit-ups, and some static yoga positions that use the side muscles and back muscles. He continued until he was tired. In the beginning this took about 10 minutes; now it takes about 20 minutes.

The very first night he tried this, he fell asleep within minutes. Same with later nights: After exercise, he fell asleep “instantaneously,” he says — by which he means within about 5 minutes. Any doubt it was cause and effect was removed by evenings when he omitted the exercise, just to see if it was necessary. Without exercise, it again took him more than an hour to fall asleep. He also noticed that the exercise caused him to sleep less and wake up feeling more rested.

A great discovery. Surely we need far fewer sleeping pills.

To repeat what I said earlier: If you are interested in doing any self-experimentation, feel free to contact me for help. Also, please let me know the results; I would like to publicize other people’s self-experiments in this blog.

Is Drinking Olive Oil Healthy?

In Cities and the Wealth of Nations, Jane Jacobs wrote about an isolated North Carolina hamlet that her aunt visited in 1923:

One of my aunt’s tasks there was to see to construction of a church. . . One of the farmers donated, as a site, a beautiful knoll beside the river and my aunt suggested the building be made of fine large stones which were already quarried, as it were, needing little dressing, there for the taking in the creek and river beds. No, said the community elders, it was a pretty idea but not possible. . . . Entire walls and buildings of stone would not be safe.

These people came of a parent culture that had not only reared stone parish churches from time immemorial, but great cathedrals.

Likewise, nutritional wisdom is forgotten. Drinking olive oil now seems absurd to some people. But it was practiced in at least one place in the not-so-distant past:

In a mountain village in Crete, [Ancel] Keys saw old farmers working in the field who drank only a glass of olive oil for breakfast; he later verified that one of them was 106 years old.

From Todd Tucker, The Great Starvation Experiment, p. 204. There is a whole organization (Oldways) devoted to preserving ancient foodways and using them for nutritional guidance. The best practitioner of this approach has been Dr. Weston Price, a dentist, whose work is nicely summarized here. Dr. Price traveled the world looking for economically-primitive societies (“native peoples”) with ancient eating habits and excellent health. Their diets, especially the common elements, would suggest what a healthy diet must have.

Two of Dr. Price’s conclusions are relevant to the Shangri-La Diet:

1. “All native peoples studied made great efforts to obtain seafood.” This supports my comments about the importance of omega-3 fats, found much more in seafood than in other foods.

2. “The last major feature of native diets that Price found was that they were rich in fat, especially animal fat.” The animal fat in native diets would be high in omega-3 because the animals were eating grasses and other plants, not corn.

When I wrote my long paper on self-experimentation I divided it into two parts: one titled “Stone-Age Life Suits Us” (the common thread of the five examples), the other about weight control (the research behind SLD). The two parts struck me as quite different. Drinking sugar water to lose weight was definitely not a return to a Stone-Age lifestyle. But the big improvements in SLD since I wrote that paper — from sugar water to ELOO, and from ELOO to oils high in omega-3 — brought SLD much closer to the Stone-Age-Life-Suits-Us theme, I now see.

Going Flavorless

Gary Skaleski, the Wisconsin counselor who came up with nose-clipping (= eating food with your nose closed, especially with a swimmer’s nose clip), has tried eating all his food that way:

The last time I wrote to you I had started gaining again and not following the SLD as I should have been (off and on). However, since then I have been eating everything, all day long, without tasting anything (even coffee, diet soda)-avoiding [flavor] completely, but eating well. After a couple of days, the appetite suppression came back with a vengeance and am losing again.

What was the most interesting was the difficulty I had starting this, and the sense of loss/regret and avoidance I had to doing it, and not being able to [smell] anything. While I recommended this procedure for others, I avoided it myself. But now I am on day 3 of [flavorlessness] and am doing well. . . . Interesting new needs come up-need for something crunchy, something smooth tasting, etc. . . . does help one focus on the feeling of different foods while eating, as well as becoming more sensitive to real hunger feelings (amazed at how much taste runs one’s eating).

He believes, as do I, that this may be useful in extreme cases. Let’s compare gastric bypass surgery (GBS) and eating like this (NC, for nose-clipping) on several dimensions. Dangerous? GBS: very. NC: no. Reversible? GBS: no. NC: yes. Adjustable? GBS: no. NC: very. You can do it every other day, for example. You can nose-clip some foods but not others. Cost? GBS: $20,000 or more. NC: $5 (swimmer’s nose clips).

Science in Action: Procrastination (results)

It worked. This became:

[Photo: my kitchen table a little later]

The clearing took about 40 minutes of work and three games of Sudoku. Now to test the broken-windows theory of neatness, which says that things stay decent (say, a few items on a table) so long as the disorder stays below a certain threshold. Below that threshold, a natural tendency keeps things neat. Above that threshold, it malfunctions.

Science in Action: Procrastination

A month ago I had lunch with Greg Niemeyer, a professor of art at UC Berkeley whose medium is games. His games have appeared in art galleries all over the world. He asked me if games had been studied by psychologists and pointed out some of their psychological properties — the power to make you concentrate for a long time, for example.

This was fascinating. He was so right — games are powerful in several ways. I wondered how that power could be (a) studied and (b) used. My first question was whether games could be a stimulant, like caffeine. I emailed Greg about this; he suggested I try Bejeweled and Sudoku. But I found them tiring — they require concentration. My next idea was that maybe I could use games as a reward. I used to enjoy Tetris and Freecell. If I do X (something I wouldn’t otherwise do), then I get to play a game. This contingency causes me to do X. There are dozens of rewards you could use this way (listening to music, eating a piece of chocolate, etc.); the advantages of games include their number and variety, the care put into them, the lack of satiation (you can play the game many times and it remains pleasant), their harmlessness (if I avoided getting addicted), their low cost, the ready supply (you can play a computer game whenever you have a computer), and the short duration of some of them. The reward for a 5-minute task should not last 4 hours.

I have wondered for a long time about procrastination — what causes it, what to do about it. I like to think I’ve figured out a few things but even so certain things I should do seem to go undone . . . well, forever.

For example, a month ago I had 40-odd emails in my inbox, some a few months old. I never got around to clearing it out. Bejeweled was no fun but Sudoku (Easy level) was okay. I never played Sudoku for fun but it was slightly enjoyable. Maybe I could play a game of Sudoku as reward for answering email. If I made the requirement — the amount of email that I needed to answer — small enough, it might work.

It worked. When I made the requirement tiny (deal with 3 emails, which might take 10 minutes), that was small enough. And I was able to do it again and again: handle 3 emails, play Sudoku, handle 3 emails, play Sudoku, etc. Progress was slow (I spent more time playing Sudoku than dealing with email) but slow progress was far better than no progress. I was a little stunned it was actually working. After about 10 cycles (which took 3 or 4 hours), my inbox was as empty as I could make it. It hadn’t been that empty in years. To gather some data about the whole process I wrote some R programs for recording what the task was, how long it took, etc.
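For readers who want to keep the same kind of record, here is a minimal sketch of such a log. It is written in Python rather than R, it is not the programs mentioned above, and every name in it (`TaskLog`, `record`, `summary`) is hypothetical. The numbers are illustrative, chosen only to be roughly consistent with 10 cycles taking 3 or 4 hours.

```python
from dataclasses import dataclass, field


@dataclass
class TaskLog:
    """Hypothetical log of work/reward cycles (an illustration, not the R programs)."""
    entries: list = field(default_factory=list)

    def record(self, task, work_minutes, reward_minutes):
        """Store one cycle: the task, minutes spent on it, minutes spent on the reward."""
        self.entries.append((task, work_minutes, reward_minutes))

    def summary(self):
        """Return (number of cycles, total work minutes, total reward minutes)."""
        work = sum(entry[1] for entry in self.entries)
        reward = sum(entry[2] for entry in self.entries)
        return len(self.entries), work, reward


log = TaskLog()
# Illustrative numbers only, not measured data: 10 cycles of
# "answer 3 emails" (10 minutes) followed by a game of Sudoku (12 minutes).
for _ in range(10):
    log.record("answer 3 emails", work_minutes=10, reward_minutes=12)

print(log.summary())  # (10, 100, 120)
```

Keeping work time and reward time as separate columns makes it easy to check later whether the reward is eating more time than the chore.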

Then I started spending all my time revising The Shangri-La Diet for the paperback edition. A few days ago I finished that. My inbox had gotten full again, and once again I used Sudoku to clear it out.

I want to learn more about this way of getting things done. Does it work with other chores besides email? Here is the kitchen table in my apartment:

[Photo: my kitchen table, 26 December 2006, 8 am]

It isn’t usually this messy but it hasn’t been completely clear for years. Can I use Sudoku to clear it off?

Why I Like Self-Experimentation

Self-experimentation, like blogs, Wikipedia, and open-source software (and before them, books), gives outsiders far more power. This took me a long time to figure out. For years, I liked self-experimentation for five reasons:

1. It worked. It reduced my acne, improved my sleep, and enabled me to lose plenty of weight. This surprised me. I am a professional scientist. My professional experiments, about animal learning, generally worked, but never had practical value.

2. It had unexpected benefits. I discovered accidentally that seeing faces in the morning improved my mood the next day. Better sleep (from self-experimentation) improved my health.

3. It was easy. What I did never involved more than small changes in my life. Even standing 8 hours per day wasn’t hard, after a few days.

4. My conclusions fit what others had found — usually, facts that didn’t fit mainstream views. For example, the fact that depression is often worst in the morning and gets better throughout the day doesn’t fit the conventional view that depression is a biochemical disorder but does fit my idea that depression is often due to a malfunctioning circadian oscillator. Self-experimentation seemed to be pointing me in correct directions.

5. My conclusions were surprising. That breakfast is bad (for sleep), the effect of faces on mood, and the Shangri-La Diet are examples.

Recently, though, the rise of blogging, Wikipedia, and open-source software showed me the power of a kind of multiplicative force: (pleasure of hobbies) multiplied by (professional skills). Blogging, for example: (people enjoy writing) multiplied by (professional expertise, which gives them something interesting and unusual to say). In other words, expertise and job skills used in a hobby-like way. My self-experimentation, I realized, was another example: I used my professional (scientific) skills to solve everyday problems. My self-experimentation was like a hobby in that I did it year after year without financial reward or recognition. It was its own reward. The hobby aspect (persistence, freedom to try anything, no need for recognition or payment) made it powerful. I could go in depth where professionals couldn’t go at all.

But I was still missing something — something obvious to many others. The power of blogging isn’t

(hobby) x (job skills).

That’s just one person. The total power of blogging is

(hobby) x (job skills) x (anyone can do it)

Which is very powerful. Finally I saw there was a sixth reason to like self-experimentation:

6. Anyone can do it.

As Aaron Swartz …

Books Were the First Open-Source Software

Here is Aaron Swartz on Wikipedia:

When you put it all together, the story becomes clear: an outsider makes one edit to add a chunk of information [to a Wikipedia entry], then insiders make several edits tweaking and reformatting it. In addition, insiders rack up thousands of edits doing things like changing the name of a category across the entire site — the kind of thing only insiders deeply care about. As a result, insiders account for the vast majority of the edits. But it’s the outsiders who provide nearly all of the content.

(Correcting Wikipedia’s founder, by the way.) When I visited my editor, Marian Lizzi, at Penguin, I realized that book publishing is exactly the same: Outsiders write the books, insiders edit them.

The curious thing about book publishing is similar to what Swartz noticed in a different realm: The content, the crucial stuff, is entirely from amateurs. No other industry, with the possible exception of craft shows, is like this. If I run a deli, I buy supplies and food from people who make their living selling supplies and food. If I make clothes, I buy my cloth from professional cloth makers. If I make cheese, my milk comes from professional farmers. Only book publishers endlessly deal with amateurs.


The Wisdom of Experts: John Chambers on Research Design

John Chambers, a retired Bell Labs statistician and one of the persons most responsible for R, the free open-source data analysis package I use, told me an interesting story yesterday. AT&T used to make microchips. The “yield” of chips — the percent of chips that were defect-free — was very important. Chambers and other Bell Labs statisticians were asked to help the chip makers improve their manufacturing process by increasing the yield. At the chip factory, the people Chambers and his colleagues spoke to were chemists and engineers. They wanted to do experiments that varied voltage, temperature, and similar variables. Chambers and his colleagues had a hunch that the operator — the person running the fabrication machines — was important, and this turned out to be true.

I like this story because it has a wisdom-of-crowds-but-not-exactly twist: the supposed experts at one thing (data analysis) turned out to have useful (and unpredictable) knowledge about something else. We don’t think of statisticians as experts in human behavior but in this case they were at least more expert than the chemists and engineers. I mean: who were the experts here? And when we deal with someone, which is more likely: We overestimate how much they can help us with our problem? Or we underestimate (as in this story, where the chip makers underestimated the statisticians)? And if we have no idea which it is, how might we find out?

I told Chambers that statisticians were hurt by the name of their department: statistics. It puts them in too small a box. John Tukey’s term data analysis (in place of statistics) was an improvement, yes, but only a bit; it would be a lot better if they were called how-to-do-research departments. Yes, Chambers said, that would be an improvement.

I am fascinated by the similarity between three things:

1. Data analysis. Much of data analysis consists of putting data together in a way that allows you to extract a little bit of information from each datum. These little pieces of information, added together, can be quite informative. A scatterplot, for example.

2. Wisdom-of-crowds phenomena. For example, many people guess the weight of a cow. The average of their guesses is remarkably accurate, even though the variation in guesses is large.

3. Self-experimentation. The new and interesting feature of my self-experimentation was that it involved my everyday life. From activities I was going to do anyway (such as eat and sleep), I managed to extract useful information.

In each case it’s like extracting gold from seawater: You get something of value from what seemed useless. Are there other examples? How can we find new examples? Chambers’s story suggests one direction: Making some small change so that you learn from your co-workers about stuff you wouldn’t think they could teach you about.
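The wisdom-of-crowds item in the list above can be illustrated with a small simulation. This is a sketch under stated assumptions, not data from any real contest: the true weight, the spread of the guesses, and above all the assumption that guesses are independent and unbiased are made up for illustration. If a real crowd shares a bias, averaging will not remove it.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_WEIGHT = 1200.0  # hypothetical true weight of the cow, in pounds
GUESS_SPREAD = 200.0  # hypothetical standard deviation of individual guesses

# Assumption: each guess is the true weight plus independent, unbiased noise.
guesses = [random.gauss(TRUE_WEIGHT, GUESS_SPREAD) for _ in range(500)]

# Error of the crowd's average guess.
crowd_error = abs(statistics.mean(guesses) - TRUE_WEIGHT)

# Average error of a single guesser.
typical_error = statistics.mean([abs(g - TRUE_WEIGHT) for g in guesses])

print(f"error of the average guess: {crowd_error:.1f}")
print(f"average individual error:   {typical_error:.1f}")
```

Each guess carries only a little information, but the independent errors largely cancel when the guesses are averaged, which is the same logic as getting a clear pattern out of many noisy points in a scatterplot.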

Too Few Riders, Too Many Stolen Bases

I heard two excellent talks last week. Bent Flyvbjerg, a professor of Planning at Aalborg University, Aalborg, Denmark, spoke on “Survival of the Unfittest: Why the Worst Megaprojects [subways, airports, bridges, tunnels] Get Built.” Why? Because of false claims. Cost estimates turn out to be much too low and benefit estimates (such as ridership) much too high. Boston’s Big Dig, for example, has already cost more than three times the original estimate. Cost estimates were too low in 90% of projects, Flyvbjerg said. The tools used to make those estimates have supposedly improved a great deal over the last few decades but their accuracy has not improved. Lovallo and Kahneman have argued that the underlying problem is “optimism bias”; however, Flyvbjerg believes that the problem is what he now calls strategic misrepresentation (when he used the term lying, people got upset). The greater the misrepresentation, the more likely the project would be approved; or rather, the greater the truth, the more likely the project would not be approved. That is a different kind of bias. An everyday example is me and my microwave oven. Sometimes I use my microwave oven to dry my clothes. I’ve done this dozens of times but I continue to badly underestimate how long it will take. I guess that a shirt will take 8 minutes to dry; it takes 15 minutes. I know I underestimate, but I keep doing it. This is not optimism bias. Microwaving is not unexpectedly difficult or unpredictable. The problem, I think, is the asymmetry of the effects of error. If my guess is too short, I have to put the shirt back in the microwave, which is inconvenient; if my guess is too long the shirt may burn, which corresponds to the project not being approved.
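The asymmetry argument can be made concrete with a small calculation. What follows is only a sketch with made-up numbers: the range of possible drying times and the two cost weights are hypothetical, chosen to show the direction of the effect. When guessing too long costs more per minute than guessing too short, the guess that minimizes expected cost falls below the typical drying time.

```python
def expected_cost(guess, times, under_cost, over_cost):
    """Average cost of a guess over equally likely true drying times.

    under_cost: cost per minute of guessing too short (put the shirt back in).
    over_cost:  cost per minute of guessing too long (the shirt may burn).
    """
    total = 0.0
    for t in times:
        if guess < t:
            total += under_cost * (t - guess)
        else:
            total += over_cost * (guess - t)
    return total / len(times)


# Hypothetical: the true drying time is equally likely to be 10, 11, ..., 20 minutes.
times = list(range(10, 21))
candidates = range(5, 26)

# Equal costs: the best guess is the middle of the range.
best_symmetric = min(candidates, key=lambda g: expected_cost(g, times, 1.0, 1.0))

# Guessing too long is 5 times as costly: the best guess drops well below the middle.
best_asymmetric = min(candidates, key=lambda g: expected_cost(g, times, 1.0, 5.0))

print(best_symmetric, best_asymmetric)  # 15 11
```

Under these assumed costs, a person who "underestimates" by four minutes is not being optimistic at all; the low guess is the cheapest one available.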

Incidentally, Flyvbjerg has written a paper defending case studies and by extension self-experimentation. He quotes Hans Eysenck, who originally dismissed case studies as anecdotes: “Sometimes we simply have to keep our eyes open and look carefully at individual cases — not in the hope of proving anything but rather in the hope of learning something.” Exactly.

The other excellent talk (“Scagnostics,” short for scatterplot diagnostics) was by Leland Wilkinson, author of The Grammar of Graphics and developer of SYSTAT, who now works at SPSS. He described a system that classifies scatterplots. If you have twenty or thirty measures on each of several hundred people or cities or whatever, how do you make sense of it? Wilkinson’s algorithms measure such properties of a scatterplot as its texture, clumpiness, skewness, and four others I don’t remember. You use these measures to find the most interesting scatterplots. He illustrated the system with a set of baseball statistics: many measurements made on each of several hundred major-league baseball players. The scatterplot with the most outliers was stolen bases versus age. Stolen bases generally decline with age but there are many outliers. Although a vast number of statistical procedures assume normal distributions, Wilkinson’s tools revealed normality to be a kind of outlier. In the baseball dataset, only one scatterplot had both variables normally distributed: height versus weight. These tools may eventually be available with R.