Worse Than Placebo? Forest Laboratories’ Shameful Marketing


While Forest [Laboratories] applied to the FDA for pediatric use of Celexa [the anti-depressant] and was eventually denied, the company admitted it had marketed the drug to doctors by hiring speakers to tout its benefits for young patients. Forest also admitted it had suppressed the negative results of research in Europe that found Celexa was no more effective in treating depressed children and adolescents than a sugar pill. Fourteen young patients in that study attempted suicide or contemplated suicide, compared with five in the placebo group, court records show.

From this article. Is Forest Laboratories worse than other big drug companies? Probably not. What’s horrible is how this sort of thing — suppression of negative results — keeps happening. It suggests that the evaluation of drugs should be taken entirely out of the hands of drug companies.

Percentile Feedback Update

In March I discovered that looking at a graph of my productivity (for the current day, with a percentile attached) was a big help. My “efficiency” — the time spent working that day divided by the time available to work — jumped as soon as the new feedback started (as this graph shows). The percentile score, which I can get at any moment during the day, indicates how my current efficiency score ranks among scores from previous days taken within one hour of the same time of day. For example, a score of 50 at 1 p.m. means that half of the previous days’ scores from noon to 2 p.m. were better, half worse. The time available to work starts when I get up. For example, if I get up at 4 a.m., then at 6 a.m. there have been 2 hours available to work. The measurement period usually stops at dinner time or in the early evening.
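To make the computation concrete, here is a minimal R sketch of how such a percentile score could be computed. The data frame, column names, and function name are invented for illustration; this is not my actual code.

```r
# Minimal sketch (hypothetical names): percentile of the current
# efficiency score among past scores measured within one hour of the
# same clock time.
# history: data frame with columns
#   hour - clock time of the measurement, in hours (13.5 = 1:30 p.m.)
#   eff  - efficiency at that moment (time worked / time available)
percentile.score <- function(history, now.hour, now.eff) {
  nearby <- history$eff[abs(history$hour - now.hour) <= 1]
  if (length(nearby) == 0) return(NA)
  100 * mean(nearby < now.eff)  # percent of comparable past scores beaten
}

# Example: at 1 p.m., 2 of the 4 comparable past scores are worse,
# so the percentile is 50.
history <- data.frame(hour = c(12.5, 13, 13.5, 14),
                      eff  = c(0.40, 0.55, 0.30, 0.60))
percentile.score(history, now.hour = 13, now.eff = 0.50)  # 50
```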

This graph shows the results so far. It shows efficiency scores at the end of each day. (Now and then I take a day off.) One interesting fact is I’ve kept doing it. The data collection isn’t automated; I shift to R to collect it, typing “work.start” or “work.stop” or “work.switch” when I start, stop, or switch tasks. This is the third or fourth time I’ve tried some sort of work tracking system and the first time I have persisted this long. Another interesting fact is the slow improvement, shown by the positive slopes of the fitted lines. Apparently I am slowly developing better work habits.
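A minimal sketch of what such logging functions might look like in R, assuming events are appended to a plain text file from which each day’s efficiency can later be computed (the file name and implementation details are invented for illustration; this is not my actual code):

```r
# Hypothetical sketch of work.start / work.stop / work.switch:
# each call appends a timestamped event to a log file.
log.file <- "worklog.csv"

log.event <- function(event, task = "") {
  line <- paste(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), event, task,
                sep = ",")
  cat(line, "\n", file = log.file, sep = "", append = TRUE)
}

work.start  <- function(task = "") log.event("start", task)
work.stop   <- function() log.event("stop")
work.switch <- function(task = "") { log.event("stop"); log.event("start", task) }
```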

The behavioral engineering is more complicated than you might think. My daily activities naturally divide into three categories:

  1. Things I want to do but have to push myself to do. The feedback helps with those, obviously.
  2. Things I don’t want to do a lot of and have to push myself away from (e.g., web surfing).
  3. Things I want to do and have no trouble doing.

But the recording system is binary: work or not-work. What do I do with activities in the third category? Eventually I decided to count the short-duration examples (e.g., standing on one foot, which lasts 10 minutes) as work, like the first category, and the long-duration examples (e.g., walking, which might last an hour) as not-work, like the second category.

Before I started this I thought of a dozen reasons why it wouldn’t work, but it has worked, in line with my belief that it is better to do than to think.

Phone Hacking and Jane Jacobs

I am fascinated by the British phone hacking scandal. Jane Jacobs has helped me understand it.

Should police officers be paid per arrest? Most people think this is a bad idea, I imagine, but the larger point (what can we learn from this?) isn’t clear. In Systems of Survival, Jacobs tried to spell out the larger point. She wrote about two sets of moral rules. One set (“guardian syndrome”) applied to warriors, government officials, and religious leaders. It prizes loyalty and obedience, for example. The other set (“commercial syndrome”) applied to merchants. It prizes honesty, avoidance of force, and industriousness, for example. The two syndromes correspond to two ways of making a living: taking and trading. The syndromes reached the form they have today because they worked — different jobs need different rules. When people in one sort of work (e.g., guardian) follow the rules of the other, things turn out badly. Ayn Rand glorified the commercial syndrome. When Alan Greenspan, one of her acolytes, became chairman of the Federal Reserve (a guardian job), he did a poor job.

What about journalists? As a journalistic business becomes more powerful, it becomes more guardian-like. A powerful newspaper isn’t inherently bad; we want a powerful newspaper to keep other powerful institutions (government, large businesses) in check. Murdoch’s News International, of course, has become very powerful. Yet Murdoch newsrooms retained commercial norms, especially an emphasis on selling many copies. Reporters in Murdoch newsrooms were under intense pressure to produce — like policemen paid per arrest. Other journalists, with guardian norms (e.g., at the New York Times), didn’t like the commercial norms of Murdoch newspapers. The mixture of commercial values and guardian power led to the phone hacking scandal. Friends of mine blame Murdoch himself — but commercial norms are not unique to Murdoch. The problem is their mixture with great power.

When newspapers are small, they are not powerful, not guardians, and must adopt commercial norms — they must try to sell more copies or they will be crushed. When a small newspaper becomes large and powerful, however, its norms must change to guardian ones or things will turn out badly. This suggests that the phone-hacking scandal happened because Murdoch became very powerful too fast — too fast for a shift in values to accompany much greater power.



Better To Do Than To Think

The most important thing I learned in graduate school — or ever — about research is: Better to do than to think. By do I mean collect data. It is better to do an experiment than to think about doing an experiment, in the sense that you will learn more from an hour spent doing (e.g., doing an experiment) than from an hour thinking about what to do. Because 99% of what goes on in university classrooms and homework assignments is much closer to thinking than doing, and because professors often say they teach “thinking” (“I teach my students how to think”) but never say they teach “doing”, you can see this goes against prevailing norms. I first came across this idea in an article by Paul Halmos about teaching mathematics. Halmos put it like this: “The best way to learn is to do.” When I put it into practice, it was soon clear he was right.

I have never heard a scientist say this. But I recently heard a story that makes the same point. A friend wrote me:

I met Kary Mullis after high school. I knew that PCR was already taught in some high schools (like mine) and was curious how he discovered it. He said that he had some ideas about how to make the reaction work and discussed them with others, who explained why it wouldn’t work. He wasn’t insightful enough to understand their explanations so he had to go to the lab and see for himself why it wouldn’t work. It turned out it worked.

An example of better to do than to think.

Better to do than to think is not exactly anti-authoritarian but it is close. I was incredibly lucky to learn it from Halmos. It isn’t obvious how else I might have learned it. It took me many years to learn Research Lesson #2: Do the smallest easiest thing. And I learned this only because of all my self-experimentation. I started doing self-experimentation because of better to do than to think.


Assorted Links

Thanks to Tim Lundeen, J.C. and Ben Casnocha.

The Willat Effect: Side-by-Side Comparisons Create Connoisseurs

About ten years ago, while I was visiting my friend Carl Willat, he presented me with five versions of limoncello (an Italian lemon liqueur) side by side in shot glasses. Two were store-bought, the rest homemade, if I remember correctly. I tried them one by one. I had had limoncello many times but never different versions side by side. It was easy to notice differences between them. Obviously. What surprised me was a hedonic reaction: I thought two of them (with more complex flavors) were wonderful and one (store-bought) was awful. Both reactions (wonderful and awful) were stronger than usual. In a small way, I’d become a connoisseur. After that, I was happy to buy expensive limoncello (e.g., $26). I no longer bought cheap limoncello ($18). I call the hedonic changes produced by side-by-side comparisons the Willat Effect. Carl became a connoisseur of Italian hand-painted tableware due to side-by-side comparisons. I believe connoisseurs were important in human evolution because they helped support skilled artisans. Our design preference for repeated elements (e.g., wallpaper, textiles) evolved so that we would put similar things side by side.

I mentioned a downside of the Willat Effect a few posts ago:

Five or six years ago I went to a sake-tasting event in San Francisco called “The Joy of Sake”. About 140 sakes. In a few hours I became such a sake connoisseur that the sake I could afford — and used to buy regularly — I now despised. The only sake I now liked was so expensive ($80/bottle) that I never bought another bottle of sake.

A reader named James Bailey commented:

And you still go to tastings?? It seems like ignorance is bliss here, better to preserve your ability to enjoy cheap things.

Yes, I still go to tastings. The sake tasting was the only one that had that effect. Mostly they have no effect because the samples vary too much. For example, I’ve been to many wine tastings but haven’t become much of a wine connoisseur. The many wines at the tastings were all over the place. If I want to get the effect, I usually have to do it myself: buy several versions of a product and try them side by side. I recently did this for whiskey. When I go back to Beijing maybe I’ll do it for some sort of tea.

When I do it myself I control the price range and limit the high end to what I can afford. I didn’t buy $80 whiskeys, for example, although many were available. So the effect makes me enjoy stuff at the upper end of what I’ll pay. When I became an assistant professor, I thought it would be fun to enjoy fine art (e.g., paintings) more. I attended several art history classes. They had no effect — I was bored. Side-by-side comparisons, in contrast, actually work and, as Carl illustrated, are easily shared. And they are consumerist and artisanal at the same time.

Causal Reasoning in Science: Don’t Dismiss Correlations

In a paper (and blog post), Andrew Gelman writes:

As a statistician, I was trained to think of randomized experimentation as representing the gold standard of knowledge in the social sciences, and, despite having seen occasional arguments to the contrary, I still hold that view, expressed pithily by Box, Hunter, and Hunter (1978) that “To find out what happens when you change something, it is necessary to change it.”

Box, Hunter, and Hunter (1978) (a book called Statistics for Experimenters) is well-regarded by statisticians. Perhaps Box, Hunter, and Hunter, and Andrew, were/are unfamiliar with another quote (modified from Beveridge): “Everyone believes an experiment except the experimenter; no one believes a theory except the theorist.”

Box, Hunter, and Hunter were/are theorists, in the sense that they don’t do experiments (or even collect data) themselves. And their book has a massive blind spot. It contains 500 pages on how to test ideas and not one page — not one sentence — on how to come up with ideas worth testing. Which is just as important. Had they considered both goals — idea generation and idea testing — they would have written a different book. It would have said much more about graphical data analysis and simple experimental designs, and, I hope, would not have contained the flat statement (“To find out what happens …”) Andrew quotes.

“To find out what happens when you change something, it is necessary to change it.” It’s not “necessary” because belief in causality, like all belief, is graded: it can take on an infinity of values, from zero (“can’t possibly be true”) to one (“I’m completely sure”). And belief changes gradually. In my experience, significant (substantially greater than zero) belief in the statement A changes B usually starts with the observation of a correlation between A and B. For example, I began to believe that one-legged standing would make me sleep better after I slept unusually well one night and realized that the previous day I had stood on one leg (which I almost never do). That correlation made one-legged standing improves sleep more plausible, taking it from near zero to some middle value of belief (“might be true, might not be true”). Experiments in which I stood on one leg various amounts pushed my belief in the statement close to one (“sure it’s true”). In other words, my journey “to find out what happens” to my sleep when I stood on one leg began with a correlation. Not an experiment. To push belief from high (say, 0.8) to really high (say, 0.99) you do need experiments. But to push belief from low (say, 0.0001) to medium (say, 0.5), you don’t need experiments. To fail to understand how beliefs begin, as Box et al. apparently do, is to miss something really important.
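To make the arithmetic of graded belief concrete, here is a toy R calculation. The numbers are invented for illustration: a surprising correlation carries a large likelihood ratio, each small experiment a modest one, and Bayes’ rule turns them into changes of belief.

```r
# Toy illustration (invented numbers): belief as a probability,
# updated by Bayes' rule via the likelihood ratio of the evidence
# under "A changes B" versus "A does not change B".
update <- function(prior, likelihood.ratio) {
  posterior.odds <- (prior / (1 - prior)) * likelihood.ratio
  posterior.odds / (1 + posterior.odds)
}

b <- 0.0001                       # "can't possibly be true"
b <- update(b, 10000)             # a striking correlation is observed
round(b, 2)                       # 0.5: "might be true, might not be"
for (i in 1:3) b <- update(b, 5)  # three supportive small experiments
round(b, 2)                       # 0.99: "sure it's true"
```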

Science is about increasing certainty — about learning. You can learn from any observation, as distasteful as that may be to evidence snobs. By saying that experiments are “necessary” to find out something, Box et al. said the opposite of you can learn from any observation. Among shades of gray, they drew a line and said “this side white, that side black”.

The Box et al. attitude makes a big difference in practice. It has two effects:

  1. Too-complex research designs. Just as researchers undervalue correlations, they undervalue simple experiments. They overdesign. Their experiments (or data collection efforts) cost far more and take much longer than they should. The self-experimentation I’ve learned so much from, for example, is undervalued. This is one reason I learned so much from it — because it was new.
  2. Existing evidence is undervalued, even ignored, because it doesn’t meet some standard of purity.

In my experience, both tendencies (too-complex designs, undervaluation of evidence) are very common. In the last ten years, for example, almost every proposed experiment I’ve learned about has been more complicated than I think wise.

Why did Box, Hunter, and Hunter get it so wrong? I think it gets back to the job/hobby distinction. As I said, Box et al. didn’t generate data themselves. They got it from professional researchers — mostly engineers and scientists in academia or industry. Those engineers and scientists have jobs. Their job is to do research. They need regular publications. Hypothesis testing is good for that. You do an experiment to test an idea, you publish the result. Hypothesis generation, on the other hand, is too uncertain. It’s rare. It’s like tossing a coin, hoping for heads, when the chance of heads is tiny. Ten researchers might work for ten years, tossing coins many times, and generate only one new idea. Perhaps all their work, all that coin tossing, was equally good. But only one researcher came up with the idea. Should only one researcher get credit? Should the rest get fired, for wasting ten years? You see the problem, and so do the researchers themselves. So hypothesis generation is essentially ignored by professionals because they have jobs. They don’t go to statisticians asking: How can I better generate ideas? They do ask: How can I better test ideas? So statisticians get a biased view of what matters, do biased research (ignoring idea generation), and write biased books (that don’t mention idea generation).

My self-experimentation taught me that the Box et al. view of experimentation (and of science — that it was all about hypothesis testing) was seriously incomplete. It could do so because it was like a hobby. I had no need for publications or other steady output. Over thirty years, I collected a lot of data, did a lot of fast-and-dirty experiments, noticed informative correlations (“accidental observations”) many times, and came to see the great importance of correlations in learning about causality.


New Support for Prenatal Ultrasound as a Cause of Autism

I have blogged several times about Caroline Rodgers’s idea that sonograms during pregnancy greatly increase the risk of autism in the fetus. Her idea is supported by several lines of evidence, as she explains in this talk.

A new study provides more evidence. In the general population from which the study drew (California), about 1 child in 100 has autism. If you are an identical twin and your co-twin has an autism spectrum disorder, your chance of having the same diagnosis is about 70%. The crucial point of the study is that the concordance was also high for fraternal twins: about 40%. Fraternal twins share only about half their genes, like ordinary siblings, but they do share a womb, so high fraternal concordance points toward the prenatal environment. As one commenter put it, this result “puts a spotlight on pregnancy as a time when environmental factors might exert their effects”.

Another study found an increased risk of autism when the mother took an anti-depressant during pregnancy. This supports the idea that a bad prenatal environment causes autism.

Thanks to Paul Sas and Gary Wolf.

Great Delusions: James Watson

In an interview, James D. Watson, co-discoverer of the structure of DNA, said:

Some day a child is going to sue its parents for being born. They will say, My life is so awful with these terrible genetic defects.

(Quoted by Richard Bentall in Doctoring the Mind.) Watson is implying that genetic defects matter in the big picture of human impairment. They don’t. Changes over time in disease incidence, migration studies (in all instances I know of, the disease profile of the migrating group changes to match the place where they live), powerful nutritional effects (e.g., Weston Price) and other evidence of environmental potency show that all major diseases (heart disease, cancer, depression, obesity, plague, tuberculosis, smallpox, etc.) are mostly caused by the environment, in the sense that environmental changes could greatly reduce their incidence. Genes are a distraction. (To say that major diseases are also “caused” by genes in the sense that genes affect environmental potency is to miss the point that we want to reduce the diseases — want to reduce obesity for example — so it is the environmental lever that matters. If a child could eliminate its obesity by changing its environment, it would not sue its parents.) If Watson was unaware of that, okay. But for him to claim the opposite is a great — and I am afraid profoundly self-serving — delusion.

As I blogged, Aaron Blaisdell had a certifiably “genetic” disease. The chromosome involved had been identified. It turned out to be under nutritional control. When he improved his diet, it vanished. Calling it “genetic” seriously distracted from learning how to eliminate it. Another example of how “genetic” problems are not what they seem — impossible to change — is provided by lactose intolerance. The rate of lactose intolerance varies greatly from group to group. (I thank Phil Price for the link.) It is rare in Sweden, common in Asia, including China. I assume these differences reflect genetic differences. Yet Beijing supermarkets have aisles full of milk products. How can that be? Because the aisles are full of yogurt. Yogurt bacteria digest the lactose. So lactose intolerance is not a big deal. You can still drink milk, after it has been predigested by bacteria.

The dreams of geneticists.