Exploratory Versus Confirmatory Data Analysis?

February 15, 2010 · March 19, 2024 · Seth Roberts

In 1977, John Tukey published a book called Exploratory Data Analysis. It introduced many new ways of analyzing data, all relatively simple. Most of the new ways involved plotting your data. A few involved transforming your data. Tukey’s broad point was that statisticians (taught by statistics professors) were missing a lot: Conventional statistics focussed too much on confirmatory data analysis (testing hypotheses) to the omission of exploratory data analysis — data analysis that might show you something new. Here are some tools to help you explore your data, Tukey was saying.

No question the new tools are useful. I have found great benefits from plotting and transforming my data. No question that conventional statistics textbooks place far too little emphasis on graphs and transformations. But I no longer agree with Tukey’s exploratory versus confirmatory distinction. The distinction that matters — at least to historians, if not to data analysts — is between low-status and high-status. A more accurate title of Tukey’s book would have been Low-Status Data Analysis. Exploratory data analysis already had a derogatory name: Descriptive data analysis. As in mere description. Graphs and transformations are low-status. They are low-status because graphs are common and transformations are easy. Anyone can make a graph or transform their data. I believe they were neglected for that reason. To show their high status, statistics professors focused their research and teaching on more difficult and esoteric stuff — like complicated regression. That the new stuff wasn’t terribly useful (compared to graphs and transformations) mattered little. Like all academics — like everyone — they cared enormously about showing high status. It was far more important to be impressive than to be useful. As Veblen showed, it might have helped that the new stuff wasn’t very useful. “Applied” science is lower status than “pure” science.

That most of what statistics professors have developed (and taught) is less useful than graphs and transformations strikes me as utterly clear. My explanation is that in statistics, just as in every other academic area I know about, desire to display status led to a lot of useless highly-visible work. (What Veblen called conspicuous waste.) Less visibly, it led to the best tools being neglected. Tukey saw the neglect (underdevelopment and underteaching of graphs, for example) but perhaps misdiagnosed the cause. Here’s why Tukey’s exploratory versus confirmatory distinction was misleading: the tools that Tukey promoted for exploration also improve confirmation. They are neglected everywhere. For example:

1. Graphs improve confirmatory data analysis. If you do a t test (or compute a p value in any way) but don’t make an associated graph, there is room for improvement. A graph will show whether the assumptions of the computation are reasonable. Often they aren’t.

2. Transformations improve confirmatory data analysis. That a good transformation will make the assumptions of the test more reasonable many people know. What few people seem to know is that a good transformation will make the statistical test more sensitive. If a difference exists, the test will be more likely to detect it. This is like increasing your sample size at no extra cost.

3. Exploratory data analysis is sometimes thought of as going beyond the question you started with to find other structure in the data, that is, to explore your data. (Tukey saw it this way.) But to answer the question you started with as well as possible you should find all the structure in the data. Suppose my question is whether X has an effect. I should care whether Y and Z have an effect in order to (a) make my test of X more sensitive (by removing the effects of Y and Z) and (b) assess the generality of the effect of X (does it interact with Y or Z?).
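
Point 2 can be illustrated with a small simulation. What follows is a minimal sketch under assumptions that are not from the post: the data are log-normal, the treatment multiplies every value by a constant factor, each group has 20 observations, and 2.02 is used as an approximate two-sided 5% cutoff for the t statistic. The function names (welch_t, detection_rates) are hypothetical. The sketch counts how often the same simulated difference is detected on the raw scale and on the log scale.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / standard_error


def detection_rates(n_sims=2000, n=20, shift=0.8, threshold=2.02, seed=0):
    """Fraction of simulated experiments in which |t| exceeds the cutoff.

    Assumptions (illustrative only): log-normal data, a treatment effect
    that shifts the log-scale mean by `shift`, and a fixed approximate
    cutoff instead of an exact p value.
    """
    rng = random.Random(seed)
    raw_hits = 0
    log_hits = 0
    for _ in range(n_sims):
        control = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
        treated = [rng.lognormvariate(shift, 1.0) for _ in range(n)]

        # Test on the raw (skewed) scale.
        if abs(welch_t(control, treated)) > threshold:
            raw_hits += 1

        # Same data, same test, after a log transformation.
        log_control = [math.log(x) for x in control]
        log_treated = [math.log(x) for x in treated]
        if abs(welch_t(log_control, log_treated)) > threshold:
            log_hits += 1

    return raw_hits / n_sims, log_hits / n_sims


raw_rate, log_rate = detection_rates()
print(f"detected on raw scale: {raw_rate:.2f}")
print(f"detected on log scale: {log_rate:.2f}")
```

In this setup the log scale should detect the difference more often, because the transformation removes the skew that inflates the standard error. With data that are not skewed this way, a log transformation need not help.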

Most statistics professors and their textbooks have neglected all uses of graphs and transformations, not just their exploratory uses. I used to think exploratory data analysis (and exploratory science more generally) needed different tools than confirmatory data analysis and confirmatory science. Now I don’t. A big simplification.

Exploration (generating new ideas) and confirmation (testing old ideas) are outputs of data analysis, not inputs. To explore your data and to test ideas you already have you should do exactly the same analysis. What’s good for one is good for the other.

Likewise, Freakonomics could have been titled Low-status Economics. That’s essentially what it was, the common theme. Levitt studied all sorts of things other economists thought were beneath them to study. That was Levitt’s real innovation — showing that these questions were neglected. Unsurprisingly, the general public, uninterested in the status of economists, found the work more interesting than high-status economics. I’m sensitive to this because my self-experimentation was extremely low-status. It was useful (low-status), cheap (low-status), small (low-status), and anyone could do it (extremely low status).

More Andrew Gelman comments. Robin Hanson comments.

33 thoughts on “Exploratory Versus Confirmatory Data Analysis?”

1 says:

February 15, 2010 at 12:00 am

I spoke with a statistics professor at Berkeley about this book. On her website, it says she studies “multilevel and latent variable modeling.” She said Tukey’s book is “not important” and she mentioned something funny about Tukey’s life. She did say, “Are you interested in statistics?” Thanks, Seth, for making me seem smart to these Berkeley profs!! I always talk to them about something I learned from you and your blog!!

seth says:

February 15, 2010 at 12:00 am

what was the funny thing about Tukey’s life?

Tukey’s book was really important to me because it stressed two things (graphs & transformations) that my other statistics textbooks did not. They turned out to be incredibly useful.

jay says:

February 15, 2010 at 12:00 am

Hey Seth, have you seen this?

https://thelastpsychiatrist.com/2009/02/the_bubble_in_academic_researc.html

Sounds similar to a lot of what you’ve been saying lately (which is good, because people independently arriving at similar conclusions… is good).

q says:

February 15, 2010 at 12:00 am

how do you draw a graph with more than two or three variables?

seth says:

February 15, 2010 at 12:00 am

lattice plots allow more than three variables to be visualized. For example, one scatterplot shows X vs Y. And a 2 x 2 matrix of X-Y scatterplots shows how that relationship varies with W (rows) and Z (columns). So you get up to 4 dimensions easily enough. More than 4 dimensions is hard. Gotta use ANOVA to figure out what graphs to make.

seth says:

February 15, 2010 at 12:00 am

jay, thanks for the link. Nice post. Although psychiatric research has done little for the general public, it has done wonders for the status of the psychiatrists who publish it (within their profession). That’s why it’s not a bubble. It really pays off — just not for the rest of us. Basically the same situation as most statistics research, I agree.

1 says:

February 15, 2010 at 12:00 am

Seth, she said Tukey had a strange life.

LemmusLemmus says:

February 15, 2010 at 12:00 am

I can’t agree on the last paragraph. Much of the research that was popularized in Freakonomics was published in the most renowned journals economics has: Definitely high status.

1 says:

February 15, 2010 at 12:00 am

I believe for bipolar disorder, there haven’t been a lot of new medications out in the last ten years. Or at least, to my knowledge. Probably true for schizophrenia medication (antipsychotics), as well. In addition, some of them can be hard to tolerate, all have side effects, etc. It is too bad, but some people have no other choice and have to take these psychiatric medications. I think that’s why there are therapists and psychologists who devote their careers to helping the mentally ill cope with these things. They’ve developed different ideas such as CBT and ACT. Together with the psych meds, these can be more powerful than psych meds alone.

seth says:

February 15, 2010 at 12:00 am

LemmusLemmus, my self-experimentation was published in a high-status journal. That didn’t change the basic picture. I believe that most econ profs in high-rated depts believed that determining the income of drug dealers was low-status. The abortion stuff, not so low-status. Sumo wrestlers = low status.

Mike Bowerman says:

February 15, 2010 at 12:00 am

What is interesting to me about the low-status/high-status distinction when you extend it to academic work is that much of the low-status academic work could also offer more value to society than the high-status work that is instead undertaken — like more lessons from self-experimentation would be of value than expensive clinical trials, or research on prevention of disease might be of greater social value than high-status pharmaceutical or surgical interventions.

Is much of the research in Freakonomics also of value to society, or simply popular in the way that other novelties are popular? For example, information on Britney Spears’ personal life is popular, but not valuable to society. I didn’t read Freakonomics because in the multiple excerpts I read the only valuable insights I found were the relationship between abortion and later crime rates and the low incomes of drug dealers on the street. Both were fascinating and potentially valuable, but covered thoroughly in other sources. Was the ‘sumo wrestlers’ for instance useful? I don’t recall reading about it.

Mike Bowerman says:

February 15, 2010 at 12:00 am

Veblen was also said to have a “strange life”, likely a good sign for Tukey.

seth says:

February 15, 2010 at 12:00 am

Michael, the broad point of Freakonomics — that data is useful, that it can change your mind — is quite useful. Whether this is a low-status point to make I’m not entirely sure but many famous economists have been far less interested in data collection than Levitt.

Socktopi says:

February 16, 2010 at 12:00 am

Steven Levitt is a Clark Medalist from the University of Chicago, easily the most influential Economics department in the world. The idea that he is some outsider doing low status work that the rest of the field disdains is nonsense. His prominence in the field is what allows him to study cheating in sumo wrestling and ghetto baby names, instead of unemployment and inflation. That is to say, exactly the kind of esoteric and impractical status signaling work you deplore in every other academic.

M says:

February 16, 2010 at 12:00 am

I think this post is great and agree in general. But I’ll chime in and agree that Steven Levitt is about as “high status” as you can get within the academic economics community. The Clark Medal is only given out once every two years (as opposed to one Nobel a year). I think one of the reasons that he is high status is because he has used fairly conventional econometric techniques to “colonize” areas not traditionally the realm of economists. (E.g., what would have been considered the realm of sociology.)

seth says:

February 16, 2010 at 12:00 am

Socktopi & M, you make a good point that perhaps I should have made. (In an earlier draft, I did.) It’s like Nixon and China. That his anti-communist credentials were secure made it easier for him to go to China. Long before Exploratory Data Analysis, John Tukey’s very high status was assured. He was a co-inventor of the Fast Fourier Transform, for example. I’m sure he was utterly unconcerned how EDA would affect his perceived status. (As L says, it didn’t help. It really did get a scornful reception from some high-status statistics professors.) Likewise with Levitt. Just as you say, Socktopi, Levitt’s very high status made it easier for him to do low-status research.

I wouldn’t call the stuff Levitt studied “esoteric”. For the field of economics, they are esoteric topics but for the general public they are common concerns: What to name our baby? for example. You could say that by doing such research, Levitt signaled his extremely high status — just as Tukey did, just as Nixon signalled his extreme anti-communism by going to China. In practice I don’t think it works that way. I don’t think the motive for the work is signaling. I don’t think Nixon went to China to show how incredibly anti-communist he was. Nor did Tukey write EDA to show how incredibly high status he was.

LemmusLemmus says:

February 16, 2010 at 12:00 am

Seth,

I’m not buying the revised version of the Levitt’s-work-as-low-status view either. According to his CV, he got the John Bates Clark Medal in 2003. The papers that went into Freakonomics are (based on Wikipedia’s chapter overview, plus memory – i.e., I may have overlooked stuff):

Cheating teachers – 2002, QJE, Brookings-Wharton Papers on Urban Affairs
Cheating sumo wrestlers – 2002, AER
Drug-selling gang’s finances – 2000, QJE
Abortion and crime – 2001, QJE
SES and names – 2004, QJE

The established economists that vote for the Bates Clark Medal clearly liked the “low-status” stuff that went into Freakonomics.

seth says:

February 16, 2010 at 12:00 am

LemmusLemmus, I’m not saying all economists think alike. Lots of people were glad Nixon went to China. Enough prominent statisticians liked Tukey’s emphasis on graphics that the whole area has become more popular. Nor am I saying that Levitt’s work was simply low-status. It was also well-done, just as Tukey’s work wasn’t merely low-status. He also introduced important new ways of making graphs. Disdain for what Levitt has studied has been publicly expressed by Heckman, one of his colleagues. But I agree with you to this extent: Levitt had technical skills that made his work on low-status questions more acceptable to his profession. People are far more concerned about their own status than other people’s. Professor X, who would never study something low-status, might be quite happy that Levitt did so.

Alex Chernavsky says:

February 16, 2010 at 12:00 am

Seth, I’m glad you mentioned psychiatry. I may have posted this link in the past, but here is an excellent book about the psychiatric establishment and how it does more to harm patients than help them:

Mad in America: Bad Science, Bad Medicine, and the Enduring Mistreatment of the Mentally Ill, by Robert Whitaker.

Whitaker (the author) is also coming out with a new book soon:

Anatomy of an Epidemic: Magic Bullets, Psychiatric Drugs, and the Astonishing Rise of Mental Illness in America.

It should be out in April. I’ve pre-ordered it. If it’s anything like the previous book, it should be excellent.

seth says:

February 16, 2010 at 12:00 am

Alex, thanks for the links. The new book sounds very promising.

vic says:

February 16, 2010 at 12:00 am

Also agree with everything, except the last paragraph. I don’t know if Levitt’s work is high status or low status, but one thing it’s not is economics, nor is it useful, correct, or insightful. It is cute though.

Patrik says:

February 17, 2010 at 12:00 am

Seth is 100% right about the paradox of Levitt. Levitt himself is high-status, and that high status allowed him to do low-status data exploration, e.g., baby names, the economics of drug dealers, etc.

(I should mention that before Levitt, Steven Landsburg and David Friedman were also writing economics books in a similar vein. However, I don’t believe they did extensive research into some of these low-status subjects, generally only doing theoretical exploration of them, e.g., why does popcorn cost so much in movie theaters?)

I know more than a few macro-economists who sniffed (jealously) at Levitt’s massive mainstream success, commenting on how “un-serious” and “unimportant” his work was.

1 says:

February 17, 2010 at 12:00 am

If psychiatry is so harmful, then what would be helpful for the mentally ill?

M says:

February 17, 2010 at 12:00 am

Turning again to the economics profession, a good example of high/low status problem is the gap between economists who work on policy (say in think tanks or government) and academics. Relatively simple analysis of data is essential input for policymakers and top decision-makers. And government is an important part of our economy. So, in this sense, policy economists do very useful work. (I’m not saying they are all good — just that they have an important role.) The majority of the person-hours expended by academic economists has nothing to do with improving policy analysis.

We also seem to have a system in the US (and other countries) where some of the very top economist positions in government are filled by those who first made their name as academics. In other words, they had to spend a long time demonstrating their high status to other academics (in not very useful ways) before getting the chance to employ relatively “simple” analysis in the public service.

seth says:

February 17, 2010 at 12:00 am

“What would be helpful for the mentally ill?” Helping people with a personal stake — they have the problem, or a loved one has the problem — do research. Helping them publish the results. Shift resources from those whose main goals are status and career advancement to those whose main goal is useful progress.

Michael Metcalf Bishop says:

February 17, 2010 at 12:00 am

Great posts by both you and Gelman!

Overcoming Bias : Function of Stat Academia says:

February 18, 2010 at 12:00 am

[…] See the flaw in that argument? Right – being useful to other academics in trying to impress each other isn’t at all the same as being useful to the wider world. Now consider a recent exchange between Seth Roberts and Andrew Gelman (who I debated on this topic in July.) Seth: […]

1 says:

February 18, 2010 at 12:00 am

Seth, I’ve never heard you talk about this before. This idea is big. I was wondering if you could go into more detail on your blog, if you have time.

1 says:

February 20, 2010 at 12:00 am

Yesterday, I spoke with an economics professor at Cal about this blog entry and he agreed with Seth’s view of Freakonomics.

seth says:

February 20, 2010 at 12:00 am

that’s interesting, l, what did the economics professor say?

1 says:

February 20, 2010 at 12:00 am

Seth, the economics professor said that he agrees that Freakonomics is low-status research. He said, “But I would use the word ‘popular’ [instead of low-status]” and “it’s not real economics research.”

I am in contact with a lot of professors at Cal every day and I enjoy speaking with them about your blog entries!

1 says:

February 20, 2010 at 12:00 am

Seth, I just looked up the professor online, he is the chair of the department! I didn’t know that when I was talking to him, now I know.

seth says:

February 20, 2010 at 12:00 am

yes, to many professors popular = not good. For example, “pop psychology”.


Seth Robert's Blog Mirror

Personal Science, Self-Experimentation, Scientific Method
