To explain why Andrew Gelman et al.’s Red State Blue State Rich State Poor State is such an important book, I have to tell two stories.
A few years ago a student did a senior thesis with me that consisted of measuring PMS symptoms day by day in several women. After she collected her data she went to the Psychology Department’s statistics consultant (a psychology grad student) to get help with the analysis. “The most important thing to do with your data is graph it,” I had told the student. The statistics consultant didn’t know how to do this! There was little demand for it. Almost all the data analyses done in the Psychology Department were standard ANOVAs and t tests. If you look at statistics textbooks aimed at psychologists, you’ll see why: They say little or nothing about the importance of graphing your data. Gelman et al.’s book is full of informative graphs and will encourage any reader to plot their data. Few books do this. That’s the obvious contribution, and because graphing data is so important and so neglected, it’s a big one.
The other contribution is even more important, but more subtle. Recently I was chatting with a statistics professor whose applied area is finance. What do you think of behavioral economics? she asked. I said I didn’t like it. “It’s too obvious.” (More precisely, it’s too confirmatory.) For example, the conclusion that people are loss-averse — fine, I’m sure they are, but it’s too clear to be a great discovery. She mentioned prospect theory. Tversky and Kahneman’s work has had a big effect on economists — which certainly indicates it wasn’t obvious. Yes, it has been very influential, I said. I’m not saying their conclusions were completely obvious — just too obvious. Tversky and Kahneman were/are very smart men who had certain ideas about how the world worked. They did experiments that showed they were right. There’s value in such stuff, of course, but I prefer research that shows what I or the researcher never thought of.
Red State Blue State is an example. Andrew and his colleagues didn’t begin the research behind the book intending to show what turned out to be the main point (that the red state/blue state difference is due to an interaction — the effect of wealth on tendency to vote Republican varies from state to state). I suspect they got the idea simply by making good graphs, which is an important way to get new ideas. (Neglect of graphics and neglect of idea generation go together.) Red State Blue State could be used in any class on scientific method to illustrate the incredibly important point that you can get new ideas from your data. There aren’t many possible examples.
If I were teaching scientific method, I’d assign a few chapters of Red State Blue State and then have a class discussion about how to explain the results. Not just the state-by-wealth interaction but also the fact (revealed by a scatterplot) that the United States is far more religious than other rich countries — an outlier. Then I’d say: The graphs in the book made you think new thoughts. Your own graphs can do that.
Humans evolved to see pictures (the real world is like pictures, only three-dimensional); they didn’t evolve to work with numbers, let alone advanced statistics. That’s why I think graphs are generally preferable. The problem, of course, is that you can’t produce a seven-dimensional scatterplot, whereas you can regress a variable on seven other variables.
LemmusLemmus, check out the splom function in R (it is in the lattice package). It produces scatterplot matrices: every variable plotted against every other variable.
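To give a rough idea of what a scatterplot matrix looks like, here is a minimal sketch in plain Python rather than R. It is not splom or any library's API: the function names (ascii_panel, scatterplot_matrix) and the toy numbers are made up purely for illustration. Each off-diagonal cell is a crude text scatterplot of one variable against another, and the diagonal cells hold the variable names. With seven variables the same layout would simply be a 7-by-7 grid of small plots.

```python
def ascii_panel(xs, ys, width=12, height=6):
    """Render one scatterplot as `height` strings, each `width` characters wide."""
    grid = [[" "] * width for _ in range(height)]
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    for x, y in zip(xs, ys):
        # Scale each point to a column and row index; constant variables go to 0.
        col = 0 if x_hi == x_lo else round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = 0 if y_hi == y_lo else round((y - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"  # flip rows so larger y is drawn higher
    return ["".join(r) for r in grid]


def scatterplot_matrix(data, width=12, height=6):
    """Return a k-by-k text grid; the cell in row i, column j plots
    variable j (horizontal axis) against variable i (vertical axis)."""
    names = list(data)
    lines = []
    for row_name in names:
        panels = []
        for col_name in names:
            if row_name == col_name:
                # Diagonal cell: just show the variable name.
                panel = [" " * width] * height
                panel[height // 2] = row_name[:width].center(width)
            else:
                panel = ascii_panel(data[col_name], data[row_name], width, height)
            panels.append(panel)
        for r in range(height):
            lines.append("|" + "|".join(p[r] for p in panels) + "|")
        lines.append("+" + "+".join("-" * width for _ in names) + "+")
    return "\n".join(lines)


# Toy numbers, invented for this illustration only.
data = {
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 4, 5, 4, 6, 7],
    "z": [9, 7, 8, 5, 4, 2],
}
print(scatterplot_matrix(data))
```

Reading across a row shows how one variable relates to each of the others in turn, which is what makes the layout useful when there are too many variables for a single plot.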
I’ve never used R. (Maybe I should?) Frankly, I can’t imagine what plotting every variable against every other variable would look like in a scatterplot. Anyway, thanks for the tip!