Influential Statisticians

This article (“Ten statisticians and their impacts for psychologists”) impressed me. It’s a lot more accessible and basic than the usual academic article. However, my own list, of the statisticians who’ve had the biggest effect on how I analyze data, is quite different from the author’s. From most to least influential:

1. John Tukey. From Exploratory Data Analysis I learned to plot my data and to transform it. A Berkeley statistics professor once told me this book wasn’t important!

2. John Chambers. Main person behind S. I use R (open-source S) all the time.

3. Ross Ihaka and Robert Gentleman. Originators of R. R is much better than S: fewer bugs, more commands, better price.

4. William Cleveland. Inventor of loess (local regression). I use loess all the time to summarize scatterplots.

5. Ronald Fisher. I do ANOVAs.

6. William Gosset. I do t tests.

My data analysis is 90% graphs, 10% numerical summaries (e.g., means) and statistical tests (e.g., ANOVA). Most statistics texts, by contrast, are about 1% graphs and 99% numerical summaries and statistical tests.
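For readers who haven’t met loess, the core idea behind item 4 can be sketched in a few lines of Python. This is a simplified illustration only: it fits a weighted straight line around each point using tricube weights, and it leaves out the robustness iterations of Cleveland’s full method. It is not R’s loess function. The names `tricube`, `local_linear_smooth`, and `frac` are hypothetical, chosen for this example, and the data at the bottom are made up.

```python
def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 inside (-1, 1), zero outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_smooth(xs, ys, frac=0.5):
    """Return one smoothed y per x, from weighted local straight-line fits.

    frac (a hypothetical parameter name) is the fraction of the points
    used as neighbours for each local fit.
    """
    n = len(xs)
    k = min(n, max(2, round(frac * n)))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1]
        if h > 0:
            ws = [tricube((x - x0) / h) for x in xs]
        else:
            # The k nearest points all share x0's x value; weight only those.
            ws = [1.0 if x == x0 else 0.0 for x in xs]
        # Weighted least squares for y = a + b * (x - x0); the fit at x0 is a.
        sw = sum(ws)
        swx = sum(w * (x - x0) for w, x in zip(ws, xs))
        swy = sum(w * y for w, y in zip(ws, ys))
        swxx = sum(w * (x - x0) ** 2 for w, x in zip(ws, xs))
        swxy = sum(w * (x - x0) * y for w, x, y in zip(ws, xs, ys))
        denom = sw * swxx - swx ** 2
        if abs(denom) < 1e-12:
            # Too few distinct x values for a line: use the weighted mean.
            fitted.append(swy / sw)
        else:
            b = (sw * swxy - swx * swy) / denom
            fitted.append((swy - b * swx) / sw)
    return fitted


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
smooth = local_linear_smooth(xs, ys, frac=0.6)
```

Plotting `smooth` against `xs` on top of the raw points gives the kind of summary curve I mean. A smaller `frac` follows the data more closely; a larger one gives a smoother curve.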

9 thoughts on “Influential Statisticians”

  1. I agree with your list, Seth. Actually, because of you I bought a copy of Tukey’s EDA and am working through it one chapter at a time. I finished chapter 4 last night. On to Chapter 5 tonight. It’s great stuff!

    I’ve always been a fan of plotting data to understand it visually before doing any other summary stats, and definitely before applying any inferential stats. You’re right that psychology (maybe other science disciplines as well?) has really lost touch with this and overemphasizes inferential stats.

    I’d add a purely graphing book to your list: Edward Tufte’s The Visual Display of Quantitative Information.

  2. Thanks, Aaron, glad to hear it. I was told by an editor that Exploratory Data Analysis was published only so that the same company could publish another book (with Mosteller) that came out at the same time.

  3. I’ve used S-Plus and R for many years, but now work almost exclusively with R. I’ve filed many more bug reports for R than for S-Plus, so I find your suggestion that R has fewer bugs to be surprising. Also, from my own experience, almost every semi-annual release of R has major changes that break my existing code. S-Plus was much more stable.

    YMMV.

    Kevin

  4. BC, Canada had a high school Probability & Statistics grade 12 course in the 80s. I used Tukey’s EDA and Tufte’s books with considerable success. At its peak the course attracted as many students (especially girls) as the standard pre-university Algebra 12 course. 20 years later we hired a former student as my vice-principal, who says that course was the best in his high school career. Then came the qualitative literacy movement (good) and the BC Ministry of Education mysteriously dropped the course (bad) to minimal protest (sad). Now there are only piddling, disconnected tidbits of P & S in the K-12 curriculum. Bring back the good old days!

  5. David, thanks for the recommendation, the book sounds really interesting. Since a couple people have mentioned Tufte, I will say that I have learned nothing from his books. I find them annoying, starting with such titles as “Envisioning Information”. Tufte is, however, a genius entrepreneur.

  6. Great post, Seth. To add to the mix, in my own work I would add Peter Bentler for his contributions to structural equation modeling, and I found Reuben Baron and David Kenny’s papers on mediation vs. moderation very insightful.

    As far as statistician-educators go, Tabachnick and Fidell’s Using Multivariate Statistics, although thick and intimidating at first glance, is superbly written.

    I wonder if there is a stats/methodology book on self-experimentation?
