The Silent Spring of Marching Bands

Rachel Carson’s Silent Spring, about the damage done by pollution, “is widely credited with launching the environmental movement in the West” (Wikipedia). Along similar but narrower lines, last week’s USA Today had an article by Joyce Cohen about hearing damage caused by being in a marching band. It begins:

There’s no bigger booster of his marching band than Mark Claffey. “I am a total band nerd!” declares Claffey, a drummer for the Golden Falcons at Franklin Heights High School in Columbus, Ohio.

There’s just one downside. At age 17, he has painful ear damage.

He says that, after indoor rehearsals, his ears started hurting, then ringing.

Now, he’s abnormally sensitive to sound. If someone cranks the car radio, “I get a sharp shooting pain in my right ear,” says Claffey. . . .

It’s the dirty little secret of the halftime show: Marching band . . . can cause irreparable hearing damage, according to Brian Fligor, director of diagnostic audiology at Children’s Hospital in Boston.

The director of a professional group of music teachers claimed that knowledge of this problem is fairly new. That’s absurd, Joyce said. Stories about hearing problems among musicians have been published in medical and professional journals for at least two decades. Music teachers don’t acknowledge their own hearing problems, several experts told her, because doing so could endanger their livelihoods. Band parents, known for their fanaticism, were sometimes dismissive. They claimed that pain and ringing in the ears are normal.

The Indianapolis Star, published by Gannett, which also owns USA Today, reprinted the article. On the newspaper’s forums, readers started a debate about whether there should be laws to protect students’ hearing.

Memorial University Defends the Indefensible

In 1993, Marilyn Harvey, who at the time was Ranjit Chandra’s research assistant, came forward to say that a paper by Chandra reported research that didn’t happen. Memorial University conducted an investigation that failed to confirm her (very courageous) allegation. About that investigation, Ranjit Chandra’s Wikipedia entry says the following:

The vice-presidents were unable to secure the data, and, as a consequence, were unable to verify research fraud conclusively.

Huh? Harvey’s claim was that the data didn’t exist!

This sentence was written by Peter S. Morris, Director of Public Affairs at Memorial. I emailed Morris to try to find out how it could make sense. Presumably it made sense to Morris. Alas, Morris would not explain it. He did say that to prove research fraud — in Chandra’s case, the fraud of making up data — you need the data. You read that correctly: To prove that someone has made up data, you need to have the data, Morris asserted. He wouldn’t explain that, either.

Memorial’s behavior did great harm to Marilyn Harvey, as you can read in the complaint she filed in her lawsuit.

How Interesting is Good Calories, Bad Calories?

Very.

The single most striking result in the history of the cholesterol controversy . . . passed without comment by the authorities: those Framingham residents whose cholesterol declined over the first fourteen years of observation were more likely to die prematurely than those whose cholesterol remained the same or increased. They died of cardiovascular disease more frequently as well.

Around 1990, nineteen studies found that both women and men had higher total mortality at the lowest cholesterol levels (< 160 mg/dl). The increase came from more “cancer, respiratory and digestive diseases, and trauma.” From Gary Taubes’ fascinating new book Good Calories, Bad Calories.

I expect these results are corrected for income but I’m not sure. A friend of mine is very poor. “You have the cholesterol level of a Chinese peasant [i.e., very low],” his doctor once told him.

Interview with Taubes (October 9).

Exit Wounds

Tonight, by accident, I attended a talk by the Israeli artist Rutu Modan about her graphic novel Exit Wounds. I learned:

1. One day in 1914, Franz Kafka wrote in his diary, “Germany has declared war on Russia. Swimming lessons in the afternoon.”

2. Browsing at a flea market, she found an album of pictures of her dead father. Her family had given it away by mistake. When she told the seller about this, he raised the price from $4 to $150.

3. She wrote Exit Wounds in Hebrew but drew it in English. No kidding. She wrote the text in Hebrew and had it translated into English. The balloons where the words go were arranged to read from left to right. For a forthcoming Hebrew edition, she made mirror images of everything. There was just one problem: cars ended up on the wrong side of the road in 150 panels (the hero is a taxi driver). She has been forced to do some redrawing.

Interview with Andy Maul about Test Development (part 3)

5. What are you doing to develop a better test?

Not being a content expert in either emotions or intelligence myself, I have no plans at the moment to create a test of emotional intelligence. Instead, my goal is to explore and discuss, firstly, better ways of engaging in the iterative process of construct exploration and test development, and secondly, better methods of test analysis. These are, of course, interrelated.

The “classical” method of test construction in psychology goes like this: a) decide what you want to measure (formally or informally); b) write items to measure it; c) pilot those items; d) run basic statistical analyses, such as Cronbach’s alpha; e) remove the items that are least consistent with the rest of the test, thus improving the test’s reliability; and f) publish.
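For concreteness, here is a minimal sketch, in Python (my illustration, not anything from the interview), of steps d) and e): compute Cronbach’s alpha for a subjects-by-items score matrix and find the item whose removal would most improve it.

    import numpy as np

    def cronbach_alpha(scores):
        """Cronbach's alpha for an (n_subjects, n_items) score matrix."""
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]
        item_var = scores.var(axis=0, ddof=1).sum()  # sum of item variances
        total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
        return k / (k - 1) * (1 - item_var / total_var)

    def weakest_item(scores):
        """Step e): index of the item whose removal raises alpha the most."""
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]
        return int(np.argmax([cronbach_alpha(np.delete(scores, j, axis=1))
                              for j in range(k)]))

Repeating weakest_item until alpha stops improving is the item-dropping loop; note that nothing in it says *why* a dropped item misbehaved, which is the point made below.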

This process usually yields a reasonably reliable test. A problem with this approach is that nowhere did we allow the process of test construction to inform our theory development. Test construction can be as much a process of construct exploration as anything else, if we allow it. For instance, think-alouds and exit interviews can help us understand what subjects are actually thinking as they take the test, and whether the variation in the ways people approach the items truly reflects variation in the construct we think we’re measuring. The exercise of construct mapping can turn a murky idea of what we’re measuring into a much clearer one, by laying out a priori theories about what kinds of items measure what levels of the construct, and those ideas can then be empirically tested later, which makes the analysis phase much more informative than it traditionally is.

And, of course, I would be remiss if I didn’t mention the analysis itself: item response modeling often affords valuable information missed by classical analysis, such as information on person and item “fit” (which, to be interpretable, requires going back to the theory of the link between the items and the construct itself, which once again benefits from thoughtful construct mapping) and information about dimensionality at the item level (as opposed to the branch level, which is where confirmatory factor models—such as the ones used to investigate the structure of the emotional intelligence tests I’m working with—traditionally concentrate).
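To make “person and item fit” concrete, here is a toy sketch assuming a Rasch (one-parameter) model and fabricated 0/1 responses. It is my illustration, not Maul’s analysis: real work would use marginal maximum likelihood or dedicated IRT software rather than the joint fit below, and would fix the scale (the likelihood is unchanged if a constant is added to all abilities and difficulties).

    import numpy as np
    from scipy.optimize import minimize

    def rasch_neg_loglik(params, responses):
        """Negative log-likelihood of a Rasch (1PL) model.
        params: n_persons abilities followed by n_items difficulties."""
        n_persons, n_items = responses.shape
        theta = params[:n_persons]                 # person abilities
        b = params[n_persons:]                     # item difficulties
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
        eps = 1e-9                                 # guard against log(0)
        return -np.sum(responses * np.log(p + eps)
                       + (1 - responses) * np.log(1 - p + eps))

    rng = np.random.default_rng(0)                 # fabricated data
    responses = rng.binomial(1, 0.5, size=(50, 10)).astype(float)
    fit = minimize(rasch_neg_loglik, np.zeros(60), args=(responses,),
                   method="L-BFGS-B")
    theta_hat, b_hat = fit.x[:50], fit.x[50:]

    # Item "outfit": mean squared standardized residual per item;
    # values far from 1 flag items that misfit the model.
    p = 1.0 / (1.0 + np.exp(-(theta_hat[:, None] - b_hat[None, :])))
    outfit = ((responses - p) ** 2 / (p * (1 - p))).mean(axis=0)

Interpreting which items misfit, and why, is exactly where the construct map comes back in.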

Doing things in this manner usually takes more than one iteration, which is one reason people might not like it. So far, the MSCEIT has been developed and evaluated, and the test developers have spent a good deal of time debating other authors in the literature concerning the value of the test, but the analyses have not yet led to test revision (except in the manner I described above: that items with poor reliability were dropped, without any particular theory about *why* they were unreliable).

So, in other words: I won’t claim enough substantive expertise to write a new test of emotional intelligence on my own, but I would like to use the measurement efforts in this field as a way to discuss construct exploration and instrument development in psychological research.

Part 1. Part 2.

Reference

Wilson, M. (2005). Constructing Measures: An Item Response Modeling Approach. Mahwah, NJ: Lawrence Erlbaum Associates.

The Apparent Spread of SLD

The number of visitors to the Shangri-La Diet forums has been growing. This graph shows, for each day, the maximum number of people accessing the forums at one time. (When you load a page, I guess you are considered “at” the forums for some length of time.)

[Graph: most online by day]

“Most online” has steadily increased since January. These values are closely correlated with the number of visitors in a day, for which I have less data.

Here is another way to look at the most-online data. Each most-online value is divided by the value from one week earlier.

[Graph: rate of change of most online]

Perhaps the rate of increase is itself increasing, but it isn’t clear.
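The ratio plotted above is simple to compute; here is a minimal sketch, assuming the daily most-online values are in a plain list (the example numbers are mine, not the forum data).

    import numpy as np

    def week_over_week(most_online):
        """Each day's most-online count divided by the count 7 days earlier."""
        x = np.asarray(most_online, dtype=float)
        return x[7:] / x[:-7]

    # Ratios above 1 mean growth; a rising sequence of ratios would mean
    # the growth rate itself is increasing.
    print(week_over_week([10, 11, 12, 13, 14, 15, 16, 18, 20]))  # [1.8  1.818...]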

Will it live?

Marilyn Harvey’s Complaint

In 1993, Marilyn Harvey, a nurse, complained to Memorial University that her boss, world-famous Order-of-Canada scientist Dr. Ranjit Chandra, could not have done the research he claimed to have done. A very courageous thing to do. After an investigation, Memorial did not agree. Harvey recently sued Memorial. From her complaint:

The Plaintiff [Harvey] says that the Defendant [Memorial] defamed her by taking actions which . . . caused her to be isolated, shunned, and humiliated through the following:
(a) representing to the community that her complaint was unjustified;
(b) misconducting the investigation of the complaint;
(c) misleading the research community as to the reasons for discontinuing the investigation;
(d) choosing not to conduct another investigation;
(e) misleading the Plaintiff as to the reasons for discontinuing the investigation;
(f) acquiescing in and adopting [?] the actions of Dr. Chandra when he sued her for theft of research data; and by its conduct giving the Plaintiff and the public the impression that it believed the allegations of theft to be true;
(g) treating the Plaintiff in a manner as to imply to her and the university and the healthcare communities, and the public, that her complaint was unjustified;
(h) acquiescing in and adopting statements of Dr. Chandra which impugned the Plaintiff’s motives and integrity.
The overall effect of the conduct of the Defendant was to constitute a communication to the community, and to the research and hospital community in particular, that was profoundly defamatory. . . It expose[d] her to contempt, ridicule and marginalization and [caused her] to be viewed by co-workers as a troublemaker and a pariah who could have a detrimental effect on one’s career if she were not avoided.

Memorial’s defense.

Interview with Andy Maul about Test Development (part 2)

4. You write “There are multiple problems with the validity of existing EI tests that make them difficult to interpret, and make claims based on them highly suspect.” What are the main problems?

Many early tests designed to measure emotional intelligence, and some still in use today (including one developed in part by the journalist Daniel Goleman, who popularized the term emotional intelligence in his 1995 book), used self-report methods and treated the construct as a conglomerate of various desirable personality and motivational factors, such as optimism, conscientiousness, happiness, and friendliness. Tests of this nature may be interesting and may predict important outcomes, but emotional intelligence, defined in this way, is really just a repackaging of old ideas. These tests are so highly correlated with traditional measures of personality as to be operationally indistinct from them. Additionally, calling this construct emotional *intelligence* is suspect: personality and intelligence are generally regarded as very different things, and assessing intelligence through self-report is generally considered inadequate.

The MEIS and the MSCEIT, to which I referred earlier, are, to my knowledge, the only two currently published tests that assess emotional intelligence as an intelligence. These tests ask respondents to engage in a variety of tasks, such as looking at pictures of people’s faces and reading stories about human interactions, and then make judgments about the emotional content of those stimuli. These tests are a step in the right direction, but have their own problems.

The test developers score responses to the stimuli in a rather odd way. The tests were administered to a large (N = 2000+) standardization sample, and the score a respondent now receives on an item is the percentage of that sample who chose the same alternative. In other words, if you select choice “c” on an item, and 67% of people from the standardization sample also chose “c”, then you get a .67 for that item. If you chose “d”, and only 11% of people chose “d”, then you only get .11 for that item. Your total score is simply the sum of the weighted scores from each item.
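To pin down the arithmetic, here is a minimal sketch of consensus scoring in Python; the function names and toy data are mine, and this is not the MSCEIT’s actual implementation.

    from collections import Counter

    def consensus_weights(sample_responses):
        """Per item, the proportion of the standardization sample
        choosing each alternative, e.g. {'c': 0.67, 'd': 0.11}."""
        n = len(sample_responses)
        n_items = len(sample_responses[0])
        return [{alt: count / n
                 for alt, count in Counter(r[j] for r in sample_responses).items()}
                for j in range(n_items)]

    def consensus_score(responses, weights):
        """Total score: sum of the sample proportions of the chosen alternatives."""
        return sum(weights[j].get(choice, 0.0)
                   for j, choice in enumerate(responses))

    # Toy example: three "standardization" respondents, two items.
    weights = consensus_weights([('c', 'a'), ('c', 'b'), ('d', 'a')])
    print(consensus_score(('c', 'a'), weights))  # 2/3 + 2/3 = 1.33...

Note that nothing in consensus_weights refers to a right answer, only to popularity, which is exactly the difficulty described next.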

As odd as this method of scoring may sound, it has been used in other situations where the underlying theory is not well understood (see Legree, below, for an exposition of this). However, it presents difficulties here: it defines correctness as, essentially, conformity of opinion with the standardization sample. In other words, what is actually being measured may not be “intelligence” but rather, simply, normality or popularity of opinion: the highest-scoring respondents will simply be those who most consistently choose the responses most other people also select. Additionally, this prohibits the existence of items so difficult that most people get them wrong, such as a very subtle facial expression that only the most emotionally astute could correctly parse: if there were any such items, the astute minority would be penalized for choosing the less-popular but more-correct alternative. This is a serious challenge to the construct validity of the test.

Additionally, the internal structure of the test itself is suspect. The test developers posit a four-factor model of emotional intelligence (the four factors being the ability to perceive emotions, the ability to allow emotions to facilitate thought, understanding emotions, and managing emotions) and have branches of the tests designed to measure all four of those factors. They have published confirmatory factor analyses that they claim support their theory; however, re-analyses of their tests, and new analyses (including one that I am conducting now) have not been able to replicate their results, calling into question the internal validity and reliability of the tests.
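For readers unfamiliar with branch-level confirmatory factor analysis, here is a sketch of how such a four-factor model might be specified. Everything in it is assumption: the column names are hypothetical stand-ins for the eight MSCEIT task scores, the data file is invented, and semopy is just one Python option (lavaan in R is more common). It is not the developers’ or Maul’s actual analysis.

    import pandas as pd
    from semopy import Model  # lavaan-style model syntax; assumed installed

    # Hypothetical task-score columns grouped under the four posited factors.
    FOUR_FACTOR_MODEL = """
    perceiving    =~ faces + pictures
    facilitating  =~ sensations + facilitation
    understanding =~ changes + blends
    managing      =~ emotion_management + emotion_relations
    """

    data = pd.read_csv("msceit_task_scores.csv")  # hypothetical data file
    model = Model(FOUR_FACTOR_MODEL)
    model.fit(data)
    print(model.inspect())  # loadings; poor fit here is what re-analyses report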

In my dissertation, which I can make available early next spring, I discuss all these points in greater detail.

Part 1.

References

Legree, P. J., Psotka, J., Tremble, T., & Bourne, D. R. (2005). Using consensus based measurement to assess emotional intelligence. In R. Schulze & R. D. Roberts (Eds.), Emotional intelligence: An international handbook (pp. 155–179). Cambridge, MA: Hogrefe & Huber.

MacCann, C., Matthews, G., Zeidner, M., & Roberts, R. D. (2003). Psychological assessment of emotional intelligence: A review of self-report and performance-based testing. International Journal of Organizational Analysis, 11, 247–274.

Mayer, J. D., Salovey, P., & Caruso, D. R. (2004). Emotional intelligence: Theory, findings, and implications. Psychological Inquiry, 15, 197–215.

Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2001). Emotional intelligence as a standard intelligence. Emotion, 1, 232–242.

McCrae, R. R. (2000). Emotional intelligence from the perspective of the five-factor model of personality. In R. Bar-On & J. D. A. Parker (Eds.), Handbook of emotional intelligence (pp. 92–117). San Francisco, CA: Jossey-Bass.

O’Sullivan, M. (2005). Trolling for trout, trawling for tuna: The methodological morass in measuring emotional intelligence. In press.

Roberts, R., Schulze, R., O’Brien, K., Reid, J., MacCann, C., & Maul, A. (2006). Exploring the validity of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) with established emotions measures. Emotion, 6(4), 663–669.

Roberts, R. D., Zeidner, M., & Matthews, G. (2001). Does emotional intelligence meet traditional standards for an intelligence? Some new data and conclusions. Emotion, 1, 196–231.

Omega-3 and Plagiarism

The news page of Linköping University, in Sweden, has two articles that greatly interest me. One is about a surprising effect of omega-3 supplements:

One-year-olds whose mothers had ingested fish oil during pregnancy and breastfeeding had considerably fewer allergic reactions than children whose mothers did not take this supplement.

The other is about a case of extreme plagiarism: An entire material-science paper was copied, almost word for word, from PNAS. Into Madness has a nice comment:

Regarding the main authors, there seems to be a Nepali element involved! Sounds like a case for Father Brown. . . . Some Engineering students at Anna University [where two of the four authors of the paper that is a copy came from] who I talked to were not aware of this until they read the blogs. There have been no newspaper reports in India (as far as I know). How and when Anna University will react to this incident will be interesting to watch.

I agree. In the 1990s, when (a) Ranjit Chandra’s research assistant came forward and said “this research couldn’t have been done” and (b) Chandra could not produce the data, it was obvious that something was seriously wrong. Yet Memorial University, Chandra’s employer, gave Chandra a tap on the wrist.

A curious feature of this case is that two co-authors claim they are innocent:

Tom Mathews, doctor at the Indira Gandhi center for nuclear research in India and one of the four researchers named as authors, distances himself from the article in an email to DN [= Swedish newspaper]. So does Roshan Bokalawela, graduate student at the University of Oklahoma in the USA.

Interview with Andy Maul about Test Development (part 1)

Andy Maul, who took introductory psychology with me, is a graduate student in Educational Psychology at UC Berkeley.

1. What is your research about?

I’m taking a closer look at tests recently developed to measure the construct of emotional intelligence (EI). In particular, I’m looking at the Multifactor Emotional Intelligence Scale (MEIS) and the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), which were both developed in the past decade and evaluated using traditional methods (confirmatory factor analysis [CFA] and classical test statistics such as alpha coefficients, along with correlations with other tests and hypothesized outcomes). I’m looking at these tests again, both through the traditional lens of CFA and through the newer lens of Item Response Theory (IRT). In the end, I hope to make points both for the development of EI tests and for psychological measurement in general, by highlighting how newer methods can improve the construct- and test-building process.

2. How did you get interested in this line of research?

I became interested in emotions by working with Professor Dacher Keltner. At some point in graduate school my interests shifted to the more quantitative side of research, and I’ve since been working with Professor Mark Wilson on test theory and statistical measurement. I thought combining the two interests, by evaluating tests of emotional intelligence through a quantitative lens, would be a good idea.

3. What’s an example of research that shows the value of measuring emotional intelligence?

The MSCEIT appears to predict some life outcomes (such as grades, prosocial behavior, and self-reported life satisfaction), even controlling for IQ and personality. Other researchers have challenged these claims as being premature and based on insufficient evidence. There are multiple problems with the validity of existing EI tests that make them difficult to interpret, and make claims based on them highly suspect.

Some researchers feel that defining and measuring emotional intelligence could clarify and expand our definitions of intelligence and cognitive abilities in general, and provide information about an area of human functioning that could predict important personal and interpersonal outcomes (such as life satisfaction and the quality of one’s relationships) above and beyond traditionally-measured intelligence and personality. In today’s era of high-stakes testing, with so much riding on what many feel to be tests with limited utility, a new, well-validated test of emotional intelligence could provide insight into what makes students successful in schools and in life.

References

Mayer, J., Salovey, P., & Caruso, D. (2002). Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT): User’s manual. Toronto, Canada: Multi-Health Systems.

Mayer, J., Salovey, P., Caruso, D., & Sitarenios, G. (2003). Measuring emotional intelligence with the MSCEIT V2.0. Emotion, 3, 97–105.

O’Sullivan, M. (2005). Trolling for trout, trawling for tuna: The methodological morass in measuring emotional intelligence. In press.

Palmer, B., Gignac, G., Manocha, R., & Stough, C. (2005). A psychometric evaluation of the Mayer-Salovey-Caruso Emotional Intelligence Test Version 2.0. Intelligence, 33, 285–305.

Roberts, R. D., Schulze, R., Zeidner, M., & Matthews, G. (2005). Understanding, measuring, and applying emotional intelligence: What have we learned? What have we missed? In R. Schulze & R. D. Roberts (Eds.), Emotional intelligence: An international handbook (pp. 311–341). Cambridge, MA: Hogrefe & Huber.