Interview with Andy Maul about Test Development (part 2)

4. You write “There are multiple problems with the validity of existing EI tests that make them difficult to interpret, and make claims based on them highly suspect.” What are the main problems?

Many early tests designed to measure emotional intelligence, and some still in use today (including one developed in part by the journalist Daniel Goleman, who popularized the term emotional intelligence in his 1995 book), used self-report methods and treated the construct as a conglomerate of various desirable personality and motivational factors, such as optimism, conscientiousness, happiness, and friendliness. Tests of this nature may be interesting and may predict important outcomes, but emotional intelligence, defined in this way, is really just a repackaging of old ideas. These tests are so highly correlated with traditional measures of personality as to be operationally indistinct from them. Additionally, calling this construct emotional *intelligence* is suspect: personality and intelligence are generally regarded as very different things, and assessing intelligence through self-report is generally considered inadequate.

The MEIS and the MSCEIT, to which I referred earlier, are, to my knowledge, the only two currently published tests that assess emotional intelligence as an intelligence. These tests ask respondents to engage in a variety of tasks, such as looking at pictures of people’s faces and reading stories about human interactions, and then make judgments about the emotional content of those stimuli. These tests are a step in the right direction, but have their own problems.

The test developers have a rather odd way of scoring the responses people give to the stimuli on the tests. The tests were administered to a large (N=2000+) standardization sample, and the score people are now given on each item is the proportion of people from that standardization sample who chose the same alternative. In other words, if you select choice “c” on an item, and 67% of people from the standardization sample also chose “c”, then you get a .67 for that item. If you chose “d”, and only 11% of people chose “d”, then you only get .11 for that item. Your total score is simply the sum of the weighted scores from each item.
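To make the arithmetic concrete, here is a minimal Python sketch of proportion-based consensus scoring as described above. The function names and the toy standardization sample are hypothetical; they are not taken from the MEIS or MSCEIT materials, and the published scoring may differ in its details.

```python
from collections import Counter


def consensus_weights(standardization_responses):
    """For each item, map every answer choice to the proportion of the
    standardization sample that selected it.

    standardization_responses: one list of choices per person, all the
    same length (one choice per item).
    """
    n_items = len(standardization_responses[0])
    weights = []
    for item in range(n_items):
        counts = Counter(person[item] for person in standardization_responses)
        total = sum(counts.values())
        weights.append({choice: n / total for choice, n in counts.items()})
    return weights


def consensus_score(responses, weights):
    """Sum, over items, the proportion of the sample that chose the same
    alternative as this respondent (0.0 if nobody in the sample chose it)."""
    return sum(w.get(choice, 0.0) for choice, w in zip(responses, weights))


# Hypothetical one-item sample: 67 people chose "c", 11 chose "d", 22 chose "a".
sample = [["c"]] * 67 + [["d"]] * 11 + [["a"]] * 22
weights = consensus_weights(sample)

score_c = consensus_score(["c"], weights)  # credit for the popular choice
score_d = consensus_score(["d"], weights)  # credit for the unpopular choice
```

In this toy sample, choosing “c” earns 0.67 and choosing “d” earns 0.11, whichever of the two is actually the better reading of the stimulus.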

As odd as this method of scoring may sound, it has been used in other situations where the underlying theory is not well understood (see Legree, below, for an exposition of this). However, it presents difficulties here: it defines correctness as, essentially, conformity of opinion with the standardization sample. In other words, what is actually being measured may not be “intelligence”, but rather, simply, normality or popularity of opinion: the highest-scoring respondents will simply be those who most consistently choose the responses most other people also select. Additionally, this prohibits the existence of items so difficult that most people get them wrong, such as a very subtle facial expression that only the most emotionally astute could correctly parse: if there were any such items, the astute minority would be penalized for choosing the less-popular but more-correct alternative. This is a serious challenge to the construct validity of the test.

Additionally, the internal structure of the test itself is suspect. The test developers posit a four-factor model of emotional intelligence (the four factors being the ability to perceive emotions, the ability to allow emotions to facilitate thought, understanding emotions, and managing emotions) and have branches of the tests designed to measure all four of those factors. They have published confirmatory factor analyses that they claim support their theory; however, re-analyses of their data and new analyses (including one that I am conducting now) have not been able to replicate their results, calling into question the internal validity and reliability of the tests.

In my dissertation, which I can make available early next spring, I discuss all these points in greater detail.

Part 1.

References

Legree, P. J., Psotka, J., Tremble, T., & Bourne, D. R. (2005). Using consensus based measurement to assess emotional intelligence. In R. Schulze & R. D. Roberts (Eds.), Emotional intelligence: An international handbook (pp. 155–179). Cambridge, MA: Hogrefe & Huber.

MacCann, C., Matthews, G., Zeidner, M., & Roberts, R. D. (2003). Psychological assessment of emotional intelligence: A review of self-report and performance-based testing. International Journal of Organizational Analysis, 11, 247–274.

Mayer, J. D., Salovey, P., & Caruso, D. R. (2004). Emotional intelligence: Theory, findings, and implications. Psychological Inquiry, 15(3), 197–215.

Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2001). Emotional intelligence as a standard intelligence. Emotion, 1, 232–242.

McCrae, R. R. (2000). Emotional intelligence from the perspective of the five-factor model of personality. In R. Bar-On & J. D. A. Parker (Eds.), Handbook of emotional intelligence (pp. 92–117). San Francisco, CA: Jossey-Bass.

O’Sullivan, M. (2005). Trolling for trout, trawling for tuna: The methodological morass in measuring emotional intelligence. In press.

Roberts, R., Schulze, R., O’Brien, K., Reid, J., MacCann, C., & Maul, A. (2006). Exploring the validity of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) with established emotions measures. Emotion, 6(4), 663–669.

Roberts, R. D., Zeidner, M., & Matthews, G. (2001). Does emotional intelligence meet traditional standards for an intelligence? Some new data and conclusions. Emotion, 1, 196–231.

One thought on “Interview with Andy Maul about Test Development (part 2)”

michael vassar says:

October 15, 2007 at 12:00 am

“Additionally, this prohibits the existence of items so difficult that most people get them wrong, such as a very subtle facial expression that only the most emotionally astute could correctly”

The solution to this is recursive scoring. After the first round of allocating points to answers, take people’s total scores and then apply these as multipliers to the answers they select, then sum up the scores again. Repeat a few times if desired.
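One possible reading of this proposal, sketched in Python: each respondent starts with a weight of 1, each answer's credit is the weight-share of the respondents who chose it, and each respondent's new total becomes their weight in the next round. The function name, the `rounds` parameter, and the normalization by total weight are assumptions made for illustration; the comment does not specify them.

```python
def recursive_consensus_scores(responses, rounds=3):
    """Re-weighted consensus scoring (a hypothetical interpretation).

    responses: one list of choices per respondent, all the same length.
    rounds: number of re-weighting passes after the initial pass, so
            rounds=0 reproduces plain proportion-based consensus scoring.
    Returns one total score per respondent.
    """
    n_items = len(responses[0])
    person_weights = [1.0] * len(responses)  # everyone counts equally at first
    scores = list(person_weights)
    for _ in range(rounds + 1):
        total_weight = sum(person_weights)
        # Credit for an answer = share of total respondent weight behind it.
        item_weights = []
        for item in range(n_items):
            credit = {}
            for weight, answers in zip(person_weights, responses):
                choice = answers[item]
                credit[choice] = credit.get(choice, 0.0) + weight / total_weight
            item_weights.append(credit)
        # Each respondent's total is the sum of the credits for their choices.
        scores = [
            sum(item_weights[item][answers[item]] for item in range(n_items))
            for answers in responses
        ]
        # Totals from this round become the multipliers for the next round.
        person_weights = scores
    return scores


# Toy data: three respondents, one item.
toy = [["a"], ["a"], ["b"]]
plain = recursive_consensus_scores(toy, rounds=0)
once = recursive_consensus_scores(toy, rounds=1)
```

On this one-item toy data the extra round moves the scores from roughly 0.67, 0.67, 0.33 to 0.8, 0.8, 0.2, so re-weighting only reinforces the majority. Whether it can favor a minority answer presumably depends on that minority agreeing with the consensus on enough other items to earn high weights.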

Seth Robert's Blog Mirror

Personal Science, Self-Experimentation, Scientific Method
