Interview with Andy Maul about Test Development (part 3)

5. What are you doing to develop a better test?

Not being a content expert in either emotions or intelligence myself, I have no plans at the moment to create a test of emotional intelligence. Instead, my goal is to explore and discuss, firstly, better ways of engaging in the iterative process of construct exploration and test development, and secondly, better methods of test analysis. These are, of course, interrelated.

The “classical” method of test construction in psychology goes like this: a) decide what you want to measure (formally or informally); b) write items to measure it; c) pilot those items; d) run basic statistical analyses, such as Cronbach’s alpha; e) remove the items that are least consistent with the other items, thus improving the reliability of the test; and f) publish.
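As a rough illustration of steps d) and e), here is a minimal Python sketch of Cronbach's alpha and the "alpha if item deleted" statistic that typically drives item removal. The function names and the tiny score matrix are hypothetical, made up for this example; they are not taken from the MSCEIT or any real data set.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item dropped in turn (one value per item)."""
    k = len(scores[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != drop] for row in scores])
        for drop in range(k)
    ]


# Hypothetical pilot data: 5 respondents x 4 items.
# Items 1-3 move together; item 4 is unrelated noise.
pilot = [
    [1, 1, 1, 3],
    [2, 2, 2, 1],
    [3, 3, 3, 2],
    [4, 4, 4, 1],
    [5, 5, 5, 3],
]

overall = cronbach_alpha(pilot)
if_deleted = alpha_if_item_deleted(pilot)
# The item whose removal raises alpha the most is the one step e) would drop.
worst_item = max(range(len(if_deleted)), key=lambda j: if_deleted[j])
print(f"alpha = {overall:.3f}; dropping item {worst_item + 1} gives {if_deleted[worst_item]:.3f}")
```

Note what the procedure does not tell us: it flags the fourth item as inconsistent with the rest, but it offers no account of why that item behaves differently.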

This process usually yields a reasonably reliable test. The problem with this approach is that at no point does the process of test construction inform theory development. Test construction can be as much a process of construct exploration as anything else, if we allow it. For instance, think-alouds and exit interviews can help us understand what subjects are actually thinking as they take the test, and whether the variation in the ways people approach the items truly reflects variation in the construct we think we’re measuring. The exercise of construct mapping can turn a murky idea of what we’re measuring into a much clearer one, by laying out a priori theories about what kinds of items measure what levels of the construct. Those ideas can then be tested empirically, which makes the analysis phase much more informative than it traditionally is.

And, of course, I would be remiss if I didn’t mention the analysis itself. Item response modeling often affords valuable information missed by classical analysis. One example is information on person and item “fit” (which, to be interpretable, requires going back to the theory of the link between the items and the construct itself, which once again benefits from thoughtful construct mapping). Another is information about dimensionality at the item level, as opposed to the branch level, which is where confirmatory factor models (such as the ones used to investigate the structure of the emotional intelligence tests I’m working with) traditionally concentrate.
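To make the idea of item "fit" concrete, the sketch below computes one common fit statistic, the outfit mean square, under a dichotomous Rasch model. This is only one of several item response models and fit indices, and the interview does not say which ones were used. The sketch assumes person abilities and item difficulties have already been estimated; the values shown are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Rasch model: probability of a correct response given ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def outfit_msq(responses, thetas, difficulties):
    """Outfit mean square per item: the mean squared standardized residual over persons.

    responses[i][j] is 0 or 1 for person i on item j.
    Values near 1 are consistent with the model; much larger values
    signal responses the model did not expect.
    """
    result = []
    for j, b in enumerate(difficulties):
        squared_residuals = []
        for row, theta in zip(responses, thetas):
            p = rasch_prob(theta, b)
            # Squared standardized residual: (observed - expected)^2 / variance.
            squared_residuals.append((row[j] - p) ** 2 / (p * (1 - p)))
        result.append(sum(squared_residuals) / len(squared_residuals))
    return result


# Hypothetical estimates: one able person, one less able person, two items of equal difficulty.
thetas = [2.0, -2.0]
difficulties = [0.0, 0.0]
responses = [
    [1, 0],  # able person: passes item 1, unexpectedly fails item 2
    [0, 1],  # less able person: fails item 1, unexpectedly passes item 2
]

fit = outfit_msq(responses, thetas, difficulties)
print([round(f, 3) for f in fit])
```

In this toy case the second item gets a large outfit value because both responses to it run against the model's expectations. The statistic flags the item; explaining the misfit still requires going back to the theory linking the item to the construct.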

Doing things in this manner usually takes more than one iteration, which is one reason people might not like it. So far, the MSCEIT has been developed and evaluated, and the test developers have spent a good deal of time debating other authors in the literature concerning the value of the test, but the analyses have not yet led to test revision (except in the manner I described above: that items with poor reliability were dropped, without any particular theory about *why* they were unreliable).

So, in other words: I won’t claim to be enough of a subject-matter expert to write a new test of emotional intelligence on my own, but I would like to use the measurement efforts in this field as a way to discuss construct exploration and instrument development in psychological research.


Reference

Wilson, M. (2005). Constructing Measures: An Item Response Modeling Approach. Mahwah, NJ: Lawrence Erlbaum Associates.
