On multiple choice tests (part I)

When I started teaching, in 1996, I never imagined that I would some day make use of multiple choice tests (MCT). To me, multiple choice would be a tool to conduct surveys, possibly suitable to test one’s proficiency with road signs and traffic rules, or other subjects requiring simple memorization — surely not a person’s mastery of a subject like physics.

Why ? Well, come on, isn’t it obvious ?
How would I ever be able to assess knowledge of basic concepts and their comprehension, logic, reasoning, computational methodology, if all the student has to do is check the right box, thereby picking one out of a few, pre-packaged answers ? And how would I know if that choice was arrived at by competence, or sheer luck ? How could I assign zero credit for accidentally checking the wrong box despite working out the problem correctly ?
Clearly, the correct way to test a physics student would be to have them write solutions to problems. That is what I thought back then.

As an assistant professor here, I started teaching large freshman physics sections, with enrolment in the hundreds and no help to grade papers. I had no choice but to adopt MCTs, which I did begrudgingly. The past umpteen years of teaching, making use of MCTs, have radically changed my opinion on this subject. I do not just think that equally effective multiple-choice tests can replace problem-based exams, probably at all levels of instruction; I happen to believe that for specific courses, such as freshman physics, MCTs are far superior for a number of different reasons (ease of grading being only one of them). In fact, plausible and sensible as the above remarks on MCTs may seem at first, they are all flawed, and in any case problem-based exams do not satisfactorily address those shortcomings, and arguably suffer from even more serious ones [0].

I am dividing this post into two parts. In the first part I shall be discussing the supposed merits of problem-based tests, and their most important limitations. I am using physics as the specific subject, in order to keep the discussion concrete and because it is the one that I happen to teach. However, I do believe that many of the remarks apply quite generally, even outside science.

Problems and solutions
A physics problem-based test consists of students writing down solutions to assigned problems, each formulated as a question, in a given allotted time frame. What is tested is one’s ability to place the question in the right context, make use of the appropriate quantities and relevant equations, work out the algebra that is needed, and arrive at an answer, generally the numerical value of a physical quantity (expressed in the right units).
Well, so far so good, right ? How could anyone object to that ? The problem is that, while the principle is surely commendable, its actual implementation proves quite tricky, if one aims at devising a testing scheme that is effective, fair, and most importantly conducive to learning.

The difficulty with evaluating someone’s work is precisely that one is looking at a relatively complex solution, including wording as well as algebra, in different proportions depending on the individual student. No two solutions are identical, which in turn means that there is a significant degree of subjectivity in the evaluation, even in a field as seemingly cut-and-dried as physics.
For one thing, even though the test does not aim at probing one’s writing skills, those skills have a serious impact on the outcome. Everything else being the same, students with an inferior writing ability are put at a disadvantage. Secondly, instructors looking at a student’s approach to the problem will inevitably compare it to what constitutes in their mind the correct way, and oftentimes grade accordingly. A student who may have obtained the right answer in the end, but whose procedure is regarded as not fully convincing, or in the eye of the instructor as lacking important details (read: not identical to that which the instructor illustrated in class), will typically receive only partial, not full credit (e.g., 8 out of 10 possible points for a single question).
In other words, the same test can be given different grades by different instructors, something which many of us regard as undesirable because it is potentially unfair to students. But that is not the main drawback of problem-based testing, serious as it undoubtedly is.

The “partial credit” quandary
What is expected of a student is more elaborate than a mere numerical answer, and points may be taken off if the solution is deemed incomplete or partly inaccurate. By the same token, it is also broadly accepted that some partial credit ought to be assigned in those cases where the student did not arrive at the final answer, but worked out a few intermediate steps. The rationale is that this shows at least some minimal knowledge or understanding of the subject. In other words, anything is deemed preferable to the student returning a blank sheet. Again, this seems hardly debatable.
Unfortunately, I am becoming convinced that subjectivity in grading and partial credit [1] contribute to furnishing a perverse incentive to students, in turn promoting a shallow, content-free and ultimately flawed approach to science and learning on their part.

Anything will do…
What am I increasingly observing, specifically, that makes me think that way ?
We all have been students. We know how hard it is, especially the freshman year. Students are under tremendous pressure and punitive workloads. They register for five classes a term, have to complete weekly homework assignments for all of them, and prepare for five midterms and final exams at the same time. Trying to cut corners is human.
A significant fraction of students, these days, see partial credit as a viable strategy to pass, with the least amount of work, courses on which they do not plan on spending the time that they would really need. Out of every subject presented to them, they try and retain only the one aspect which, in their mind, will constitute the “meat” of a test problem, worth the bulk of the grade, and simply ignore the rest. The problem is that what is ignored is often just as important [2].

Case in point: Units and dimensions
“I got the right value, why are you taking five points off ?”
“Sir, please note that the problem asks you for a length. In your answer, you are expressing it in kilograms…”
“Yeah but… five points ?”
The value of any physical quantity can only be meaningfully expressed in appropriate units, reflecting underlying physical dimensions. If I were to pick the single thing that a freshman student should retain from a physics course, I would choose this. If students do not understand that a length cannot be expressed in kilograms, that force has dimensions of mass multiplied by length divided by time squared, that quantities with different dimensions cannot be added together, and that the first check of the correctness of the answer is dimensional consistency, I do not care how well they have memorized the rest, and how proficiently they can apply formulae and do algebra: they have not learned anything.
However, in a context where students are tested by means of problems, physical units and dimensions are oftentimes simply ignored, the rationale being “Come on, how important can that crap be ? One point out of ten at the most…” [3].
This is just an example of what, in my opinion, is the worst drawback of problem-based tests. A problem aims at probing too much at once. Because partial credit must be assigned, major holes in one’s background can go undiagnosed.
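
For readers who like to see such things spelled out, here is a minimal sketch, in Python, of the dimensional bookkeeping described above: force comes out as mass times length divided by time squared, and adding a length to a mass is refused outright. The `Quantity` class, its field names and the sample numbers are all hypothetical, made up purely for illustration; this is not a full units library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with exponents of the base dimensions mass, length, time.

    Hypothetical helper for illustration only.
    """
    value: float
    mass: int = 0
    length: int = 0
    time: int = 0

    def dims(self):
        # Exponents of (mass, length, time), e.g. (1, 1, -2) for a force.
        return (self.mass, self.length, self.time)

    def __add__(self, other):
        # Quantities with different dimensions cannot be added together.
        if self.dims() != other.dims():
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, *self.dims())

    def __mul__(self, other):
        # Multiplying quantities adds the dimension exponents.
        return Quantity(self.value * other.value,
                        self.mass + other.mass,
                        self.length + other.length,
                        self.time + other.time)

    def __truediv__(self, other):
        # Dividing quantities subtracts the dimension exponents.
        return Quantity(self.value / other.value,
                        self.mass - other.mass,
                        self.length - other.length,
                        self.time - other.time)


FORCE = (1, 1, -2)    # mass * length / time^2
LENGTH = (0, 1, 0)

m = Quantity(2.0, mass=1)               # 2 kg
a = Quantity(3.0, length=1, time=-2)    # 3 m/s^2
f = m * a                               # force, by Newton's second law

# First check of any answer: do the dimensions match what was asked for?
assert f.dims() == FORCE
assert m.dims() != LENGTH               # an answer in kilograms is not a length
```

The point of the sketch is only that the check is mechanical: it can be applied to a final answer without looking at any of the algebra that led to it.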

As I shall try and convey in the second part of this long and boring post, MCTs allow one to design questions targeting specific aspects. Knowing that they could be tested on those aspects is a powerful incentive for students not to ignore them.

Notes

[0] Disclaimer: what is written here is only drawn from my own personal experience. I am sure that there exists a body of scholarly research on this subject, but this is just my blog, not a scholarly research journal.

[1] This subject was also discussed by Female Science Professor, both here and here.

[2] There are some (very few, in fairness) students who take the notion of partial credit to an extreme. The idea is simple: at test time, simply write stuff, attempt to give a deceptive impression of knowledge and reasoning by filling a page or two with essentially random text and formulae, with little or no regard to their relevance to the problem at hand (this includes transcribing verbatim lines of text memorized from the textbook or, in the case of an open-book exam, copied on the spot), then complain bitterly if that does not earn at least 50% of the credit for that problem.
The solution to a physics test problem written by someone who has spent the time needed to study the course material, chewed on it, practiced his/her knowledge by working out exercises, is typically succinct, clear, precise, only includes relevant information, and conveys to the grader a sense of understanding and mastery of the subject.

[3] Yes, I know, I am the professor and can take off as many points as I like. That is not the issue. By the time it comes to that, it is too late. A good testing scheme should encourage students to study the right way.


19 Responses to “On multiple choice tests (part I)”

  1. chall Says:

    I think the most important thing with multiple choice is what you want to test and how you write the questions… go figure.

    Furthermore, I am a huge fan of that “length in kilograms indicates you haven’t gotten it” and would happily deduct 5 points 😉 That’s what my physics and math teacher did in high school…. and chem professor was expressing that if we didn’t know if it was mol or Molar or ul then we didn’t really know anything, did we?!

    The only thing I still don’t know if I like is the people who subtract points (negative scoring) when you get something wrong on the MCT. Not sure if I favour a system where you can’t take a chance; at the same time, I guess you _should_ know?!?!

    • Massimo Says:

      The only thing I still don’t know if I like is the people who subtract points (negative scoring) when you get something wrong on the MCT

      I think it is rare, because one of the advantages of MCTs is that you can use scantrons and have papers graded by an optical reader — if one really wants to take points off for “bad” answers, one would have to reprogram the reader. I don’t see it done commonly.
      I am totally against it — wrong is wrong, no need to take points off.

  2. mareserinitatis Says:

    “Sir, please note that the problem asked you for a length. In your answer, you are expressing it in kilograms…”

    I was taking a class in computational physics that required the students to work in groups. We were supposed to do a simple program to calculate the position, velocity and acceleration of a projectile. I had to meet with my group late, but when I got there, I realized right away that all the equations they had derived were wrong simply because they had things like length equaling acceleration. The worst part was when I tried to correct them, and the only thing they did was get mad at me because “they had put a lot of work into deriving them”. 😛

    There is a professor in the EE at SU who does ‘binary grading’: you put your solution to a problem in the box, and the only thing he looks at is the box. He would probably be better off if he used scantrons, but I think he doesn’t want to give anyone an idea of what may or may not be a correct answer.

  3. Calvin Says:

    Hmm. As the semester is starting soon, I was just thinking of how bad my quantum mechanics (410) students were last year in simple dimensional analysis on integrals, and feeling I had to push them on the topic. MCT might be a good tool here….

    I’ve certainly had the student who filled up entire exam pages with lines and then were stunned when I gave them zero points. “But I studied all night for this!” they would exclaim. “There’s not a single thing right here,” I’d reply; “you didn’t even write down correctly that the sum of the torques must be zero.” (You can see how even 27 years later I remember this conversation exactly.)

    • Massimo Says:

      Yes, Calvin, I have been toying with the notion of using MCTs for QM as well, something which I would have regarded as heretic until a few years ago. The fact of the matter, however, is that there are very few problems in QM, even at the most elementary level, that can be solved exactly, and those are all worked out in class — plus they are too complicated to include on a test. On the other hand, important concepts like zero-point energy of confinement, using dimensional analysis to express penetration inside a barrier, dependency on statistics of cross-section, can easily be tested with MCTs.

  4. El Charro Says:

    A significant fraction of students, these days, see partial credit as a viable strategy to pass, with the least amount of work, courses on which they do not plan on spending the time that they would really need.

    Do you have data that shows there is a significant fraction of students who actually do that? Or are you inferring it from your own experience?

    • Massimo Says:

      That the whole post is based on my experience only, and that I am aware that there is actual research, is stated in note [0]. As for this specific point, no, of course I do not have data, I am not even sure how I would go about collecting it, given that I am expressing an opinion on students’ attitude. Anyway, I suppose I should have phrased it “… act as if they saw partial credit…”, but I kind of thought it was obvious.

  5. El Charro Says:

    Also,

    I prefer the problem-based tests (PBT) for my physics classes, for the classes that do not belong to physics a MCT is fine as far as I’m concerned.

    The reason why I like PBT is that, for some reason, the problems that I cannot solve correctly during the allotted time are the ones from which I end up learning the most. I think it is because the professor eventually publishes a detailed answer and I have to make sense of it and compare it to my work. For MCT, I usually just got an answer key, without really much explanation or detail on how to solve the problem. Plus, I always seem to lose my MCT work by the time the prof hands back the test and gives us the solution and can’t really compare what I wrote at the time I took the test with what should’ve been the answer.

    I’m not claiming these are good reasons to have PBTs over MCTs, just stating my preferences and why.

    • Massimo Says:

      But tests are not a learning opportunity, they are a measurement of one’s proficiency with a particular subject. They are a necessary evil, one that has to be accurate and fair, they are not a way to teach students something that one had an entire term for.
      Want it or not, grades have a tremendous impact on a student’s future, from determining their eligibility for more advanced courses, to scholarships, to admission into medical or graduate school. Exposing students to subtle or advanced concepts is fine during class, not at testing time.

      • mareserinitatis Says:

        So tests are only a means to get students to regurgitate that which they have already learned?

        I don’t know. I can recall a couple classes where I had take-home exams which stepped me through the process of solving a familiar problem with a new method. I think the idea was to get us to look at a problem in a new way, and we could still determine if we’d come out with the right answer because we’d covered the same problem before with a different method (unless, of course, you weren’t in class that day).

        I also think there are ways of giving partial credit fairly. If a problem requires several steps, then assigning a particular number of credits for each step seems reasonable. I do see your point that perhaps one should not be trying to test that much…but then I often wonder how realistic tests are, anyway. Real-life problems are often fairly complex, so being able to go through the process is as important as understanding the conceptual underpinnings. I think using MCT may hit the concepts and even some basic calculations, but I don’t think that it’s going to give you a good idea of who can really handle complicated (i.e. more realistic) problems.

      • Massimo Says:

        So tests are only a means to get students to regurgitate that which they have already learned?

        Well, you may use the word “regurgitate” to make it sound bad but… yes, I am sorry, that is exactly what a test is for — otherwise it is a lecture, not a test.
        If I had absolute certainty that every student has learned everything, why would I bother testing them ? What you are describing to me may be an effective teaching device, but here I am talking about the testing that results in a grade, with which a student is stuck for the rest of his/her life.

        I don’t think that it’s going to give you a good idea of who can really handle complicated (i.e. more realistic) problems.

        I am writing more about that in the second part of my post but I really don’t think that that is what testing is for, especially in introductory courses.

      • El Charro Says:

        But tests are not a learning opportunity, they are a measurement of one’s proficiency with a particular subject.

        I can’t say I agree 100% with you on that. From the professor’s point of view, sure, their purpose is to probe how much the student has learned. That needs to be assessed and a grade must be assigned. That doesn’t mean it is the only use anyone can have for a test.

        For the student (at least that’s been the case for me and for many of my friends), tests are one of the great learning opportunities. You might think you know something, but you can’t be sure until you actually try to put the knowledge to the test (no pun intended).

        You could argue that HWs are precisely assigned for that reason but I don’t feel that way. Homeworks are solved in a completely different context. First, you can use whatever non-human resources you have available (books, lectures, notes, videos, etc.). Plus, you also have the human resources (professor, TAs, classmates, friends, etc.), and I don’t mean cheating, I mean discussing the problem with them which is generally allowed in (almost) any class.

        You could try to solve the HW the same way you take a test, but since HWs also count for points (in some cases maybe even too many points) as a student you try to get the most learning out of them while also making sure you get the most points from them too. Plus, usually there are ways to check if your answer is right or not before you turn the HW in. During a test you really have no choice, it’s you vs. the test and that is it.

        That’s also why I don’t like the way final exams are done (at least where I’ve studied). You take the test and you don’t ever come back to discuss the answers with anyone. It’s not that you don’t want to, it’s the regulations that say you can’t take the final exam with you, that it must be kept for the record; it’s that the course is over and the professor cannot spend too much time with you; it’s that they don’t want to publish the answers because that same final can be used next year; etc. etc. etc.

      • Massimo Says:

        That doesn’t mean it is the only use anyone can have for a test.

        Never said that. It is the only use that I have for it.

        you can use whatever non-human resources you have available (books, lectures, notes, videos, etc.). Plus, you also have the human resources (professor, TAs, classmates, friends, etc.), and I don’t mean cheating, I mean discussing the problem

        … which is exactly the context into which one will be working for the rest of one’s life. Never will you be forcing yourself to tackle a problem by not making use of available resources, not discussing with others, not reading the literature etc. Your argument seems to be “A test forces me to do what I do not feel like doing on a homework assignment”… not sure I find this convincing.

  6. El Charro Says:

    Never said that. It is the only use that I have for it.

    That’s why I said at the beginning that PBT were my preference, and also clearly pointed out those weren’t necessarily the definite reasons to have PBTs over MCTs.

    which is exactly the context into which one will be working for the rest of one’s life. Never will you be forcing yourself to tackle a problem by not making use of available resources, not discussing with others, not reading the literature etc.

    I never said otherwise. My point was just that tests can be, if properly given, one more learning opportunity. One that, for me, proves to be an important one and of a different type than the one I have when solving HWs.

    Your argument seems to be “A test forces me to do what I do not feel like doing on a homework assignment”… not sure I find this convincing.

    Your argument seems to be “MCT make my life easier when grading plus help compensate for my inability to be fair across the board”… not sure I find either one of those convincing.

    If partial credit is that bad to fairly give out, you could just grow a pair and make it a rule that you do not give partial credit. I pointed out on my first comment that I was simply stating my preferences; you gave yours on your post. No need to question mine when yours can (and are) equally not convincing. At least, mine keep fulfilling a purpose (in the form of further learning opportunities) to some of the students.

  7. Toni Says:

    I see a tendency (in a different field than yours) that TAs are too generous in giving partial credit, basically giving more than half credit for essentially wrong answers. I just don’t see how this is an argument against rigorous grading.
