On multiple choice tests (part II)

Having described in my previous post the most important deficiencies of problem-based tests (as I see them), I am now going to list what I perceive as the most important merits of Multiple Choice Tests (MCTs), and illustrate why I regard them as a better choice, especially for introductory, foundation type courses. Here too, in order to keep the discussion concrete I shall focus on physics tests.

Imagine the following scene:
Venue: Universita’ degli Studi di Genova, Italy, 1982. Packed in a small room, in the sweltering July heat, a bunch of first-year physics students, most of them teenagers, having written a physics exam whose outcome was nothing short of a disaster (about fifty sat for it and maybe five were admitted to the oral portion — quite a normal occurrence in those days), listen mortified, as the professor sternly berates them, his teaching assistant nodding along, bowing his head toward him, almost in adoration. I am one of those students, looking at my own paper — 25% is my score.
“The thing that bothers me the most”, pontificates the professor, “is your helplessness. As soon as I assign a problem that is more difficult than those given as homework, you all collapse !”.
From the back of the room, comes a question. A student asks: “Excuse me, Professor, what is the point of assigning on a test a problem that is more difficult than those given as homework ?”.
I cringe. How can anyone ask a question at such a dramatic juncture, never mind one so insolent in its naivete… “Do you, Sir, believe that everyone is entitled to a physics degree ?”, is the reply. “Has no one ever explained to you that a physicist should display extraordinary abilities, and that if you lack them, maybe this is not your calling ? That maybe the humanities are better suited for you ? Think about it, Sir ! Think about this fact, all of you !”.

Oh, well… that student asked for that answer… right ?
Um, no. His question was bang on. The Professor was arbitrarily, unreasonably failing so many students largely because he wanted to be at the beach, rather than at school administering oral exams. I am appalled at the thought of how many potentially good physicists were back then turned away by lazy, pompous, arrogant fools like that professor.

What are exams and tests for, anyway ?
It may be in order to remind ourselves what the aim of testing students is. As I wrote in my response to one of the comments to my previous post, I regard testing as a necessary evil. Its purpose is that of providing society with an official assessment of the proficiency and mastery of a given subject on the part of a student. Simple as that. That is what society (e.g., high schools, colleges, government, private and public sector employers, fellow researchers) expects of someone like me.
I am charged with expressing such an assessment in the form of a numeric/letter grade, based upon which important decisions affecting the future of each student will be made. I thus see a test as a measurement, one that should be as unbiased, objective and reproducible as possible.
I am not asked to evaluate someone’s IQ, deductive reasoning, creativity, communicative skills, quickness, intuition, nor any other “extraordinary ability” — I am not qualified for that anyway (not sure who would be) [0].
I try and design my tests and exams so that any student who
1) has memorized basic definitions (e.g., physical units) [1]
2) has assimilated fundamental concepts, by reinforcing them through methodical study and practice
3) can apply them to solve simple problems
will do well. I do not see what else an instructor should measure by testing, especially in courses like freshman physics, which, to most students, are only propaedeutic — very few students will actually major in physics.
Those students who can go beyond that, who possess superior analytical skills, I am talking deep thinkers with uncommon creativity or talent for science, will have plenty of opportunities to showcase their extraordinary abilities later on, through research rather than course work.

Why MCTs ? Here’s why:
This is what a well-designed MCT will enable me, the instructor, to do:
1) By means of targeted questions, focusing on just one specific concept, directly assess a student’s knowledge of the basics. Students will not ignore units, or dimensions, if a question or two about them can appear on a test.
2) By carefully drafting the different possible answers, give a chance to students who may not remember the full procedure, or who under pressure may be prone to making algebraic mistakes, to utilize order-of-magnitude considerations, as well as simple physical understanding, to arrive at the correct answer by exclusion [2].
3) Promote a concrete, goal-oriented, individual approach to problem-solving in science, emphasizing the ability to arrive at the correct answer (possibly in an unconventional way), rather than strict adherence to a given procedure or formalism.
4) Evaluate students in a relatively straightforward and objective way, without having my judgment affected by my (dis)like of their writing skills, methodology, reasoning, and without having to make difficult judgment calls when it comes to taking points off due to “imperfections” in the solution.

What are the main objections to MCTs ?
Perhaps commenters will point out the most serious drawbacks of MCTs. These are some of the most commonly heard:
A student will receive no partial credit. This is actually a plus, in my opinion. There is no objective, consistent way of assigning partial credit without making a judgment call. Ensuring fairness to all students (including those who enrol in a different section of the same course, with another instructor) is impossible. Plus, giving partial credit provides the wrong incentives to students, as I tried to explain in my previous post. Rather than on the lack of partial credit, students should focus on the fact that all they have to do is pick the right box.
True, a student will receive no credit if, by mistake, (s)he checks the wrong box after working out the correct answer. However, I think that that is sufficiently infrequent an occurrence not to regard this as a major problem, and it is also compensated by more frequent lucky guesses.
Students can pick the right answer by luck. This is a weak one. Sure, one answer can be picked by luck, maybe even two or three, but the likelihood of selecting by luck enough right answers to earn a passing grade is pretty small.
Sometimes two answers will only differ by a small amount, and one will miss a point for calculator round-off error. MCTs can be well or poorly designed, like any other test. Including two answers that “only differ by a small amount” is bad testing procedure.
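To put a rough number on the “luck” objection above, here is a minimal sketch. It assumes a hypothetical test of 20 questions with five choices each and a pass mark of 50%; none of these figures come from the post, and the function name is made up for illustration. It simply sums the binomial probabilities of getting at least the required number of answers right by blind guessing.

```python
from math import comb


def prob_at_least(n_questions: int, n_choices: int, n_needed: int) -> float:
    """Probability of at least n_needed correct answers by pure guessing.

    Assumes every question has n_choices equally likely options and
    that the guesses are independent (binomial model).
    """
    p = 1.0 / n_choices
    return sum(
        comb(n_questions, k) * p**k * (1 - p) ** (n_questions - k)
        for k in range(n_needed, n_questions + 1)
    )


if __name__ == "__main__":
    # Hypothetical test: 20 questions, 5 choices each, 10 needed to pass.
    print(f"Chance of passing by luck: {prob_at_least(20, 5, 10):.4f}")
```

Under these assumed numbers the chance of passing by luck alone comes out well below 1%. A student who can rule out some answers on each question is of course no longer guessing blindly, and the model does not cover that case.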

[0] I might be asked to comment upon some of that on a letter of recommendation, where I may, at best, offer an opinion, which one may regard as more or less reliable. However, a test score should be considered akin to the outcome of a measurement, which in principle others should be able to reproduce (within the obvious limitations coming from the fact that measurements on humans are far more complex than those on the objects with which physics deals).

[1] I do not care how smart you are. Just like you cannot speak a foreign language without memorizing the verbs, you cannot do physics if you do not memorize some basic notions like units, or definitions like work, kinetic or potential energy, electric field, flux, circulation etc., as well as experimental laws like Coulomb’s. Anyone who says the contrary is either a con man or does not know what (s)he’s talking about. Not that I have strong feelings about this — not at all.

[2] This is one of my favorite questions when I teach introductory electricity and magnetism:
Two point charges of equal sign, worth in magnitude Q and 9Q, are located at a distance L from one another. Assuming that they are held at fixed positions, where should a third point charge q be placed, between the two charges, along the line that connects them, in order for it to be in equilibrium ?
1. It depends on the sign of q
2. Right in the middle, at a distance L/2 from either
3. There is no equilibrium position anywhere between the two charges
4. At a distance L/4 from the charge Q
5. At a distance L/3 from the charge 9Q
A student who has learned the basics can pick the right answer without doing a single calculation.


24 Responses to “On multiple choice tests (part II)”

  1. Devin Baillie Says:

    In high school they gave us answers which differed by a very small amount to test us on proper rounding/sig figs.

    By university I would think that everyone should know not to round until the final answer, and anyone tripping up on such questions deserves to lose the mark anyways.

    My biggest problem with MCTs is that I often forget my calculator. At least with problem based exams, I can do everything symbolically in such a case, and leave the answer in terms of the given quantities. I have always gotten full credit for such answers, where in MCTs, I wouldn’t get anything.

  2. mareserinitatis Says:

    If those are the types of questions you put on your multiple choice, I guess I’m okay with it. Even I could pass that. 😉

    • mareserinitatis Says:

      On the other hand, in thinking about it, I would probably get rather confused. I’m assuming you meant for the answer to be 5.

      I would first make the assumption that the sign would have to be negative or there is no way to put the system into equilibrium…all charges would repel. However, the answer 5 only puts the system into equilibrium when q=-Q. If you figure it out mathematically, the answer is actually (L/3)*(q/Q), which means that without specifying the magnitude of charge q, it would be a poorly worded question as there is a continuum of answers dependent on the magnitude of the third charge. Answer 2 would work if q = (9/2)Q. Likewise, 4 could also be a solution with the correct magnitude of q.

      Or do I deserve a fail in freshman physics?

      • mareserinitatis Says:

        Sorry…that should be sqrt(q/Q)…

      • Massimo Says:

        mmmmm…. OK, here is how I see it: depending on its sign, the charge is either going to be repelled by, or attracted to both other charges. However, since it is placed between the two, the two forces will be competing with one another, and for a specific location they will exactly cancel — hence the charge will be in equilibrium. If you did not place the charge q between the other two, it would be either attracted to or repelled by both, and in neither case could the resulting force be zero.
        Now, it has to be closer to the charge which is smaller in magnitude, for only then can the attraction/repulsion it experiences from said charge compensate that which it experiences from the other charge. Thus, the only possible answer is 4.
        In order to check, let’s work out the algebra:

        Qq/x^2 = 9Qq/(L-x)^2

        where x is the distance from the charge Q. Taking the positive square root of both sides (since 0 < x < L) yields L-x=3x, i.e., x=L/4.
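For readers who would rather check the result numerically, here is a minimal sketch. It assumes units in which L = 1 and drops the common factor k·q, so only the balance between the two terms matters; the function name and defaults are hypothetical.

```python
def net_force_factor(x: float, L: float = 1.0,
                     q_near: float = 1.0, q_far: float = 9.0) -> float:
    """Net force on the test charge, divided by the common factor k*q.

    x is the distance from the smaller charge (Q); the larger charge (9Q)
    sits at distance L - x. The two forces point in opposite directions
    when the test charge lies between the fixed charges.
    """
    return q_near / x**2 - q_far / (L - x) ** 2


if __name__ == "__main__":
    # Candidate positions, measured from the charge Q.
    for label, x in [("L/4 from Q", 0.25),
                     ("L/2", 0.5),
                     ("L/3 from 9Q", 1 - 1 / 3)]:
        print(f"{label:12s} net force factor = {net_force_factor(x):+.3f}")
```

Only the position L/4 from Q gives a vanishing net force. The sign of q does not enter, because it multiplies both terms equally.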

      • mareserinitatis Says:

        Maybe I didn’t fail freshman physics, but perhaps I should have failed algebra… 😛

      • Massimo Says:

        I think I can give you some partial credit… 😀

      • mareserinitatis Says:

        Could you just give me memory back, instead? That would make me much happier.

  3. Schlupp Says:

    Strong feelings, uh? Then what about the following gem I was informed of recently: A professor decides to replace the written exams commonly done at his university by oral ones, because
    1) He can in this way find out how intelligent a student is.
    2) Intelligence is what a physics exam should be about.
    3) The intelligence exam needs to be oral, because he can judge a student’s intelligence from certain aspects of how said student moves parts of his face while answering.

  4. prodigal academic Says:

    Thanks for doing this series. My course enrollment doubled this year, and I am not getting additional TA help, so that leaves me grading 240 exams on my own. Last year I did 50% MC/50% problems, and I was feeling guilty about moving from that to 85% MC/15% (1 graded) problem.

    I’ve been astounded that I can use practice exam questions, problems worked out in class, and assigned HW for my exam questions and still get a distribution of grades centered around a 70-75%. Before I became a prof, I would have said that I need to use new questions or the class would do too well! Any additional advice for good MCT design?

  5. GMP Says:

    Nice posts, Massimo. I hate partial credit, I must say — if you know your stuff, you should be able to do it from beginning to end. If not, tough toenails. Honestly, getting used to partial credit makes students lazy: I have had some very good students as research interns this summer, and you can’t get them to finish the project. Stu: “Well, I know how to do it in principle, do I actually have to finish the calculation? ”
    Me: “Yes, you do. I want to see those calculations and figures.” (And when they cannot actually produce them, it proves they don’t really understand the material as well as they think they do). I am a big believer in all or nothing when it comes to grading; but, I don’t implement it as much as I would like to in my classes as all hell would break loose without the beloved partial credit…

    • Massimo Says:

      Sure, same here. Countless times.
      Student: It seems like I got most of it right, why only 2 points out of ten ?
      Me: “Most of it” ? All you have written here is F=ma… nothing else…
      Student: Well, yeah, but that is most of it, the rest is just algebra !

      • GMP Says:

        Student: … the rest is just algebra!

        AHAHAHAHAHAHA! If I had a penny for every time I heard this…

      • Schlupp Says:

        Oh, I see a great future in theoretical physics for this student, already has the right attitude. It’s only small minded people like you who can keep such students away from the fame they so richly deserve.

  6. transientreporter Says:

    Nice pair of posts, Maestro…

    I’m pretty much in agreement – MCT’s are more versatile than one might initially think, and can be used to ask many different kinds of questions. However, it does raise the issue of problem-bias – certain subjects are more easily amenable to MCT’s than others, and hence get on exams more. So, you get a skewed view of student comprehension.

    Also, some mundane questions for you: do you recycle questions for subsequent semesters? Do you tweak values, initial conditions etc? As you know, students (e.g. frat houses) have been known to stockpile exam questions. Or do you not return the exams to students – just the grades (a dumb practice, in my opinion, because they can’t see what they got wrong)?

    A part of me says that recycling is not an issue – if a student has managed to rote-memorize the answers to 100 questions, they should pass the exam for effort alone (and it’s unlikely they would be able to do this without some conceptual knowledge getting through). Another part of me says that at the very least, I should shuffle the answers around. In that regard, is there a decent software program out there that you can use to create your own list of questions, and that can spit out tests with the possible answers randomly shuffled?
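No particular program is named in this thread, but the shuffling itself is simple enough to script. The following is a minimal sketch, assuming a question bank kept as a plain Python list; the `Question` class, `shuffled` and `make_version` are hypothetical names invented for this illustration, not part of any existing tool.

```python
import random
from dataclasses import dataclass


@dataclass
class Question:
    prompt: str
    choices: list[str]
    correct: int  # index of the right answer within choices


def shuffled(question: Question, rng: random.Random) -> Question:
    """Return a copy of the question with its choices in random order."""
    order = list(range(len(question.choices)))
    rng.shuffle(order)
    new_choices = [question.choices[i] for i in order]
    # The right answer moves to wherever its old index landed.
    return Question(question.prompt, new_choices, order.index(question.correct))


def make_version(bank: list[Question], seed: int):
    """Build one test version: shuffled choices, shuffled question order.

    Returns the questions and a 1-based answer key. The same seed always
    reproduces the same version, so the key can be regenerated later.
    """
    rng = random.Random(seed)
    questions = [shuffled(q, rng) for q in bank]
    rng.shuffle(questions)
    key = [q.correct + 1 for q in questions]
    return questions, key


if __name__ == "__main__":
    bank = [
        Question("SI unit of force?", ["joule", "newton", "watt", "pascal"], 1),
        Question("SI unit of energy?", ["joule", "newton", "watt", "pascal"], 0),
    ]
    for number, q in enumerate(make_version(bank, seed=1)[0], start=1):
        print(f"{number}. {q.prompt}")
        for position, choice in enumerate(q.choices, start=1):
            print(f"   {position}. {choice}")
```

Seeding the generator per version is a design choice: it keeps each version reproducible without having to store the shuffled tests themselves. Choices such as “none of the above” would need to be pinned in place, which this sketch does not handle.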

    • Massimo Says:

      Hello Sir, good to see you back.

      certain subjects are more easily amenable to MCT’s than others, and hence get on exams more

      But I find that the same is true for PBTs as well. Case in point, again, units. I cannot assign a problem just on units, they can at best be part of a question and as such often neglected.

      Do you recycle questions for subsequent semesters? Do you tweak values, initial conditions etc?

      No, thus far I have not. I find it reasonably easy to make brand new questions each time, especially for introductory physics. I do not return exams to students (have you ever heard of the “erase-check right box-go to instructor-claim scantron error” approach ?), but I do post an answer key and go through the problems in class.

      • transientreporter Says:

        But I find that the same is true for PBTs as well
        Hmm yes… doesn’t this argue for an exam that’s a mix of MCT and PBT, depending on the material covered? You’re the prof, you can make up the exam any way you want. You’re not the SAT or GRE administering board… My exams tend to be about 80% MCT’s and 20% PB/short answers.

        I return scantrons. Occasionally I have had students who’ve missed a row, and staggered all subsequent answers by one question.

  7. Devin Baillie Says:

    I think as long as you plan in advance what material you want to cover on an exam, it shouldn’t be too much trouble to make sure you don’t make too many questions on the easy-to-make-questions-for material. Just make a list in advance of the material you want to cover, and what you think its relative weight should be, and then make sure to divide up the exam based on that.

  8. JT Says:

    On the fundamental principles, we are in agreement: multiple choice testing can be as good as or better than other testing approaches. I am not sure how anyone can disagree. If given sufficient freedom wrt sufficient details, “x can be as good as y”. In my six full years of (grad+undergrad) university classes and twelve full years of elementary+secondary education, I have even seen “good MC” (i.e. a difficult test for the unprepared). Twice.

    My personal experience with its implementation is that MC testing is a joke. In high school, if I knew that the test would be multiple choice, I refused to even study unless I cared about learning the subject for my own interests – it was enough to attend the classes (mandatory anyway) and rub a few brain cells together in order to get a decent mark (80%+).

    How? Out of the five choices, one was almost always disposable by anyone with a pulse, and another two could be eliminated without an iota of understanding of the given subject, but an ability to THINK. That left a 50/50 shot. Superficial understanding of the subject matter would help you eliminate 2/3 of the remaining incorrect answers. Guess on the rest, and you were likely to reach into the low 80’s, sometimes 90’s, sometimes 70’s.
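The arithmetic behind that estimate can be spelled out. This is a minimal sketch under one reading of the paragraph above, which is an assumption: on two thirds of the questions the last wrong answer can also be eliminated, so the answer is known, and the remaining third are coin flips between two choices. The function name is hypothetical.

```python
from fractions import Fraction


def expected_score(known_fraction: Fraction, guess_odds: Fraction) -> Fraction:
    """Expected fraction of correct answers.

    known_fraction: share of questions answered with certainty.
    guess_odds: chance of a correct guess on each remaining question.
    """
    return known_fraction + (1 - known_fraction) * guess_odds


if __name__ == "__main__":
    score = expected_score(Fraction(2, 3), Fraction(1, 2))
    print(score, f"= {float(score):.1%}")  # prints "5/6 = 83.3%"
```

Under that reading the expected mark is 5/6, about 83%, in line with the “low 80’s” quoted above. The spread into the 70’s or 90’s is just the luck of the coin flips on the guessed third.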

    Think it made me happy? Hell no, it pissed me off. If I poured my heart and soul into a class and got a 92% or 95% or such, while the others jerked around and many got a mark in the 80-90 range, the premium for effort was simply not large enough to be worth it (until the marks counted toward university acceptance/scholarships, in final year). At the time, I interpreted MC as a mechanism allowing teachers to employ false objectivity in the passing of shit uphill. My view is far more tempered now – there are a variety of good reasons to employ MC (necessity due to class size and concurrent sessions foremost among them).

    In freshman year, I applied the same standard to the courses I cared little about: attend class, study nada or close to it. Result? 75% on the first unit test in one class (never opened my textbook, never read my notes after class), 88% in another midterm (I looked at the figures and captions in the relevant chapters). Even after dropping class attendance entirely, I was able to get a shameful but shockingly high 60% on a plant biology MC test that in places felt like it was written in a language only distantly related to English. Even a simple short answer test would have been a single-digit slaughter (OK, obviously one could ask students to label the colour of pictured plants, or such, and obtain a metric of equal inutility).

    My two good MC experiences were in anthropology, in which I was (fortunately for my grades) interested enough to study. Difficult tests, very skillfully written questions (you really needed to read and understand the assigned books), no joke answers that improve the guessing odds. Work hard to get 80’s, bulk of class in 60’s.

    Maybe it is my old-fashioned belief that tests should discriminate very highly based on understanding, coupled with my experience of being subject to poorly-designed MC, but I prefer a testing format where the jokers get close to zero and you really can tell the difference between the 80% students and the 90% students – even in freshman year (before the students sign onto a program where 2-3 more years will be wasted before the aptitude differential culls them and they replace another 2+ years of future income with another 2+ years of student loan debt).

    • Massimo Says:

      If I poured my heart and soul into a class and got a 92% or 95% or such, while the others jerked around and many got a mark in the 80-90 range, the premium for effort was simply not large enough to be worth it

      But you are talking as if the purpose of the test were that of clearly separating students with superior abilities, or greater motivation, from the rest of the pack. I do not see it that way. I think the former will have plenty of opportunities later on to show that they belong in a different league, starting with junior courses. Introductory physics is not the venue for that, in my opinion.
      And again, let me assure you that problem-based tests can be just as laughable a joke, when you have teachers who give partial credit for essentially re-writing the question or filling a page with nonsense. That is the main reason MCTs are so hated by many students (want to take a look at my evaluations when I use them ?), because they take away the possibility of bs-ing their way to a pass.

  9. Toni Says:

    The main objections to MCT are not the ones you mention. My main objections are:
    – MCT cannot test complex tasks, like deriving a mathematical proof
    – MCT confronts the student with a series of small, independent tasks. It doesn’t allow the student to engage with a complex problem. It doesn’t promote in my view the skills required for actual problem solving. MCT is good at testing knowledge. Mastering physics surely isn’t just about learning definitions.
