When I started teaching, in 1996, I never imagined that I would someday make use of multiple-choice tests (MCTs). To me, multiple choice was a tool for conducting surveys, perhaps suitable for testing one’s proficiency with road signs and traffic rules, or other subjects requiring simple memorization — surely not a person’s mastery of a subject like physics.
Why? Well, come on, isn’t it obvious?
How would I ever be able to assess knowledge of basic concepts and their comprehension, logic, reasoning, computational methodology, if all the student has to do is check the right box, thereby picking one out of a few pre-packaged answers? And how would I know whether that choice was arrived at by competence or sheer luck? How could I assign zero credit for accidentally checking the wrong box despite working out the problem correctly?
Clearly, the correct way to test a physics student would be to have them write solutions to problems. That is what I thought back then.
As an assistant professor here, I started teaching large freshman physics sections, with enrollment in the hundreds and no help grading papers. I had no choice but to adopt MCTs, which I did begrudgingly. The past umpteen years of teaching with MCTs have radically changed my opinion on this subject. Not only do I think that equally effective multiple-choice tests can replace problem-based exams, probably at all levels of instruction; I happen to believe that for specific courses, such as freshman physics, MCTs are far superior for a number of reasons (ease of grading being only one of them). In fact, plausible and sensible as the above remarks on MCTs may seem at first, they are all flawed, and in any case problem-based exams do not address those shortcomings satisfactorily, and arguably suffer from even more serious ones.
I am dividing this post into two parts. In the first part I shall discuss the supposed merits of problem-based tests, and their most important limitations. I am using physics as the specific subject in order to keep the discussion concrete, and because it is the one that I happen to teach. However, I do believe that many of the remarks apply quite generally, even outside science.
Problems and solutions
A physics problem-based test consists of students writing down solutions to assigned problems, each formulated as a question, within an allotted time frame. What is tested is one’s ability to place the question in the right context, make use of the appropriate quantities and relevant equations, work out the necessary algebra, and arrive at an answer, generally the numerical value of a physical quantity (expressed in the right units).
Well, so far so good, right? How could anyone object to that? The problem is that, while the principle is surely commendable, its actual implementation proves quite tricky if one aims at devising a testing scheme that is effective, fair, and most importantly conducive to learning.
The difficulty with evaluating someone’s work is precisely that one is looking at a relatively complex solution, including wording as well as algebra in proportions that vary from student to student. No two solutions are identical, which in turn means that there is a significant degree of subjectivity in the evaluation, even in a seemingly cut-and-dried field such as physics.
For one thing, even though the test does not aim at probing writing skills, they have a serious impact on it. All else being equal, students with inferior writing ability are put at a disadvantage. Secondly, instructors looking at a student’s approach to a problem will inevitably compare it to what constitutes, in their mind, the correct way, and often grade accordingly. A student who obtained the right answer in the end, but whose procedure is regarded as not fully convincing, or in the eye of the instructor as lacking important details (read: not identical to the one the instructor illustrated in class), will typically receive only partial credit (e.g., 8 out of 10 possible points for a single question).
In other words, the same test can be given different grades by different instructors, something many of us regard as undesirable and potentially unfair to students. But that is not the main drawback of problem-based testing, serious as it undoubtedly is.
The “partial credit” quandary
What is expected of a student is more elaborate than a mere numerical answer, and points may be taken off if the solution is deemed incomplete or partly inaccurate. By the same token, it is also broadly accepted that some partial credit ought to be assigned in those cases where the student did not arrive at the final answer but worked out a few intermediate steps. The rationale is that this shows at least some minimal knowledge or understanding of the subject. In other words, anything is deemed preferable to the student returning a blank sheet. Again, this seems hardly debatable.
Unfortunately, I am becoming convinced that subjectivity in grading and partial credit combine to furnish a perverse incentive to students, in turn promoting a shallow, content-free, and ultimately flawed approach to science and learning on their part.
Anything will do…
What am I increasingly observing, specifically, that makes me think that way?
We all have been students. We know how hard it is, especially the freshman year. Students are under tremendous pressure and punitive workloads. They register for five classes a term, have to complete weekly homework assignments for all of them, and prepare for five midterms and final exams at the same time. Trying to cut corners is human.
A significant fraction of students these days see partial credit as a viable strategy to pass, with the least amount of work, courses on which they do not plan to spend the time they would really need. Out of every subject presented to them, they try to retain only the one aspect which, in their mind, will constitute the “meat” of a test problem, worth the bulk of the grade, and simply ignore the rest. The problem is, what is ignored is oftentimes just as important.
Case in point: Units and dimensions
“I got the right value, why are you taking five points off?”
“Sir, please note that the problem asks you for a length. In your answer, you are expressing it in kilograms…”
“Yeah but… five points?”
The value of any physical quantity can only be meaningfully expressed in appropriate units, reflecting underlying physical dimensions. If I were to pick the single thing that a freshman student should retain from a physics course, I would choose this. If students do not understand that a length cannot be expressed in kilograms, that force has dimensions of mass multiplied by length divided by time squared, that quantities with different dimensions cannot be added together, and that the first check of the correctness of an answer is dimensional consistency, then I do not care how well they have memorized the rest, or how proficiently they can apply formulae and do algebra — they have not learned anything.
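The dimensional bookkeeping described above is mechanical enough that it can be sketched in a few lines of code. The following is a minimal illustration of my own (not something from any curriculum): dimensions are tracked as exponents of mass, length, and time, so that multiplying or dividing quantities adds or subtracts exponents, and adding quantities with mismatched dimensions is flagged as an error.

```python
# A minimal sketch of dimensional consistency checking.
# A dimension is a tuple of exponents (mass, length, time);
# e.g., force = mass x length / time^2 is (1, 1, -2).

from typing import NamedTuple


class Dim(NamedTuple):
    mass: int
    length: int
    time: int

    def __mul__(self, other):
        # Multiplying quantities adds dimension exponents.
        return Dim(*(a + b for a, b in zip(self, other)))

    def __truediv__(self, other):
        # Dividing quantities subtracts dimension exponents.
        return Dim(*(a - b for a, b in zip(self, other)))


MASS = Dim(1, 0, 0)
LENGTH = Dim(0, 1, 0)
TIME = Dim(0, 0, 1)

# Newton's second law: F = m * a, with acceleration = length / time^2.
force = MASS * (LENGTH / (TIME * TIME))
assert force == Dim(1, 1, -2)  # mass x length / time^2, as stated above


def add(d1: Dim, d2: Dim) -> Dim:
    # Quantities may only be added if their dimensions agree.
    if d1 != d2:
        raise ValueError("cannot add quantities with different dimensions")
    return d1
```

The point of the sketch is only that the check is purely structural: a student expressing a length in kilograms fails it immediately, before any algebra or arithmetic is even examined.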
However, in a context where students are tested by means of problems, physical units and dimensions are oftentimes simply ignored, the rationale being “Come on, how important can that crap be? One point out of ten at the most…”.
This is just an example of what, in my opinion, is the worst drawback of problem-based tests. A problem aims at probing too much at once. Because partial credit must be assigned, major holes in one’s background can go undiagnosed.
As I shall try and convey in the second part of this long and boring post, MCTs allow one to design questions targeting specific aspects. Knowing that they could be tested on those aspects is a powerful incentive for students not to ignore them.
 Disclaimer: what is written here is only drawn from my own personal experience. I am sure that there exists a body of scholarly research on this subject, but this is just my blog, not a scholarly research journal.
 There are some (very few, in fairness) students who take the notion of partial credit to an extreme. The idea is simple: at test time, simply write stuff, attempting to give a deceptive impression of knowledge and reasoning by filling a page or two with essentially random text and formulae, with little or no regard to their relevance to the problem at hand (this includes transcribing verbatim lines of text memorized from the textbook or, in the case of an open-book exam, copied on the spot) — then complain bitterly if that does not earn at least 50% of the credit for the problem.
The solution to a physics test problem written by someone who has spent the time needed to study the course material, chewed on it, and practiced by working out exercises is typically succinct, clear, and precise; it includes only relevant information and conveys to the grader a sense of understanding and mastery of the subject.
 Yes, I know, I am the professor and can take off as many points as I like. That is not the issue. By the time it comes to that, it is too late. A good testing scheme should encourage students to study the right way.