Do research and teaching go together? Do accomplished researchers generally also make (more) effective classroom instructors at the post-secondary level? Does the research experience, and the first-hand, in-depth knowledge of a subject that one acquires over years, even decades, of investigation (e.g., in the laboratory, or at the computer), have a noticeable effect on one’s ability to convey general, basic concepts to students, especially in a classroom setting?
Personally, the longer I am in this profession, the more convinced I become that the two should go hand in hand. Why is that?
Well, the way I would put it is the following: attempting to elucidate what is unknown has the effect of broadening, reinforcing and consolidating one’s comprehension of what is known. And ultimately, one’s ability to teach something directly reflects one’s comprehension of what is being taught. To put it differently, the inability of many professors to explain things clearly is oftentimes attributable to the simple fact that they themselves have not fully understood them. That is why they fumble on the oddball question (the one not answered in the textbook), cannot think of examples or test problems on their own and constantly resort to data banks, are unable to present things in a way different from the “canonical” one, and so on.
I cannot help feeling that a person who has carried out a research project in a specific area has explored aspects of a subject, and understood its connections with other areas, that are simply not illustrated (or emphasized) in textbooks. There is no question that the subjects I feel most comfortable explaining are the ones in which I have engaged as a researcher. Furthermore, it seems plausible that being active in research should impart the ability to relate even the most elementary topics to cutting-edge fields of research, in turn conceivably generating enthusiasm and interest among students.
Of course, the above is just my opinion, one possibly shared by a large fraction of my colleagues, but not one founded on actual data (there being, to my knowledge, no actual data upon which to found it).
In any case, just as in any human endeavour, no one should expect any rule to be “hard-and-fast”. We have all had, in college or even in graduate school, brilliant teachers, or at any rate instructors who we felt were very effective, who were not especially accomplished researchers. We have all heard it stated many times that “a great researcher does not always a great teacher make”. And there seems to be no reason why a person with a solid background in some general field of study, having gone through years of formal education, and embodying that elusive blend of above-average intellect, communication skills, personal charm, charisma and what have you, should not succeed at explaining concepts, perhaps even more so than many others with superior investigative abilities.
But shouldn’t there be at least some connection between the two? Isn’t the merging of, and interplay between, research and teaching one of the tenets of university education, not just in North America but to some degree virtually everywhere else in the world?
Answering such a question in a way that may even be regarded as tentatively objective, never mind scientific, seems like a hopeless task, primarily and fundamentally because of the difficulty of arriving at a universally acceptable definition of what makes one a good “researcher”, or (even worse) a good “teacher”.
It is all about ’em numbers…
There is no question, however, that these days people are eager to measure — no, to assign a numerical value to everything. University professors are evaluated on both their teaching and research effectiveness by means of procedures which are ultimately summed up in a single number. In the case of teaching effectiveness, that number (typically from 1 to 5) expresses students’ average overall satisfaction with that instructor, and is the main outcome of student evaluation of instruction (SEI).
On the other hand, Hirsch’s h-index, also a number (in this case a non-negative integer), is rapidly becoming a popular tool to assess a researcher’s standing within the community, as captured by the person’s citation record.
Let me state at this point what this post is not about: I am not going to discuss here the merit, validity or reliability of either number. Make no mistake, I do have an opinion on that (how could I not), but that is not the point of this post. The fact is, both numbers are used for the above-stated purposes, and while they surely do not, and ought not to, tell the whole story, they are nonetheless taken seriously as reliable broad indicators. On this premise, it therefore appears possible to make a first attempt at providing a quantitative, objective assessment of the relationship between teaching and research, by studying how these two numbers vary together, if at all.
This is what this post is all about: are researchers with a high h-index more likely to score better in SEI? Is there any correlation between the two? I have attempted to establish just that by carrying out a statistical analysis of the record of a sample of one hundred university professors. The rest of this post describes the methodology, results and conclusions of this (amateur) study. I am obviously very interested in any similar study conducted by “professionals”.
One of the interesting things about these two numbers, is that they are both effectively publicly available. The h-index of a researcher can be obtained through WebOfScience.
Obtaining a professor’s SEI average is in principle not as straightforward, as many (most) universities do not publish that information (at least nowhere easily accessible). However, there exists a web site called RateMyProfessor (RMP) where students can freely and anonymously rate their instructors. Even though I am not sure how controlled or careful the collection of such opinions is, I think it is safe to expect that, for professors receiving a reasonably large number of ratings, the average ought to approach that of the SEI as formally administered at the person’s institution. That is certainly my personal observation, and I am aware of at least one study that claims to have reached that conclusion.
So, I went online, and armed with all my patience I collected data for one hundred university physics professors based in the US and Canada. I chose physics as a discipline because it is my own, and that made it easier for me to resolve some ambiguities with the proper assignment of the h-index in some cases (see below). I see no obvious reason why physics should differ from any other scholarly discipline in that respect. I selected professors to be included in the sample by picking a university first, going through the names of instructors in the physics department as listed on the RMP web site, and considering each and every one who had received at least twenty ratings, on the assumption that such a number would be large enough for the average to be fairly robust.
I included in my sample individuals who are currently listed as members of the faculty (assistant, associate or full professors) on the web site of the physics department of their institution, i.e., I did not include lecturers, retired or adjunct faculty. For each one of these individuals I recorded the “overall quality” average (a number between 1 and 5), as well as the h-index, as retrieved from WebOfScience. I stopped as soon as the size of my sample reached one hundred. In the end, it includes names from twenty-three different universities.
It seems a reasonably large and unbiased sample to me, but I am no statistician and therefore welcome any criticism.
The sample averages are:
RMP student evaluation “overall quality” measure (RMPOQ): 3.2 with a standard deviation of 0.8.
Again, the RMPOQ measure is a number between 1 and 5, and the fact that its sample average falls so close to 3 seems to support the notion that the sample is large and unbiased. The sample median is 3.15.
h-index: 30 with a standard deviation of 15. Values of the h-index are scattered fairly widely, the lowest being 4, the highest 66. I have also noticed for the first time something that others had mentioned to me before, namely that the h-index takes on different values, on average, in different sub-fields of physics. However, it is not clear to me that that should affect its correlation with teaching. The sample median is 29.6, very close to the average in this case too.
The above scatter plot shows all the data points in the sample, and basically tells the whole story, as far as this particular exercise goes. Let us go through it in detail. Each red dot represents a professor in the sample, with his/her value of RMPOQ (horizontal axis) and h-index (vertical axis). The two lines crossing in the central part of the plot represent the two average values for RMPOQ and h-index. These two lines divide the plot into four quadrants.
If the two variables considered here (RMPOQ and h-index) were somehow “correlated”, then one would expect red dots to gather in one or two of the four quadrants. If, for example, a professor with a higher-than-average h-index were also more likely to have a higher-than-average RMPOQ, then dots would cluster preferentially in the upper right quadrant, leaving the upper left one “emptier”.
The correlation could be even stronger, namely a professor with a lower-than-average h-index could also be more likely to have a lower-than-average RMPOQ, in which case the upper right and bottom left quadrants would feature most of the dots. Alternatively, if the two variables were “anti-correlated”, namely if a higher-than-average value of one meant a likelier lower-than-average value of the other (and/or vice versa), then dots would fall more often in the upper left and/or bottom right quadrants.
On the other hand, if the density of dots in the four quadrants is very nearly the same, one ought to conclude that the two variables are essentially disconnected from one another, i.e., that knowing the value of one gives no clue as to the value of the other.
This, in turn, carries the strong implication that whatever it is that one of the two variables measures has little or no relevance to whatever it is that the other one is believed to assess.
Now, it must be stated clearly that, when it comes to social phenomena, correlations are usually weak, i.e., one is often in the situation in which the density of dots in the four quadrants is only slightly different. The data in this sample, however, seem to indicate an exceedingly weak correlation. If we count how many red dots there are in each quadrant, we find very nearly a quarter of them (i.e., approximately 25) in each one.
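The quadrant bookkeeping described above is easy to reproduce. Here is a minimal sketch in Python using synthetic data (my actual dataset is not reproduced here; the sample size, means and standard deviations are simply the ones quoted above), together with the Pearson correlation coefficient, a standard single-number summary of the same idea:

```python
import numpy as np

# Synthetic stand-in for the real data: 100 professors, with RMPOQ and
# h-index drawn independently, roughly matching the sample statistics above.
rng = np.random.default_rng(0)
rmpoq = np.clip(rng.normal(3.2, 0.8, 100), 1, 5)            # "overall quality", 1-5 scale
h_index = np.clip(np.round(rng.normal(30, 15, 100)), 4, 66)  # h-index, 4-66 range

# Divide the scatter plot into four quadrants at the sample means.
x0, y0 = rmpoq.mean(), h_index.mean()
quadrants = {
    "upper right": int(np.sum((rmpoq > x0) & (h_index > y0))),
    "upper left":  int(np.sum((rmpoq <= x0) & (h_index > y0))),
    "lower left":  int(np.sum((rmpoq <= x0) & (h_index <= y0))),
    "lower right": int(np.sum((rmpoq > x0) & (h_index <= y0))),
}
print(quadrants)  # independent variables -> roughly 25 dots per quadrant

# Pearson correlation coefficient: values near 0 correspond to the
# quadrant densities being nearly equal.
r = np.corrcoef(rmpoq, h_index)[0, 1]
print(f"Pearson r = {r:.2f}")
```

Since the synthetic variables are generated independently, the counts come out close to 25 per quadrant and r close to zero, which is what the quadrant picture for the real sample looks like.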
In other words, RMPOQ and h-index are ostensibly unrelated. Assuming that this is not due to some artifact, i.e., some hidden bias in my methodology, or other errors on my part (which is fine, this is just a blog and I am merely an amateur), one may wish to try to understand why the correlation is so weak.
Obviously, the most direct conclusion is that the two variables are indeed independent, i.e., there is no correlation between one and the other. This is not something to take lightly, however, it seems to me. If indeed the two quantities considered here furnish a cogent, relevant assessment of what they purport to measure, namely teaching and research effectiveness, then the conclusion would be inescapable that a good, experienced researcher is no more likely than a mediocre or inexperienced one to be an effective instructor. It would also mean that, if excellence in both research and teaching were a requirement for promotion and tenure in academia, one ought to expect that, in general, only about a quarter of all professors would pass the test…
Now, I have often heard colleagues and people outside academia alike lament the fact that research is an impediment to quality teaching, as active researchers are not interested in teaching, largely see it as a hindrance to research, and try to get it over and done with, often doing a sloppy job. Well, the above study does not support such a contention either, in that the h-index and the RMPOQ do not appear to anti-correlate any more significantly than they correlate. In other words, a good (or bad) researcher appears equally likely to be a good or a bad teacher. Perhaps the two activities hinge on different, possibly even antithetical qualities, abilities, character traits.
Anyway, if being a good researcher were ultimately established, or came to be accepted, as having no real bearing on one’s teaching performance, then, given the central role played by teaching in the mission of any institution of higher education (including research universities), this would constitute a powerful argument for swelling the university ranks of teaching scholars, lecturers, or in any case academics whose main charge is teaching, not research.
Presumably, such an increase would come at the expense of the institutional investment in research, chiefly research personnel (not only faculty but conceivably also postdoctoral associates, graduate students, research support staff, and in general anyone whose job description does not contain a strong teaching component).
One may even imagine separating the research and teaching careers, as some university systems in fact do. The far-reaching consequences that this would have on the research effort of a nation can hardly be overstated.
Are we really ready to concede the above, though? That teaching and research have nothing to do with one another? That research ability may perhaps have an impact on graduate education, which has a strong research “training” component, but hardly makes a difference in the classroom? Is the American university system, one so heavily based on research, in fact almost built on the premise of the marriage between research and teaching, not the most successful in the world, by any accepted measure? Is it short-changing its own students by putting in front of them a class of instructors selected mainly on the basis of research potential, who may or may not have what it takes to deliver in the classroom?
There is, of course, an alternative interpretation of the above data, namely that one of the two quantities considered here, or possibly both, do not really measure what they aim at (or claim to be) measuring. But that is for another post.
 For example, it is not at all clear to me how (or even whether) the site prevents a single person from submitting multiple ratings for a given professor, or whether it is possible for someone to make up phoney information and rate an instructor by whom the rater has not really been taught.
 I just picked names of institutions as I thought of them, in no particular order. I mostly tried to stick with research universities, but I also included one that is more oriented toward teaching. For quite a few physics departments (typically those top rated) I could not find a single instructor, at the RMP web site, with my minimum required number of ratings. My non-scientific observation is that students at large state universities, especially those not prominently ranked, are more eager to rate their professors than students at Ivy League schools.
 That is not always straightforward or unambiguous. In some cases, a person’s name is so common that it generates a lot of entries (i.e., published articles), and the output of the “Create citation report” function is not always reliable. Even though criteria such as the field of research or the institution can be used in principle to identify only the entries exclusively attributable to the person in question, in practice sometimes I could not resolve the ambiguity to my satisfaction, and therefore did not include that name in my sample.
 I am obviously happy to share the data that I have collected with anyone interested in checking them out. However, I have no experience with research conducted on human subjects, and am thus unfamiliar with any legal restriction on disseminating data such as these, wherein people are referenced by first and last name. I must emphasize, however, that I have not used anything that is not publicly available. RMP data are, of course, visible to anyone with an internet connection, whereas h-index data require a subscription to WebOfScience; I believe, however, that all that is involved is a trip to a reasonable university library.
 One might argue that the h-index of a professor in the sample considered here can be lower than the sample average for two reasons: one is a lower than average research effectiveness (as assessed by citations), the other is the younger age of the individual, who may be very talented in research but simply may not have had the time to establish a rich citation record. Thus, the h-index in my sample is not necessarily a measure of research talent and ability, as much as a combination of that and research experience.
Even so, however, one would expect a more experienced researcher to be a more effective teacher, it seems to me.