Imperfect, but measures nonetheless

Evaluating scholarly work, as well as the potential for future success of an academic at a research university, whether a professor, a postdoctoral associate or a graduate student, is certainly a complex proposition, one that regularly results in bitter disputes among academics.
In academia, one’s fortunes at various career stages depend heavily on the assessment and recommendation of relatively few individuals with different degrees of influence, from supervising faculty, to senior colleagues, to prominent peers. For a long time, however, the need has been felt to incorporate into the evaluation more objective criteria, based on independent measures of productivity, as well as of the impact of one’s work on a field of inquiry.

In the sciences, the most important such measure is arguably the number of publications in peer-reviewed, international journals, with an eye to aspects such as the number of authors per article, order of authorship, journal impact factor, etc. An obvious problem with merely counting papers is that it rewards quantity at the expense of quality.
Lately, therefore, increasing weight is also given to criteria aimed at assessing the actual relevance of such publications, and it is generally accepted that the number of citations that an article receives over time is an indicator of its impact. Hirsch’s h-index is an attempt at capturing in a single number both raw productivity and how influential one’s work is to others in the same discipline. Other quantitative measures include the number of invited talks (especially at major international conferences), success at securing extramural funding for research, scholarly prizes, memberships in prestigious societies, the number of students and postdoctoral associates supervised, and so on.
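To make the definition concrete, here is a minimal sketch, in Python and purely illustrative (the function name and the example citation counts are mine, not drawn from any real citation database), of how the h-index is computed from a list of per-paper citation counts: h is the largest number such that at least h papers have at least h citations each.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    # Sort citation counts from most- to least-cited paper.
    counts = sorted(citations, reverse=True)
    h = 0
    # Walk down the ranked list: as long as the paper at rank r has
    # at least r citations, the h-index is at least r.
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical example: five papers cited 10, 8, 5, 2 and 1 times
# give an h-index of 3 (three papers with at least three citations each).
print(h_index([10, 8, 5, 2, 1]))  # -> 3
```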
The utilization of such measures is often bitterly resented by scientists and academics, who see it as an attempt, typically on the part of bureaucrats, to reduce a complex and multi-faceted endeavor like scholarly research to a few numbers (or even just one). The point is frequently made that any such measure is inevitably biased, incomplete, and easily skewed by factors that have little to do with whatever the measure is aimed at assessing.

I think that most of us should welcome and support broad adoption of indicators of this kind, as long as they are properly utilized (just like everything else). If anything, my main objection is that all the quantitative measures formulated so far do not go far enough, and are still overwhelmingly influenced by pervasively accepted (stated or implied) rankings of institutions and individual researchers.

To be sure, the general objections mentioned above have some validity. In physics, for example, tenure-track assistant professors enjoying generous start-up funds will be able to establish sizable research groups, in turn churning, er, producing publications and receiving citations at a higher rate than their competitors with access to more modest resources. Their greater productivity will, later on, also generate additional research funds, invited talks and so on. Other factors, such as geographical location, access to a large pool of graduate students, proximity to major experimental facilities, as well as operating from within a more prestigious institution, establish almost immediately productivity gaps between scientists. These gaps are only likely to widen with time, as the more successful and productive members of the community will also elicit greater interest and recognition (quite understandably), hence even more success. It is, in essence, a “self-fulfilling prophecy”, and all existing numerical measures will do nothing but reflect such a state of affairs. Similar considerations hold for graduate students or postdoctoral researchers at more prominent universities [0].

Numerical indicators should clearly always be used carefully, and are no replacement for a more comprehensive assessment, one also based on the opinion(s) of experienced researchers who have had the chance to observe someone “in action”, ideally over extended periods of time. However, for all their unquestionable faults and limitations, indices such as the h-index have the merit of at least taking a first, albeit timid, step in the right direction, i.e., pushing back against the pressure of a highly stratified, hierarchical academic establishment, which feeds off a system of evaluation in which only (or mostly) the opinions of a few matter. Indeed, the introduction of the h-index was aimed mainly at taking emphasis away from the journal, to place it instead on the actual impact of individual articles, as assessed by the number of citations. This is precisely because of the widespread feeling that acceptance of an article submitted to a prominent journal is largely a function of the prestige of the authors and/or of their affiliations [1].

The adoption of such indicators, which may act at least in part as an equalizer, may well represent the only hope for many a talented, especially young, researcher operating at a semi-obscure place to see their work recognized (Disclaimer: no, this is not self-referential… I am neither particularly talented nor young). And this issue is of crucial importance for graduate students and postdocs at second-tier institutions; their only decent chance at a research career rests with their publication and citation record, for in all likelihood the letter of recommendation of their supervisor alone will not take them very far. But of course, there must also be a formal system in place whereby their productivity will be acknowledged, recognized and impartially compared to that of their competitors by search committees everywhere, or all that they have done will amount to nothing.
In order to get a taste of how bad things can get, one need only look, for example, at the Italian university system, which has been the target of intense criticism for decades precisely because of the almost complete arbitrariness with which hirings occur, typically in blatant disregard of any internationally accepted measure of academic productivity or success (see, for instance, this article on the subject).

Again, the use of indicators is not by itself the solution; it has problems, we all agree on that. For example, there will never be any guarantee that one’s published work will be cited, especially if it appears in a less widely read journal [2]; researchers at smaller institutions will always be at a disadvantage, and so on. Still, trying to emphasize someone’s actual record over what so-and-so big-shot professor says seems like a worthwhile goal; making academic evaluation less of an “old boys’ club” affair, whereby only a chosen few are hired, publish in prominent journals, get funded and win prizes, is likely to benefit the academic enterprise as a whole.

Notes
[0] I completed my PhD at a good, but not top-ranked, physics department, and then moved on to a prominent one for my first postdoc. One of the things that struck me right away was how much easier it suddenly became to get papers accepted in Physical Review Letters. Many friends and colleagues who followed a career path similar to mine have told me that they noticed the same.
[1] For example, in the case of the scientific fraud involving Bell Labs’ Hendrik Schön, some attribute the fact that so many of his papers were less than thoroughly reviewed to the reputation of both the laboratory at which he was working and of his senior collaborator. See, for instance, this article.
[2] Actually, in the days of arXiv, Google Scholar, Web of Science and many others, there is really no excuse not to cite the work of others.
