Last week my friend Sanjoy came to Pisa to visit us and give a three-day seminar. At dinner with a few colleagues, we started discussing academic careers in Italy, and how difficult it is to obtain a position (currently, there are none open). My younger colleagues were discussing “how many papers you need to get a position”, a common “game” among young researchers, and Marco observed that as recently as 10 years ago, the average requirements (and the expectations) were much lower than today: a couple of journal papers were enough to become an assistant professor, 6 for associate professor, 12 for full professor. Now, 10 journal papers may not be enough for an assistant position! It seems that people are publishing much more, and much more frequently, and correspondingly the bar is getting higher and higher. I will not get into the discussion of why this is happening and whether it is good or bad (maybe in a future post).
Inevitably, we ended up talking about the Hirsch index (or h-index) for evaluating a researcher's performance. This index is very popular, although it has received a lot of criticism. The definition is:
A scientist has index h if h of [his/her] Np papers have at least h citations each, and the other (Np − h) papers have at most h citations each.
In practice, you count the citations of each of your papers, sort the papers in decreasing order of citations, and then find the largest position h in the sorted list such that the h-th paper has at least h citations; the (h+1)-th paper, if there is one, then has at most h citations.
The popularity of this index is probably due to the fact that it is easy to calculate and easy to understand: many online databases offer a service for calculating it automatically. There are also many critics of the h-index, and I am one of them: it depends on the researcher's age, so it tends to underestimate the performance of young researchers; it tends to overestimate people who publish a lot; it strongly depends on the research area; it also depends on the database [1].
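The procedure above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `h_index` and the sample citation counts are my own, not taken from any database or library.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    # Sort the papers in decreasing order of citations.
    ranked = sorted(citations, reverse=True)
    h = 0
    # Walk down the ranking: the paper at position `rank` (1-based)
    # counts towards h only if it has at least `rank` citations.
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

In this made-up list the fourth paper has 4 citations, while the fifth has only 3, so the index stops at 4.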
Many other performance indexes have been proposed and many more will be in the future. Why? Why so much effort to measure the performance of academic researchers?
One of the main reasons is exogenous to the academic world. Politicians try to allocate money to the best researchers and to the best groups, so it is important for them (since they have no specific background to evaluate researchers directly) to obtain an “index”, something that they can use right away to compare individuals, groups, departments and universities. The Italian government, in particular, is finally building up a national evaluation process for universities and departments, and a good, robust performance metric (if such a thing existed) would be of great help.
Let’s focus on measuring the performance of a researcher. An important question is: should we consider the h-index a good measure of academic performance? For example, if a researcher has published only 3 papers with a large impact, with 1000 citations each, the h-index will be just 3. On the other hand, consider a researcher who has 20 papers, each one with 20 citations: his h-index will be exactly 20. Therefore, this index seems to favour researchers with a lot of good papers, although maybe none of them fundamental.
It is the old difficult question: quality or quantity? Then, Sanjoy pointed me to this article. Here is an extract:
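To make the two cases concrete, here is the same arithmetic written out. The notation is my own shorthand, not Hirsch's: c_(k) is the citation count of the k-th most cited paper, and N_p is the number of papers.

```latex
% h is the largest rank k whose paper has at least k citations.
h = \max \{\, k \le N_p : c_{(k)} \ge k \,\}

% Researcher A: N_p = 3 and c_{(1)} = c_{(2)} = c_{(3)} = 1000.
% The condition holds for k = 1, 2, 3, but there is no fourth paper:
h_A = 3

% Researcher B: N_p = 20 and c_{(k)} = 20 for k = 1, \dots, 20.
% The condition holds up to k = 20, since c_{(20)} = 20 \ge 20:
h_B = 20
```

In the first case the index is capped by the number of papers, no matter how large the citation counts are.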
The psychologist Dean Simonton argues that fecundity is often at the heart of what distinguishes the truly gifted. The difference between Bach and his forgotten peers isn’t necessarily that he had a better ratio of hits to misses. The difference is that the mediocre might have a dozen ideas, while Bach, in his lifetime, created more than a thousand full-fledged musical compositions. A genius is a genius, Simonton maintains, because he can put together such a staggering number of insights, ideas, theories, random observations, and unexpected connections that he almost inevitably ends up with something great. “Quality,” Simonton writes, “is a probabilistic function of quantity.”
Yes, I think that quality is a probabilistic function of quantity (the key is in the probability). It was true for Bach, Leonardo da Vinci, Mozart, Newton, Gauss and Euler. However, sometimes it is not true; Einstein is maybe the best example: he published a relatively small number of papers with an extraordinary impact. Also, many mathematicians fall into this category (with the notable exception of Erdős). In conclusion, I think we may find many examples of geniuses for whom quality = quantity, and many examples for whom quality != quantity. I think that Simonton concentrates on one very specific aspect of genius. But this concept is difficult to define, capture, encapsulate.
Going back to the h-index: if we are in search of the pure genius, then the h-index is probably of no help; an academic genius (especially a young one) can be recognised by his peers without any index, and can be missed by any index. A performance index is probably more useful for distinguishing mediocre researchers from bad ones (and we also need mediocre researchers!); the problem is to find the “perfect index” (if such a thing exists…)
[1] Lutz Bornmann and Hans-Dieter Daniel, “The state of h index research. Is the h index the ideal way to measure research performance?” DOI: 10.1038/embor.2008.233.