
Posts Tagged ‘h-index’

IITs not world class — but just how good are they?

May 24, 2011

`IIT faculty are not world class,’ said Jairam Ramesh, Minister of Environment. Ramesh, himself an IIT graduate, also mentioned the IIMs together with the IITs, and his lament was about the quality of research done at these places. He believes that the IITs do not do good quality research, and thus do not attract the best people.

I decided to take a quick look at the research record of the IITs, as measured by the Web of Science (subscription required). The way I did it was to search for Kharagpur in the Address box, refine it by Institutional Affiliation to the IIT, and break the search into two parts, one for 1986-2000 and the other for 2001-2011.

I was slightly surprised by the results — about 4500 papers for the first lot, and 9500 papers for the second lot. So the number of papers has doubled. The citation count was about 24k? for the first lot and 37k? for the second lot. And the h-index for each lot is about 60. This was somewhat higher than I was expecting, but perhaps not that high, given that IIT Kharagpur has about 470 members of faculty — so 9500 in 10 years is about 2 per faculty per year, an acceptable number. But I am not sure how to interpret the number of citations or the h-index. I would have expected somewhat lower figures for a `bad’ research institute of this size and age, so perhaps it’s not that bad. But how good is it?
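For what it’s worth, the back-of-the-envelope arithmetic above is easy to redo. Here is a minimal sketch in Python, using the rough figures quoted in this post (the counts and the faculty strength are approximate, not exact Web of Science output):

```python
# Back-of-the-envelope productivity estimate for IIT Kharagpur, using the
# rough figures quoted above (approximate counts, not exact WoS output).
papers_2001_2011 = 9500   # approximate paper count for 2001-2011
faculty = 470             # approximate faculty strength
years = 10

per_faculty_per_year = papers_2001_2011 / (faculty * years)
print(f"~{per_faculty_per_year:.1f} papers per faculty member per year")  # ~2.0
```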

Perhaps I’ll do the same search for the other `old’ IITs and try to find some meaning in the figures.

With all the caveats about indices, of course.

Indices again

April 23, 2011

Via Nanopolitan: Current Science carried a letter by Diptiman Sen and S. Ramasesha, two physicists at the IISc, pointing out that the h-index is not a good `scientometric’ index. Unfortunately, the arguments they have used to establish that are not all equally good. In particular, they suggest that the Nobel Laureate V. Ramakrishnan has a low h-index. This was picked up on by two other scientists, who point out that the actual figures are not particularly low. And in between the poor arguments and rejoinders, many other important arguments against using the h-index got lost.

Some of these other arguments were given here and here, and many more can be found all over cyberspace and the blogosphere. My arguments against scientometric indices in general, and the h-index (and journal impact factor) in particular, are similar to those given in these links, but also try to take into account the conditions special to doing science in India. These are, in no particular order, the following.

1. Most (or all?) such indices are based only on the number of citations (total or per year), so no matter how a metric is designed, it is only the number of actual citations that enters it — any of these metrics is ultimately calculated from only one parameter. Or perhaps two, since the rate of growth of citations may be included. Some metrics include the number of papers (total or per year) as well. That is another parameter, but the number of citations is not completely independent of the number of papers, and the number of papers is usually not a good measure of their quality, so including it does not improve the quality of the index either.

2. Even if different ways of using the citation count (and paper count) lead to qualitatively different indices, the standard indices are still numbers associated with only one individual (or one institution), the one being evaluated. This cannot make sense on its own, since high or low values may be systemic. For example, mathematics produces fewer papers than medicine, or even than specialized branches of medicine like oncology, and consequently fewer citations as well. So any index that is to be applied to both mathematics and medicine has to take into account its behavior specific to each field, and thus requires some sort of comparison within the field. As far as I can gather, this is never done, either in the construction or in the usage of these indices.

3. Even if we can make a comparative index, for example by taking ratios or percentiles within a field, it is not likely to hold up against historical data. That is, given some index — h, g, or whatever — for some string theorist, we can come up with a `normalized’ one by taking its ratio with the same index for Witten, but the same normalization is not likely to make any sense for Born or Einstein, say. Of course they were not string theorists, and neither is `normal’ a word one should use for any index related to Witten. Still, the explosion of citations is a relatively recent phenomenon, related to the explosion of papers, so the variation of any index with time — for individuals as well as within fields — needs to be taken into account.

4. Many scientists work across disciplines, many more work across subfields. It makes no sense for such people’s work to be evaluated by a single index, as the index may have different ranges in the different fields or subfields. For example, someone working mostly in mathematics and occasionally in string theory may end up with an index which is low compared to string theorists and high compared to mathematicians. How should something like that be used?

5. Indices are used for different purposes at different career stages. So it does not make sense to use the same index for people at different stages of their career.

6. There may be `social’ factors in the rise of citation count of specific papers or individuals — some are obvious and `nearly academic’ ones, like the popularity or currency of a field or a problem — the bandwagon effect. Then there is the `club’ effect — I cite you, you cite me, friends cite friends — which can work wonders in small subfields. There may also be less academic and more career-oriented reasons — it is very likely that papers cite probable referees more often, so that a paper does not come back for revisions simply because `relevant literature was not cited.’ I would not be surprised if this mechanism gets reinforced for people with many collaborators — a paper might be rejected if it did not cite the papers of the referee’s collaborators.

There are also several issues special to Indian science, which have to do with how appointments and promotions are usually effected in India. As noted by G. Desiraju in the letter to Current Science,

it was possible, in the days before we had scientometric indicators, for committees of wise men to simply declare an incompetent as an outstanding scientist.

Unfortunately, it is still possible. But that is another discussion.

Vanity Index II

July 26, 2009

I missed a few things in my previous post on the h-index and citations.

  1. Michael Nielsen points out that the h-index is essentially redundant, because for most scientists it is nearly equal to half the square root of their total citations. So the h-index doesn’t carry any more information than the total number of citations. (A quick numerical check is sketched just after this list.)
  2. The total number of citations depends greatly on the field of research. Medicine and related subjects attract huge numbers of citations, mathematics very few.
  3. The same problem affects journal impact factors — medical and bio-technological journals have high impact factors in the 20’s and 30’s, while math journals languish under 2.
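A quick way to see Nielsen’s point numerically is to compute both quantities for a toy citation record. This is a sketch only; the citation counts below are invented, not anyone’s actual record:

```python
import math

# Made-up per-paper citation counts, purely for illustration.
cites = sorted([120, 85, 60, 44, 30, 28, 25, 19, 15, 12,
                10, 8, 7, 5, 4, 3, 2, 1, 1, 0], reverse=True)

# h-index: the number of ranks at which the paper still has >= rank citations
# (for a descending-sorted list this condition holds for a prefix of ranks).
h = sum(1 for rank, c in enumerate(cites, start=1) if c >= rank)

print(f"h-index = {h}")                                                  # 10
print(f"half the square root of total citations = "
      f"{0.5 * math.sqrt(sum(cites)):.1f}")                              # 10.9
```

The two numbers come out close here, which is the kind of agreement Nielsen has in mind.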

It seems like a sensible conjecture that the impact factor of a journal increases with the total number of citations per year in the field. It may be the same function for every journal, with different constant prefactors for different journals.

The total number of citations per year in a field should be a power of the total number of papers written in the field per year (the square seems likely).

Total number of papers should be directly proportional to the total number of researchers in the field. The proportionality constant may be different for different fields.

So the impact factor of a journal should be a function of the number of researchers in the field, multiplied by some constant factor depending on the journal. The function should be the same function for all fields.
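To make the chain of guesses explicit, here is the conjecture written out as a toy model in Python; the square-law exponent and all the constants below are placeholders of my own, not values fitted to any data:

```python
def papers_per_year(n_researchers, k_field):
    # number of papers assumed proportional to the number of researchers,
    # with a field-dependent constant k_field
    return k_field * n_researchers

def citations_per_year(n_papers, exponent=2.0):
    # total citations per year assumed to grow as a power of the paper
    # count; the square is the guess made above
    return n_papers ** exponent

def impact_factor(n_researchers, k_field, c_journal):
    # impact factor = a journal-specific prefactor times the same
    # function of field size for every field
    return c_journal * citations_per_year(papers_per_year(n_researchers, k_field))

# Purely illustrative numbers: the same journal prefactor applied to a
# large field and to a small one gives very different impact factors.
big = impact_factor(n_researchers=100_000, k_field=1.0, c_journal=1e-9)
small = impact_factor(n_researchers=5_000, k_field=1.0, c_journal=1e-9)
print(f"{big:.3f} vs {small:.3f}")   # roughly 10.000 vs 0.025
```

The only point of the sketch is that, under these assumptions, the impact factor factorizes into a journal-specific prefactor and a universal function of field size, which is exactly what the conjecture above says.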

On the other hand, if you want to have a high h-index, join a club where everyone writes a few (or several) papers every year, and cites everyone else in the club.


Vanity Index

July 13, 2009

The scientists at a local research institute (that shall remain nameless) spent the last couple of weeks calculating their h-indices. Some of them are probably still doing it.

While the impetus for this research came from the top — they were asked to submit their h-index, citations and total impact factor every year for the last five years — it seems that many if not all took a real interest in calculating their own h-indices and everybody else’s, too. And comparing them, of course.

The h-index was of course at the centre of discussions. Few of the scientists knew the definition of the thing, and many misunderstood it even when it was explained to them. Not that it is particularly difficult — your h-index is the largest number n such that you have n papers with n or more citations each. The way to calculate it is to arrange your papers in descending order of citations. Your h-index is then n if the paper at the n-th place in this list has n or more citations, and the paper at the (n+1)-st place has fewer than (n+1) citations. So if your h-index is 10, you have 10 papers with 10 or more citations, but not 11 papers with 11 or more citations.
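Since the definition seems to trip people up, here is the calculation spelled out as a short Python function. It is a sketch of the standard definition; the citation list at the end is invented just to reproduce the h = 10 example above:

```python
def h_index(citations):
    """Largest n such that n of the papers have n or more citations each."""
    cites = sorted(citations, reverse=True)   # descending order of citations
    h = 0
    for rank, c in enumerate(cites, start=1):
        if c >= rank:     # the paper at this rank still has at least `rank` citations
            h = rank
        else:             # first paper with fewer citations than its rank: stop
            break
    return h

# An invented record: 10 papers with 10 or more citations,
# but not 11 papers with 11 or more.
record = [55, 40, 33, 25, 21, 18, 14, 12, 11, 10, 9, 6, 3, 2, 0]
print(h_index(record))  # 10
```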

Of course you need to know the citations of your papers in order to do this. For high-energy physics, gravitation and cosmology or astrophysics, there is SPIRES. Or you could go to Google Scholar. There are also a couple of expensive options, the ISI Web of Science (WoS) and SCOPUS, neither of which may be accessed without an institutional subscription of about Rs.10 lakhs per year.

Of these, SPIRES is undoubtedly the best in many ways, if — and that is indeed a big if — you stick to only high-energy physics, and maybe gravitation and cosmology. Astrophysics was included in their database comparatively recently, so I am not sure how well astro papers are covered. But if your paper is cited by mathematicians — not particularly unusual — that citation is unlikely to get into SPIRES. Google records almost all papers and citations (yes, `almost’), but records many things more than once, so it is considered rather unreliable. WoS does not record citations from conference papers or unpublished preprints. I have not used SCOPUS, but I have been told by those who have that it is not better (in the sense of being more inclusive) than WoS.

The bosses generally prefer WoS, because a) they don’t do high-energy physics, and b) like many people they believe that if you pay lots of money for something, what you get is worth the money you pay. What they want is not just a number: they want to feel good about themselves, and they want something to boast about. If they have a high h-index or citation count, they feel comfortable advising the (relatively) junior scientists what they should work on, with whom, and so on, without being told to jump into the nearest body of water. Of course they are mostly safe from such abuse anyway, since the Indian middle class is unduly polite to people in power, and the service rules for government scientists can be used to take disciplinary action against those who dare utter such things. In fact, at the institute of this story the scientists were threatened with disciplinary action if they failed to submit the numbers. So the bosses don’t bat an eyelid when told that they have to squander millions of rupees of public money in order to get a boost to their vanity.

And they groom the next rung of scientists into this habit of comparing their h-indices and the lengths of their citation lists.