Google Scholar Citations

Thanks to Bee of BackReaction, I have become aware of a new feature of Google Scholar, called Google Scholar Citations. It is essentially a free alternative to Web of Science (WoS), allowing researchers to create a public profile, with a list of all of their publications, including the citations garnered by each article.
(This is mine, by the way).

Below is an excerpt from the post on Google Scholar’s Blog, announcing this new service:

“… you can quickly identify which articles are yours, by selecting one or more groups of articles that are computed statistically […] we collect citations to your articles, graph them over time, and compute your citation metrics – the widely used h-index; the i-10 index, which is simply the number of articles with at least ten citations; and, of course, the total number of citations to your articles…”
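
For readers unfamiliar with the metrics named in the announcement, here is a minimal Python sketch of how they can be computed from a list of per-article citation counts. The function names and the sample counts are made up for illustration; this is only the textbook definition, not Google's actual code.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` articles all have >= rank citations
        else:
            break         # counts only decrease from here on
    return h


def i10_index(citations):
    """Number of articles with at least ten citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical citation counts for seven articles (not real data).
sample_counts = [48, 33, 12, 10, 9, 4, 0]

print("h-index:", h_index(sample_counts))
print("i10-index:", i10_index(sample_counts))
print("total citations:", sum(sample_counts))
```

Note that the h-index only changes when an article crosses the current threshold, which is why a few extra citations per article can move it by several units.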

I am sure that not everything that Google does is good, but I cannot imagine this new service not being welcomed by the vast majority (totality ?) of researchers. The notion of having to log into a secure site and pay money (indirectly through my institution) to access information that is really meant to be public, such as number of articles published by scientists and their citations, seems to me to be another vestige of a time past — like scientific journals printed on paper. Granted, it is clearly a way to suck more people into using Google services (one starts with Google Scholar, then tries out Google Apps, then Mail… but I guess it is called market strategy). Still, I like the concept and applaud the effort. I think that the scientific community badly needed something like this, and I hope that this tool becomes widely adopted soon.

The implementation may require some fine tuning, but, hey, we are patient, are we not ?
The one thing that I have noticed (obviously it is also discussed in the BackReaction post mentioned above, as well as here) is that the computed h-index differs from that yielded by WoS [0]. I have not carried out a scientific analysis (I am sure some nerd scholar will go through this exercise soon), but my “gut feeling” is that the difference should typically be of the order of 10-15%. Thus, I am not sure whether this is going to be a huge issue in the end, because numerical measures such as the h-index (of which I am a big fan) should be taken with a grain of salt. That means that an “error bar” should be almost automatically ascribed to any value of h, and I think 10-15% is probably a good rule of thumb anyway.
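
As a quick check of that rule of thumb against the one data point given in footnote [0] (Google Scholar 28, WoS 25), the relative difference can be worked out as below. The function name is made up for illustration, and a single pair of numbers obviously proves nothing about the general case.

```python
def relative_difference_percent(google_scholar_h, wos_h):
    """Excess of the Google Scholar value over the WoS value, in percent of WoS."""
    return 100 * (google_scholar_h - wos_h) / wos_h


# Values quoted in footnote [0] of this post.
print(relative_difference_percent(28, 25))
```

For these two values the excess is 12%, which sits inside the 10-15% range guessed above.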

I have not (yet) read any attempt to explain where the difference between the h-index computed by Google Scholar and WoS may originate. Off the top of my head, I can think of one reason why Google Scholar ought typically to output a greater number. If one searches on Google Scholar how many times one’s article is cited, Google Scholar lists all cites found anywhere on the Web, including citations to that article made in unpublished material, such as doctoral theses, as well as in articles posted in preprint form, typically uploaded to the ArXiv repository.

Now, my position on citations is expounded in this old blog post of mine: A citation is a citation is a citation, in the sense that it does not really matter who cites the article (including the author herself), nor where the citation occurs (for example, citations made in preprints or theses are fair game, in my view). However, one should be careful not to count the same citation more than once, and with Google Scholar that is a bit tricky. A citation to a given article made in a preprint is counted separately from the same citation made in the subsequently published article arising from said preprint, i.e., the same citation is counted twice.
It seems to me that this should be easily fixable — and like I stated above, I doubt if a difference of this magnitude is even worth getting all worked up over [1].
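
One way such a fix might look is to compare the titles of the citing works and count a preprint and its published version only once. The Python sketch below is purely an illustration under that assumption: the function names and the titles are invented, and nothing here reflects how Google Scholar actually processes its data.

```python
import re


def normalize_title(title):
    """Lowercase a title and reduce punctuation and spacing to single spaces."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", title.lower())
    return cleaned.strip()


def count_unique_citations(citing_titles):
    """Count citing works, treating titles that normalize identically as one."""
    return len({normalize_title(title) for title in citing_titles})


# Hypothetical citing works: the first two stand for a preprint and the
# journal article that grew out of it.
citing_titles = [
    "Superfluidity of Lattice Bosons",   # preprint
    "Superfluidity of lattice bosons.",  # published version
    "Quantum Monte Carlo for Cold Atoms",
]

print("raw count:", len(citing_titles))
print("unique count:", count_unique_citations(citing_titles))
```

A comparison this naive would miss papers whose title changes between preprint and publication, and could wrongly merge distinct papers that happen to share a title, so it is at best a first pass.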

Other issues ? Well, I have been doing some reading (mostly on blogs) over the past few hours, regarding problems with finding individual researchers on Google Scholar based on their last names. I think that the most obvious present limitation is that they have to have Google accounts, but I am thinking that this will not be an issue for long (the whole world is getting Google accounts soon… right ?). In terms of differentiating individuals with the same last name, my case is similar to Bee’s, in that my last name is fairly uncommon, so finding myself was a piece of cake. Obviously, in general it is far from trivial, but I do not think that it is any easier in WoS.


[0] In my case, Google Scholar gives 28, whereas WoS gives 25, the reason being that I have a bunch of papers cited 25 times which have already been cited a few more times in as yet unpublished ArXiv preprints. Google Scholar counts those, WoS does not. I expect those preprints to eventually become published work, at which point WoS will add those citations. Problem is, Google Scholar will add them too… again (see above).

[1] There — so much for “I shall never end sentences with prepositions !”…


11 Responses to “Google Scholar Citations”

  1. GMP Says:

    This is very cool. I have so far been using Harzing’s Publish or Perish, which is also free, collects citations from Google Scholar and essentially does the same thing, plus provides some other metrics. But the Google Scholar Citations profile seems quite fancy, and it’s a nice thing to add to one’s website, for instance.

  2. Transient Reporter Says:

    Phases of lattice hard-core bosons in a periodic superlattice

    I prefer soft-core bosons.

  3. Azeem Majeed Says:

    I was able to obtain a Google Scholar Citations page when the service was first launched earlier this year. I already have both an Imperial College personal homepage that describes my academic work and a publications page that lists my publications (Imperial College London & Imperial College London publications). I find the extra functionality in Google Scholar Citations very useful in tracking citations. I’ve now encouraged my colleagues in the Department of Primary Care & Public Health to also sign up for the Google Scholar Citations service.

  4. David Says:

    For a social scientist like me WoS and Scopus both give an h-index of 19 while Google Scholar gives 28. So the relative coverage of these databases varies by discipline. For biologists the h-index is likely to be very similar across the databases. For book based fields GS will give a much higher score.

    • Massimo Says:

      Well, to me the issue is, is the greater number yielded by GS accurate ? If there are legitimate citations that WoS and other databases miss, well, I say more power to Google. Citations in books and theses in my opinion are meaningful and legitimate, and should be counted. In my opinion, however, there is an obvious issue with double counting, at least in physics.

  5. TG Says:

    For me GS gave around 100% more citations than web of science. I think several of the main reasons were already noted — double counting arxiv & the refereed papers, and finding citations to a bunch of more obscure stuff that is not refereed (like patent applications). A few other observations — WOS seems to be very thorough in that it is fairly certain that citations from a paper that is published in a refereed journal will always appear. GS was missing some more recent references from refereed publications.

    These days citation metrics seem to be very important for academic advancement. In your opinion what is the proper number to report in, e.g., a tenure application? Some people are likely to use numbers from metric calculators which give the highest. If you spend a few days making your own, citations can be made (arbitrarily?) large (for instance include a +1 citation every time you cite your work in a formal presentation). The line needs to be drawn somewhere and it seems like restricting to only refereed publications is fair.

    • Massimo Says:

      citations to a bunch of more obscure stuff that is not refereed

      The fact that it is not refereed is immaterial, it seems to me. It may not be refereed, but it cited that paper and not another one.
      By that token, we should not count any citation on ArXiv preprints, but that does not seem to be justified, considering that citations to preprints are presently regarded as acceptable by most peer-reviewed journals. So, if we can cite them, why should we not take seriously the articles cited therein ? And the moment you accept to count as legitimate citations on some non-refereed sources, then where do you draw the line ? I also do not see why citations on books should not be counted, that makes absolutely no sense to me.

      WOS seems to be very thorough

      Well, it surely does want to give that impression, but it is not without its own flaws. For one thing, it is frustratingly slow, in that it sometimes takes months, even years, for published peer-reviewed articles to show up, sometimes for reasons that seem incomprehensible. That slowness can be a serious problem, especially for tenure-track faculty.
      Secondly, it is no less subjective and whimsical than others, in terms of including some unrefereed material (conference proceedings) — again, once you decide that you are going to include some unrefereed articles, why not all of them ?

      In your opinion what is the proper number to report in, e.g., a tenure application?

      Whichever you like, as long as you explicitly state by which one you are going. Making your own does not seem like a good idea, in that it would immediately raise suspicions, but if you use one of the few that are widely adopted, I do not see what the problem is. Also, let me re-iterate that no sane person takes the h-index as the last word, the one and only definitive sign of research excellence; that is a ridiculous claim made mostly by those who oppose any kind of metrics, largely because they stand to benefit from a system of evaluation based mostly on personal connections, influence of one’s school and major professor, and so on. The h-index is merely a starting point. I do not believe that a single person has ever been, or will ever be, hired exclusively because of a higher h-index.

      The line needs to be drawn somewhere and it seems like restricting to only refereed publications is fair

      I do not think it is, but, regardless of what you and I think, it is not what the community has decided to do. Citations to unrefereed material are commonplace these days, e.g., in Physical Review. If your (yet) unpublished preprint or your book chapter garners a lot of citations, what does the (supposed) vetting coming from two distracted referees add to its credibility ?
      I think that this is one case in which Mark Twain’s contention that the public is the only critic worth paying any attention to, is really well-suited.

    • GMP Says:

      These days citation metrics seem to be very important for academic advancement. In your opinion what is the proper number to report in, e.g., a tenure application?

      At my institution in the physical sciences it seems pretty standard to include both WoS and Google Scholar numbers (for the h-index as well as individual article citation counts).

  6. TG Says:

    Thanks for the response. I agree with pretty much everything you say (I think it is good to include citations from work on the arxiv). The fact that google scholar is free is also great. My main gripe is the double counting of GS. I can’t in good conscience report (say on an application) that my article was cited 50 times when only 30 of these are unique, even if I indicate the results are from GS.

    • Massimo Says:

      Yup, on that specific issue you are getting no argument from me — they totally need to fix that. I think it is a relatively easy fix, though, and I expect them to do it soon. Comparing titles of articles citing the same paper should already weed out a large number of double counts. Also, if authors diligently updated the ArXiv entry once the paper is published, that would also help, I presume.
      At the same time, however, allowing one’s citation record to feature citations from preprints has the potential benefit of keeping it up-to-date, since it is typically months before the citing articles see the light and before WoS is updated.
