April 12, 2012
Last week a new measure of the impact of a journal was launched: Google Scholar Metrics. So it seems like a good time to review the various metrics available for journals.
Below I summarise six measures of journal impact: the impact factor (IF), 5-year IF, Google Scholar Metrics, SCImago Journal Rank (SJR), Source-Normalized Impact per Paper (SNIP) and Excellence in Research for Australia (ERA) ranking. As part of the research for this post I have compiled the metrics (except 5-year IF) for a sample of 97 of the higher-impact biomedical journals and put them into a Google Docs spreadsheet, which can be viewed here (or on Google Docs directly here).
Most researchers get to know the IF fairly quickly when they start to read journals. So you probably know that an IF of 30 is high and that thousands of journals (the long tail) have IFs below 1. But fewer people have this kind of familiarity with the other metrics. So I have tried to estimate what range of numbers counts as ‘high impact’ for each metric. ‘High’ here means in the top 33% of my sample of 97 journals (already a high-impact sample).
To summarise the number that counts as high for each metric:
IF: about 14
5-year IF: about 15
Google Scholar Metrics: 101
SJR: about 3
SNIP: 0.7 (Journal Metrics) or 3 (CWTS Journal Indicators)
Note that I am only talking about journal metrics, not metrics for assessing articles or researchers. As always, anyone using these figures should make sure they are using them only to judge journals, not individual papers or their authors (as emphasised by a European Association of Science Editors statement in 2007). Also remember that citations can be gamed by editors (see my previous post on the subject or a recent Scholarly Kitchen post on a citation cartel for more details).
Impact factor
The IF is provided by Thomson Reuters as part of their Journal Citation Reports, which covers ‘more than 10,100 journals from over 2,600 publishers in approximately 238 disciplines from 84 countries’.
It is calculated by dividing the number of citations received in one year by articles published in that journal during the previous two years by the number of articles the journal published in those two years.
What counts as big: the highest-ranked journals have IFs over about 14; middle-ranking journals have numbers between 3 and 14; many low-ranked journals have numbers around 1.
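As a rough illustration, the IF is the number of citations in one year to articles from the previous two years, divided by the number of those articles. Here is a minimal Python sketch of that division. The function name and the counts are made up for the example; they are not taken from Journal Citation Reports.

```python
def impact_factor(citations_this_year, articles_prev_two_years):
    """Citations received in one year to articles published in the previous
    two years, divided by the number of those articles."""
    if articles_prev_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_this_year / articles_prev_two_years


# Hypothetical journal: 400 articles published in 2009-2010,
# which were cited 1,200 times in total during 2011.
print(impact_factor(1200, 400))  # 3.0
```

On the ranges given above, this hypothetical journal would sit at the bottom of the middle-ranking band.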
Five-year impact factor
This is similar to the standard two-year IF except that citations and articles are counted over the previous five years rather than two. It has been published only since 2007. This metric has advantages in slower-moving fields, where papers take longer than a year or two after publication to gather citations.
It is difficult to find lists of five-year IFs online, although some journals display them on their home pages. I did, however, find a study in the journal Cybermetrics that showed it is generally about 1.05 times the size of the two-year IF.
What counts as big: about 15, obtained by multiplying the two-year IF threshold of about 14 by 1.05.
Google Scholar Metrics
These were introduced on 1 April 2012 and are based on the Google Scholar database, which includes more journals and other publications than that used for the IFs. They are based on the h-index, which is defined on the Google Scholar Metrics page as follows:
The h-index of a publication is the largest number h such that at least h articles in that publication were cited at least h times each. For example, a publication with five articles cited by, respectively, 17, 9, 6, 3, and 2, has the h-index of 3.
This is a rather difficult concept to get your head around (at least it is for me). Basically the number cannot be bigger than the number of papers a journal has published, and it cannot be bigger than the highest number of times any one paper has been cited. To find it, rank the articles from most to least cited and count down the list for as long as each article’s citation count is at least as large as its position in the list. In the above example the first three articles (17, 9 and 6 citations) each have at least 3 citations, but the fourth has only 3 citations, fewer than the 4 it would need, so the h-index is 3.
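If it helps, the definition quoted above can be written as a few lines of Python. This is only a sketch of the definition applied to a list of citation counts (the function name is my own), not a description of how Google computes its figures.

```python
def h_index(citation_counts):
    """Largest h such that at least h articles have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)  # most cited first
    h = 0
    for position, citations in enumerate(ranked, start=1):
        if citations >= position:
            h = position  # the top `position` articles all have enough citations
        else:
            break  # later articles have even fewer citations, so stop
    return h


# The example from the Google Scholar Metrics page.
print(h_index([17, 9, 6, 3, 2]))  # 3
```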
Google Scholar Metrics extends this as follows:
The h-core of a publication is a set of top cited h articles from the publication. These are the articles that the h-index is based on. For example, the publication above has the h-core with three articles, those cited by 17, 9, and 6.
The h-median of a publication is the median of the citation counts in its h-core. For example, the h-median of the publication above is 9. The h-median is a measure of the distribution of citations to the h-core articles.
Finally, the h5-index, h5-core, and h5-median of a publication are, respectively, the h-index, h-core, and h-median of only those of its articles that were published in the last five complete calendar years.
So the main metric is the h5-index, which is a measure of citations to a journal over 5 years to April 2012.
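The h-core, h-median and h5-index definitions quoted above can be sketched in the same way. The function names, the (year, citations) data layout and the extra 2005 article are assumptions made for this example; only the five citation counts 17, 9, 6, 3 and 2 come from the Google Scholar Metrics page.

```python
from statistics import median


def h_index(citation_counts):
    """Largest h such that at least h articles have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    return sum(1 for position, c in enumerate(ranked, start=1) if c >= position)


def h_core(citation_counts):
    """Citation counts of the top-cited h articles."""
    ranked = sorted(citation_counts, reverse=True)
    return ranked[:h_index(ranked)]


def h_median(citation_counts):
    """Median citation count within the h-core (0 if the h-core is empty)."""
    core = h_core(citation_counts)
    return median(core) if core else 0


def h5_index(articles, last_complete_year):
    """h-index of articles published in the last five complete calendar years.

    `articles` is a list of (publication_year, citation_count) pairs.
    """
    first_year = last_complete_year - 4
    recent = [c for year, c in articles if first_year <= year <= last_complete_year]
    return h_index(recent)


counts = [17, 9, 6, 3, 2]
print(h_core(counts))    # [17, 9, 6]
print(h_median(counts))  # 9

# Hypothetical publication years; the 2005 article falls outside 2007-2011.
articles = [(2005, 40), (2007, 17), (2008, 9), (2009, 6), (2010, 3), (2011, 2)]
print(h5_index(articles, last_complete_year=2011))  # 3
print(h_index([c for _, c in articles]))            # 4 when all years are counted
```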
Note that this metric doesn’t involve any division by the number of papers published by the journal (unlike the other citation-based metrics discussed here). This means that journals that publish more papers will tend to rank higher in Google Scholar Metrics than with the other metrics.
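To see the effect of journal size, here is a small sketch comparing two invented journals whose papers are all cited equally often. The journals and their numbers are hypothetical and chosen only to make the arithmetic obvious.

```python
def h_index(citation_counts):
    """Largest h such that at least h articles have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    return sum(1 for position, c in enumerate(ranked, start=1) if c >= position)


small_journal = [50] * 20    # 20 papers, each cited 50 times
large_journal = [50] * 200   # 200 papers, each cited 50 times

for name, counts in [("small", small_journal), ("large", large_journal)]:
    citations_per_paper = sum(counts) / len(counts)
    print(name, citations_per_paper, h_index(counts))
# small 50.0 20
# large 50.0 50
```

Both journals have the same number of citations per paper, so a per-paper metric would not separate them, but the larger journal has the higher h-index. The h-index of the small journal is capped by its number of papers, and that of the large journal by the citations each paper receives.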
What counts as big: the highest-ranked journals have h5-indexes over about 101; many journals seem to have numbers under 50.
SCImago Journal Rank (SJR)
It expresses the average number of weighted citations received in the selected year by the documents published in the selected journal in the three previous years, i.e. weighted citations received in year X to documents published in the journal in years X-1, X-2 and X-3.
So it is also a measure of citations similar to a three-year impact factor, but each citation is weighted according to the journal in which it appears. Further information is here (pdf). The weighting depends on how many citations each citing journal gets. So if journal A is cited a lot overall and journal B is not cited as much, a citation of a paper in journal C that appears in journal A is given more weight in the calculation than a citation of journal C that appears in journal B.
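The idea of weighting can be shown with a toy calculation. This is not the real SJR calculation, which is more involved (see the SCImago documentation); the fixed weights for journals A and B, the function name and all the counts below are invented purely to show how a citation from a highly cited journal can count for more.

```python
def weighted_citations_per_document(citations_by_source, source_weights, documents):
    """Sum citations weighted by the citing journal, divided by documents published."""
    total = sum(
        count * source_weights[source]
        for source, count in citations_by_source.items()
    )
    return total / documents


# Hypothetical weights: journal A is cited a lot overall, journal B is not.
source_weights = {"A": 2.0, "B": 0.5}

# Journal C published 10 documents and received 10 citations, all from journal A.
cited_by_a = weighted_citations_per_document({"A": 10, "B": 0}, source_weights, 10)

# The same journal with the same 10 citations, but all from journal B.
cited_by_b = weighted_citations_per_document({"A": 0, "B": 10}, source_weights, 10)

print(cited_by_a)  # 2.0
print(cited_by_b)  # 0.5
```

An unweighted count would give 1 citation per document in both cases.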
What counts as big: the highest-ranked journals have SJRs over about 3; many journals seem to have numbers under 0.5.
(Note that on the SCImago website the decimal point in the SJR is given as a comma in some places, so it looks as if the top journals have SJRs of over 1000 (1,000). On the spreadsheets that are freely downloadable from the same site or from the ‘Journal Metrics’ website (also from Elsevier) the metrics are given as 1.000 etc, so I think this is the correct version.)
Source-Normalized Impact per Paper (SNIP)
The Source-Normalized Impact per Paper (SNIP) is defined as the ratio of a journal’s citation count per paper and the citation potential in its subject field. It is designed to aid comparisons between journals in fields with different patterns of citations. It is calculated as follows:
- Raw impact per paper (RIP): Number of citations in year of analysis to a journal’s papers published in 3 preceding years, divided by the number of a journal’s papers in these three years
- Database citation potential in a journal’s subject field: Mean number of 1-3 year old references per paper citing the journal and published in journals processed for the database
- Relative database citation potential in a journal’s subject field (RDCP): Database citation potential of a journal’s subject field divided by that for the median journal in the database
- Source normalized impact per paper (SNIP): Ratio of a journal’s raw impact per paper (RIP) and the relative database citation potential (RDCP) in the subject field covered by the journal
So basically a three-year impact factor is weighted according to how much papers in other journals in the same field are cited.
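The steps in the definition above can be chained together in a short sketch. The function names and every number here are hypothetical; in particular the citation potentials are made up, since working them out for a real journal needs the full database.

```python
def raw_impact_per_paper(citations, papers):
    """RIP: citations in the year of analysis to papers from the 3 preceding
    years, divided by the number of those papers."""
    return citations / papers


def relative_citation_potential(field_potential, median_potential):
    """RDCP: citation potential of the journal's field divided by that of the
    median journal in the database."""
    return field_potential / median_potential


def snip(citations, papers, field_potential, median_potential):
    """SNIP: RIP divided by RDCP."""
    rip = raw_impact_per_paper(citations, papers)
    rdcp = relative_citation_potential(field_potential, median_potential)
    return rip / rdcp


# Two hypothetical journals with the same raw impact per paper (600 / 200 = 3.0).
# One is in a heavily citing field, the other in a sparsely citing field.
print(snip(600, 200, field_potential=12.0, median_potential=6.0))  # 1.5
print(snip(600, 200, field_potential=3.0, median_potential=6.0))   # 6.0
```

The journal in the field where papers cite less ends up with the higher SNIP, which is the point of the normalisation.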
When I looked for lists of SNIPs for 2010 I encountered a problem: two different lists gave two different answers. The list downloaded from the Journal Metrics site gives the 2010 SNIP for Cell as 1.22, but when I searched on the CWTS Journal Indicators site (which is linked from the Journal Metrics site) it was given as 9.61. So there are two answers to the question of what counts as big: either anything over 0.7 from Journal Metrics or anything over 3 from CWTS Journal Indicators. If anyone can help me resolve this discrepancy I’d be grateful.
Australian Research Council Ranking
The Excellence in Research for Australia (ERA) evaluation exercise in 2010 included a system in which journals were ranked A*, A, B or C. Details of what the rankings mean are here. Top journals (in fact many that might elsewhere be called middle ranking) are ranked A*. It is not clear how these rankings were decided. These journal rankings were controversial and are not being used for the 2012 ERA.
Comparison of journals using these metrics
I have selected 97 high-impact journals in biology and medicine and compiled the metrics for them. I put the list of journals together by initially picking the top journals in the field by SJR, then removing all those that only publish reviews and adding a few that seemed important, or were in the MRC frequently used journal list, or were ranked highly using other metrics. The result is in a Google spreadsheet here. I have added colours to show the top, middle and bottom 33% (tertile) in this sample of each metric for ease of visualisation, and the mean, median and percentiles are at the bottom.
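For anyone who wants to do the same kind of split on their own list, here is one way to assign tertiles in Python. It is a sketch only: the nine values are invented, the function name is my own, and the cut points from `statistics.quantiles` may differ slightly from the percentile function of a spreadsheet.

```python
from statistics import quantiles


def tertile_labels(values):
    """Label each value as being in the top, middle or bottom third of the sample."""
    lower_cut, upper_cut = quantiles(values, n=3)  # the two tertile cut points
    labels = []
    for value in values:
        if value > upper_cut:
            labels.append("top")
        elif value > lower_cut:
            labels.append("middle")
        else:
            labels.append("bottom")
    return labels


# Hypothetical metric values for nine journals, in ascending order.
sample = [1.2, 2.5, 3.1, 4.8, 6.0, 9.4, 14.7, 31.4, 53.3]
labels = tertile_labels(sample)
for value, label in zip(sample, labels):
    print(value, label)
```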
Sources of data:
- IF: various websites including this for medical journals, this for Nature journals, this for Cell Press journals, this for general and evolutionary journals, this and this for a range of other journals, and individual journal websites. Please note that no data were obtained directly from Thomson Reuters, and they have asked me to state that they do not take responsibility for the accuracy of the data I am presenting.
- Google Scholar Metrics: Google Scholar Citations Top publications in English and searches from that page.
- SJR and SNIP: Journal Metrics.
- ERA: ARC.
Notes on particular journals:
A few journals have anomalous patterns, unlike most, which are consistently high or consistently lower across all the different metrics.
- CA: A Cancer Journal for Clinicians has a very high IF, SJR and SNIP, but comes out lower on Google Scholar Metrics. A recent post in Psychology Today includes a suggestion of why this might be:
The impact factor reflects that the American Cancer Society publishes statistics in CA that are required citations for authoritative estimates of prevalence and other statistics as they vary by cancer site and this assures a high level of citation.
- A few journals rank relatively low (out of this selection of journals) on all the metrics except the ERA rating, where they are rated A*: Development, The Journal of Immunology, Cellular Microbiology, Journal of Biological Chemistry, Molecular Microbiology, Developmental Biology, and Genetics. I don’t know why this might be, except that the ERA ratings appear to be subjective decisions by experts rather than being based on citations.
- Proc Natl Acad Sci USA, Blood, Nucleic Acids Research, Cancer Research, Gastroenterology and most notably the BMJ come out high in Google Scholar Metrics but not so high in IF, SJR and SNIP. Perhaps they are journals that publish many papers, which is not accounted for by Google Scholar Metrics, or they could have more citations four or five years after papers are published, which would be picked up by Google Scholar Metrics but not the other metrics.
- Finally, Systematic Biology has a high SNIP, a medium SJR and IF and a lower Google Scholar Metric. Perhaps it is in a field in which citations per paper are usually low, which is accounted for by the SNIP.
Do you have experience of the lesser-known metrics being used by journals or by others to evaluate journals? Can you explain any of the anomalous patterns mentioned here, or the two different values for SNIPs?