Bibliometrics is a complex and nuanced field, and at times, we’ll admit, it’s even a little arcane. At Science-Metrix we take great pride in explaining indicators to clients: what they mean, how they are calculated, their strengths and weaknesses, and what they can and cannot do. As with so many things, examples provide a good entry point. Inspired by this, today I begin a new occasional series that heads back to basics to explain some common indicators and how we use them in house. In this first post, I’ll explore some “fun facts” that can be derived from a bibliometric database.
Most cited papers
Let’s start with something very basic: the citation count. Citations are often used in bibliometric evaluations as a proxy to measure the “quality” of an article or the “impact” this article had on the scientific community. This usage is based on the assumption that an author citing a scientific article is indicating that they acknowledge the relevance of the previous work and that material from the cited paper has been taken up and integrated into the generation of the new knowledge presented in the citing paper. This assumption might not be accurate 100% of the time (for example, when an author is criticizing an earlier paper), but nevertheless, it is broadly accepted in the scientometric community.
So, let’s jump right into a burning question: What is the most cited paper ever? In our in-house version of the Web of Science (WoS; produced by Clarivate Analytics), which covers papers published since 1980, the most cited paper was published in 1987, in the journal Analytical Biochemistry. It is titled “Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction” and was authored by Piotr Chomczynski and Nicoletta Sacchi. It describes a then-new method of RNA isolation that provides high yield in a short time. I am no expert in biochemistry, but my feeling as a bibliometrician is that this method was picked up by the community as the new standard for isolating RNA. To date, it has been cited by a staggering 60,655 papers (counting only articles or reviews; citations from conference proceedings or book chapters are not accounted for here). To put this number in perspective, the current average number of citations in the WoS for all papers published in biochemistry in 1987 is 39.8!
Now, I can already hear some objections that this is not fair; this article was published 30 years ago, so it has had a lot of time for citations to accrue, especially when it is compared to newer articles. Science-Metrix has developed a citation indicator to account for just that: relative citations (RC). The RC controls for a publication’s age as well as its discipline, because the length of reference lists is known to vary between disciplines. Roughly speaking, the RC compares a paper’s raw citation count to the average citation count of all articles published in the same year and in the same discipline.
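To make that description concrete, here is a minimal sketch of the calculation in Python. The record fields and function names are invented for illustration; they are not Science-Metrix’s actual schema or implementation, which involves further refinements (for instance, how disciplines are assigned to papers).

```python
from collections import defaultdict

def relative_citations(papers):
    """Return each paper's RC: its citation count divided by the average
    citation count of all papers in the same (discipline, year) group."""
    totals = defaultdict(lambda: [0, 0])  # (discipline, year) -> [sum, count]
    for p in papers:
        key = (p["discipline"], p["year"])
        totals[key][0] += p["citations"]
        totals[key][1] += 1
    rc = {}
    for p in papers:
        total, n = totals[(p["discipline"], p["year"])]
        avg = total / n
        rc[p["id"]] = p["citations"] / avg if avg else 0.0
    return rc

# A toy group of 1987 biochemistry papers (citation counts invented,
# except paper A's, which is the record-holder mentioned above).
papers = [
    {"id": "A", "discipline": "biochemistry", "year": 1987, "citations": 60655},
    {"id": "B", "discipline": "biochemistry", "year": 1987, "citations": 20},
    {"id": "C", "discipline": "biochemistry", "year": 1987, "citations": 40},
]
rc = relative_citations(papers)
```

Note that, by construction, the mean RC within any discipline-year group is exactly 1.0, which is why an RC above 1 can be read directly as “above the world average for comparable papers.”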
So, what is the article with the highest RC score? It is an article from 2008, titled “A short history of SHELX,” published in Acta Crystallographica, Section A – Foundations of Crystallography. The article gathered 52,230 citations, which (when normalized for the publication’s age and its discipline) led to an RC score of 3,352.7, meaning 3,352.7 times the average citation score for publications in the same subfield and year. To appreciate the magnitude of this number, keep in mind that the average of all RC scores in the same discipline and same year is 1.0! On the surface, it looks like a lot of people really like crystals. But actually, crystallography can be used in many disciplines to determine the structure of various objects, such as new materials or biological molecules (DNA, proteins, etc.). This means it’s not so surprising to see that what seems to be simply a review of the most popular crystallography software ultimately gets highly cited, because it’s relevant to a wide range of thematic areas.
“Most international” paper
Researchers are often encouraged to collaborate internationally as there is mounting evidence that international co-publications receive more citations than single-country papers. Let’s see who took this advice to heart, and who took international collaboration to the extreme. As it turns out, two papers share this record, each including authors from 89 different countries. Both were published in The Lancet, in the field of clinical medicine. One of these papers was published in 2017 and is titled “Worldwide trends in blood pressure from 1975 to 2015: a pooled analysis of 1479 population-based measurement studies with 19.1 million participants.” It was authored by the NCD-RisC collaboration (a network of health scientists from 200 countries coordinated by the WHO Collaborating Centre on NCD Surveillance and Epidemiology). The second article, published in 2014, is “Global, regional, and national age-sex specific all-cause and cause-specific mortality for 240 causes of death, 1990–2013: a systematic analysis for the Global Burden of Disease Study 2013.” With 89 collaborating countries each, these two papers are well above the average of 1.18 countries per paper for the field of clinical medicine and 1.20 countries per paper for the whole WoS database.
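The underlying count is simple: for each paper, tally the distinct countries appearing in the authors’ affiliations, then average across papers. Here is a hypothetical sketch; the field names are invented for illustration and do not reflect the actual WoS record structure.

```python
def distinct_countries(paper):
    """Count the distinct affiliation countries across all authors of one paper."""
    return len({c for author in paper["authors"] for c in author["countries"]})

def avg_countries(papers):
    """Average number of distinct countries per paper over a set of papers."""
    return sum(distinct_countries(p) for p in papers) / len(papers)

# Toy example: one two-country paper and one single-country paper.
papers = [
    {"authors": [{"countries": ["CA"]}, {"countries": ["CA", "FR"]}]},
    {"authors": [{"countries": ["US"]}]},
]
```

On this toy set, the first paper counts two countries (Canada and France, with the duplicate Canadian affiliation collapsed by the set) and the second counts one, for an average of 1.5 countries per paper.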
“Most authors” on a single paper
To further explore the collaborative side of science, I identified the papers in the WoS database with the most co-authors. These are usually papers in the field of particle physics, where it is standard practice to list all the scientists, engineers and technicians who worked on the detectors, data gathering, data analysis or other parts of a given experiment. The experiments in question can be huge, such as creating a one-of-a-kind detector the size of a 4-storey building and then analyzing the data streaming from it. The average number of authors on a paper from the subfield nuclear & particle physics is 13.4, while the average number of authors per paper in the whole WoS database is 4.1. Getting back to the fun fact here, the record number of co-authors on a single paper reached a new height in 2015, with the publication of the article “Combined Measurement of the Higgs Boson Mass in pp Collisions at root s=7 and 8 TeV with the ATLAS and CMS Experiments” in the journal Physical Review Letters. This paper combined data from two collaborations at the Large Hadron Collider (LHC) at CERN. The ATLAS collaboration and the CMS collaboration worked together to estimate the mass of the famous Higgs boson particle, pooling an astonishing 5,126 co-authors! For comparison, the second highest author count is 3,221, on the paper titled “Charged-particle multiplicities in pp interactions at root s=900 GeV measured with the ATLAS detector at the LHC,” published in Physics Letters B.
 The original text of this blog post did not mention that the database version used covers papers published from 1980 onward. The text has been edited to reflect this limitation.
 Puuska, H.-M., Muhonen, R., & Leino, Y. (2014). International and domestic co-publishing and their citation impact in different disciplines. Scientometrics, 98(2), pp. 823–839. doi:10.1007/s11192-013-1181-7; Inzelt, A., Schubert, A., & Schubert, M. (2009). Incremental citation impact due to international co-authorship in Hungarian higher education institutions. Scientometrics, 78(1), pp. 37–43. doi:10.1007/s11192-007-1957-8; Khor, K. A., & Yu, L.-G. (2016). Influence of international co-authorship on the research citation impact of young universities. Scientometrics, 107(3), pp. 1095–1110. doi:10.1007/s11192-016-1905-6.
Note: All views expressed are those of the individual author and are not necessarily those of Science-Metrix or 1science.