As of last week, 1science is offering public access to its 1findr service. 1findr is a discovery and analytics platform for scholarly research, indexing a remarkably broad range of peer-reviewed journals. But just how broad is its coverage, and how does 1findr compare to alternative systems? In this post, we’ll measure 1findr against the (also quite new) Dimensions platform from Digital Science. These two platforms represent new approaches to bibliographic data: 1findr is fed by harvesters that automatically collect, parse, complete and validate metadata from information online, whereas Dimensions aggregates and cross-links data from a variety of sources, accessed through institutional partnerships.
Before launching too far into the discussion, I should include a disclaimer. I work at Science-Metrix, the sister company of 1science. When our friends launch a product, obviously we’re pulling for them to succeed. However, puff pieces being beyond my skill set (and even further beyond my interest), I have no intention of using this analysis simply to put “wind in their sales.”
Furthermore, as I respect my friends at 1science very much, I’ll spare them the indignity of pulling any punches. Punches like this one: 1findr does not (yet) include any citation metrics, whereas Dimensions does. That’s a pretty clear advantage, and there isn’t much more to say on that score right now. Now let’s get on with our discussion about coverage.
Volume of contents
1findr indexes 89.6 million papers from peer-reviewed journals. By comparison, Dimensions indexes 91.5 million scholarly documents, of which about 85% are journal articles, 10% are chapters and 5% are proceedings (with a token few monographs and preprints thrown in for good measure, totalling less than 0.5% of all items). All analyses presented here are based on a data extraction performed last week. The platforms evolve rapidly—checking back today I see there’s already been some movement.
In terms of open access items, 1findr includes links to about 27.5 million freely accessible journal articles, about 30% of its total. More than half of these freely downloadable full-text articles are in gold OA, about 10% are in green OA, and about 35% are in an as-yet unclassified OA form (meaning they’re OA, but the domain through which they’re available hasn’t been categorized yet). In Dimensions, about 14.4 million items are available in open access, but there are no OA type tags. Such tags are something readers probably don’t worry much about, but for analytical purposes (for instance, to inform decisions about where to publish to get the most citations, or to formulate a policy supporting the transition to OA) they are valuable. So 1findr has a slight edge here, though its OA tagging is incomplete.
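As a quick sanity check, the headline shares quoted above can be recomputed from the approximate counts; this is a back-of-the-envelope sketch using only the rounded figures from this post:

```python
# Rough OA shares recomputed from the approximate counts cited above.
findr_total = 89.6e6   # journal articles indexed by 1findr
findr_oa = 27.5e6      # of which link to a freely accessible full text

dims_total = 91.5e6    # scholarly documents indexed by Dimensions
dims_oa = 14.4e6       # of which are available in open access

print(f"1findr OA share:     {findr_oa / findr_total:.1%}")   # 30.7%
print(f"Dimensions OA share: {dims_oa / dims_total:.1%}")     # 15.7%
```

Note that the two ratios have slightly different denominators: the 1findr figure is a share of journal articles only, while the Dimensions figure is a share of all indexed document types.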
In terms of thematic classifications, 1findr and Dimensions each have a taxonomy natively installed, and these can be used for search and analytical purposes. However, because they each use their own taxonomy and there’s no simple way to draw any correspondence between them, I can’t give a thorough comparison of thematic coverage here.
Better global coverage?
One of the major obstacles to international comparisons using traditional bibliometric databases is that the databases have a clear bias towards the English language. This means that English-speaking countries are advantaged relative to the rest of the world. For instance, one source estimated that only about one sixth of research from China is indexed in the major bibliometric databases of the West. One might hope that a totally new approach to indexing scholarly work might overcome this challenge.
One might still have a while to wait, both for coverage to improve and for us to be able to tell whether it has improved. There are several notable obstacles. First, neither 1findr nor Dimensions offers the ability to slice data by country. Dimensions offers some AI-driven harmonization of institution names, while 1findr draws on its expertise in pattern recognition, but in both cases this is only available in the paid version (and I haven’t seen any assessment of its quality). Besides, even data on large research institutions would be only a rough proxy for assessing geographical coverage.
At a broader level, even if the data were available, drawing conclusions would not be straightforward. The best we could likely offer is a comparison of different data sources. There’s no obvious source of ground truth against which to compare, to see who gets closer. We can only compare the various databases to each other, document their differences, use those differences to flag cases for in-depth study (or random sampling), and in this way get some clarity on what is being measured and what is not.
1findr does offer functionality to search by language. That’s not a great proxy for geographic distribution either, of course, because so much research is now published in English by people all around the world (not least because research evaluations that depend on Western bibliometric databases have mostly covered English content). With that large grain of salt in mind, just over 70% of 1findr’s contents are tagged as English, while over 15% are not tagged by language at all. This means there’s at least twice as much content in English as in all other languages combined.
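The "at least twice as much" claim follows from a worst-case bound. Here is the arithmetic, using the rounded shares from the paragraph above (70% English, 15% untagged, and therefore at most 15% tagged as other languages):

```python
# Worst-case bound behind the "at least twice as much English" claim.
english = 0.70                             # share tagged as English
untagged = 0.15                            # share with no language tag
other_tagged = 1.0 - english - untagged    # at most 15% tagged as other

# Even if every untagged item turned out to be non-English:
non_english_max = other_tagged + untagged  # 0.30
ratio_lower_bound = english / non_english_max
print(f"English outnumbers other languages by at least {ratio_lower_bound:.2f}x")
# prints "... at least 2.33x"
```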
Again, it’s hard to interpret these results without a ground truth or even other data sources against which to compare. My speculation is that this represents a big step forward toward more even coverage around the world, but that there’s still considerable progress to be made.
The history of knowledge
As noted above, 1findr concentrates on peer-reviewed journals, whereas Dimensions covers journals, books and conference proceedings. Looking only at journal articles, 1findr’s coverage since 1980 exceeds that of Dimensions by more than 20%. Comparing Dimensions’ coverage across all document types to 1findr’s coverage of journal articles, the two are neck and neck from 1980 to now: a gap of about 325,000 items, or less than 0.5%.
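The cited gap of roughly 325,000 items being under 0.5% implies totals of at least about 65 million items since 1980 on each platform. The post doesn’t state those totals directly, but the implied floor follows from simple division:

```python
# Implied minimum total behind the "about 325,000 items, less than 0.5%" figure.
gap = 325_000      # cited difference in items indexed since 1980
max_share = 0.005  # "less than 0.5%"

# For the gap to stay under 0.5%, each platform's post-1980 total must exceed:
implied_min_total = gap / max_share
print(f"Implied minimum total: {implied_min_total:,.0f} items")  # 65,000,000
```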
In terms of open access coverage, it’s no contest: 1findr is the clear winner, though the technology for 1findr started out as specifically hunting for OA material, so perhaps we shouldn’t be so surprised.
By far my favourite part of the comparison is the longitudinal assessment. My enjoyment isn’t primarily about the recent years, though that horse race will be fun to follow. No, the best part for me is looking at the very old material indexed in these two sources. What better remedy to short-termism than to immerse oneself in the scholarly disputes of the 17th century? Both databases start out in 1665 with the Philosophical Transactions of the Royal Society of London, at a time when peer review meant something quite different, and more than 150 years before the word “scientist” was even coined. At this scale, the differences between the two platforms are negligible. That said, it is worth noting that Dimensions has slightly more material for most years between 1665 and 1968; even considering only journal articles, Dimensions covers 9.7 million versus 8.3 million in 1findr.
The 1findr and Dimensions platforms are very interesting new tools for research discovery and analytics. First of all, the sheer volume of material covered is much larger than that of traditional databases. The coverage is much broader in recent years, but it also reaches much further back. The coverage of scholarly material may be less biased towards English, and probably does a better job of giving even-handed consideration to different cultures and traditions. However, there’s still a lot of ground to make up there.
It’s also nice that these platforms have a freely available version. I look forward to seeing new features emerge and seeing how these new types of platforms shape the way that we conduct and assess research in the future.
Congratulations to my friends at 1science for putting together the 1findr platform. I hope that users find it valuable, and I’m sure that we’re all looking forward to the next round of updates when new features come online.
Note: All views expressed are those of the individual author and are not necessarily those of Science-Metrix or 1science.