1findr: discovery for the world of research
May 2, 2018

As of last week, 1science is offering public access to its 1findr service. 1findr is a discovery and analytics platform for scholarly research, indexing an exceptionally broad range of peer-reviewed journals. But just how broad is its coverage, and how does 1findr compare to alternative systems? In this post, we'll measure 1findr against the (also quite new) Dimensions platform from Digital Science. These two platforms represent new approaches to bibliographic data: 1findr is fed by harvesters that automatically collect, parse, complete and validate metadata from information online, whereas Dimensions aggregates and cross-links data from a variety of sources, accessed through institutional partnerships.

Before launching too far into the discussion, I should include a disclaimer. I work at Science-Metrix, the sister company of 1science. When our friends launch a product, obviously we’re pulling for them to succeed. However, puff pieces being beyond my skill set (and even further beyond my interest), I have no intention of using this analysis simply to put “wind in their sales.”

Furthermore, as I respect my friends at 1science very much, I’ll spare them the indignity of pulling any punches. Punches like this one: 1findr does not (yet) include any citation metrics, whereas Dimensions does. That’s a pretty clear advantage, and there isn’t much more to say on that score right now. Now let’s get on with our discussion about coverage.

Volume of contents

1findr indexes 89.6 million papers from peer-reviewed journals. By comparison, Dimensions indexes 91.5 million scholarly documents, of which about 85% are journal articles, 10% are chapters and 5% are proceedings (with a token few monographs and preprints thrown in for good measure, totalling less than 0.5% of all items). All analyses presented here are based on a data extraction performed last week. The platforms evolve rapidly—checking back today I see there’s already been some movement.

In terms of open access items, 1findr includes links to about 27.5 million freely accessible journal articles, about 30% of its total. More than half of these freely downloadable full-text articles are in gold OA, about 10% are in green OA, and about 35% are in an as-yet-unclassified OA form (meaning they're OA, but the domain through which they're available hasn't been categorized yet). In Dimensions, about 14.4 million items are available in open access, but Dimensions offers no OA type tags at all. OA type tags are something readers probably don't worry much about, but for analytical purposes (for instance, to inform decisions about where to publish to get the most citations, or to formulate a policy to support a transition towards OA) these kinds of tags are valuable. So 1findr has a little edge here, even if its OA tagging is incomplete.
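
To make that analytical point concrete, here is a minimal sketch of the kind of tally an analyst might run over a metadata export to get OA shares by type. The record structure and field names (is_oa, oa_type) are hypothetical, purely for illustration; neither platform exposes this exact schema.

```python
from collections import Counter

# Hypothetical records standing in for a metadata export; the field
# names (is_oa, oa_type) are illustrative, not either platform's schema.
records = [
    {"doi": "10.1000/a", "is_oa": True,  "oa_type": "gold"},
    {"doi": "10.1000/b", "is_oa": True,  "oa_type": "green"},
    {"doi": "10.1000/c", "is_oa": True,  "oa_type": None},  # OA, but domain not yet categorized
    {"doi": "10.1000/d", "is_oa": False, "oa_type": None},
]

oa_records = [r for r in records if r["is_oa"]]
shares = Counter(r["oa_type"] or "unknown" for r in oa_records)

print(f"OA share of all items: {len(oa_records) / len(records):.0%}")
for oa_type, count in shares.items():
    print(f"  {oa_type}: {count / len(oa_records):.0%} of OA items")
```

Without an oa_type field, the second half of that tally (the breakdown into gold, green and unknown) is simply unavailable, which is the position Dimensions users are in today.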

In terms of thematic classifications, 1findr and Dimensions each have a taxonomy natively installed, and these can be used for search and analytical purposes. However, because they each use their own taxonomy and there’s no simple way to draw any correspondence between them, I can’t give a thorough comparison of thematic coverage here.

Better global coverage?

One of the major obstacles to international comparisons using traditional bibliometric databases is that the databases have a clear bias towards the English language. This means that English-speaking countries are advantaged relative to the rest of the world. For instance, one source estimated that only about one sixth of research from China is indexed in the major bibliometric databases of the West. One might hope that a totally new approach to indexing scholarly work might overcome this challenge.

One might still have a while to wait, both for coverage to improve and even to be able to tell whether it has improved. There are several notable obstacles here. First, neither 1findr nor Dimensions offers the ability to slice data by country. Dimensions offers some AI-driven harmonization of institution names, while 1findr draws on its expertise in pattern recognition, but in both cases this is only available in the paid version (and I haven't seen any assessment of its quality). Second, even data on large research institutions would be only a rough proxy for assessing geographical coverage.
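
To give a flavour of what "harmonization" means here, the sketch below collapses formatting variants of institution names onto a single canonical key. This is a deliberately naive illustration with a hand-built alias table; the AI-driven and pattern-recognition approaches mentioned above are far more sophisticated.

```python
import re
from collections import Counter

# Toy harmonization: collapse formatting variants of the same institution.
# Real systems handle aliases, translations and parent/child relationships;
# this sketch only normalizes strings against a hand-built lookup table.
ALIASES = {
    "mcgill univ": "McGill University",
    "mcgill university": "McGill University",
    "univ of guelph": "University of Guelph",
    "university of guelph": "University of Guelph",
}

def harmonize(raw: str) -> str:
    key = re.sub(r"[^a-z ]", "", raw.lower()).strip()
    key = re.sub(r"\s+", " ", key)
    return ALIASES.get(key, raw)  # fall back to the raw string if unknown

affiliations = ["McGill Univ.", "McGill University", "Univ. of Guelph"]
print(Counter(harmonize(a) for a in affiliations))
```

Even this toy version shows why geography is hard: records can only be rolled up to countries once the institution strings have been reliably disambiguated.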

At a broader level, even if the data were available, drawing conclusions would not be straightforward. The best we could likely offer is a comparison of different data sources. There's no obvious source of ground truth against which to compare, to see who gets closer. We can only compare the various databases to each other, document their differences, use those differences to flag cases for in-depth study (or random sampling), and through this get some clarity on what is being measured and what is not.
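
In code terms, that cross-database comparison could start with simple set operations over identifiers (DOIs, say) extracted from two exports, with the disagreements sampled for manual review. A minimal sketch, assuming both exports can be reduced to DOI sets:

```python
import random

# Hypothetical DOI sets extracted from exports of two databases.
db_a = {"10.1/x1", "10.1/x2", "10.1/x3", "10.1/x4"}
db_b = {"10.1/x2", "10.1/x3", "10.1/x5"}

overlap = db_a & db_b
only_a, only_b = db_a - db_b, db_b - db_a
jaccard = len(overlap) / len(db_a | db_b)

print(f"Jaccard overlap: {jaccard:.2f}")
print(f"Only in A: {len(only_a)}, only in B: {len(only_b)}")

# The difference sets are exactly the cases worth studying in depth;
# draw a random sample of them for manual review.
disagreements = sorted(only_a | only_b)
sample = random.sample(disagreements, k=min(3, len(disagreements)))
print("Cases to review by hand:", sample)
```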

1findr does offer functionality to search by language. That's not a great proxy for geographic distribution either, of course, because so much research is now published in English by people all around the world (not least because research evaluations that depend on Western bibliometric databases have only covered English-language content). With that large grain of salt in mind, just over 70% of 1findr's contents are tagged as English, while over 15% are not tagged by language at all. Even if every untagged item were in another language, non-English content would top out at around 30%, so there's at least twice as much content in English as in all other languages in toto.

Again, it’s hard to interpret these results without a ground truth or even other data sources against which to compare. My speculation is that this represents a big step forward toward more even coverage around the world, but that there’s still considerable progress to be made.

The history of knowledge

As noted above, 1findr concentrates on peer-reviewed journals, whereas Dimensions covers journals, books and conference proceedings. Looking only at journal articles, 1findr’s coverage since 1980 exceeds that of Dimensions by more than 20%. Comparing the coverage in Dimensions across all document types to 1findr’s coverage of journal articles, the two are neck and neck from 1980 to now: there’s a difference of about 325,000 items between them, less than a 0.5% difference.

[Figure 1: Coverage of journal articles since 1980, 1findr vs. Dimensions]

In terms of open access coverage, it's no contest: 1findr is the clear winner, though the technology behind 1findr started out specifically as a hunt for OA material, so perhaps we shouldn't be so surprised.

[Figure 2: Open access coverage, 1findr vs. Dimensions]

By far my favourite part of the comparison is the longitudinal assessment. My enjoyment isn't primarily about the recent years, though that horse race will be fun to follow. No, the best part for me is looking at the very old material indexed in these two sources. What better remedy for short-termism than to immerse oneself in the scholarly disputes of the 17th century? Both databases start out in 1665 with the Philosophical Transactions of the Royal Society of London, at a time when peer review meant something different, and more than 150 years before the word "scientist" was even coined. At this scale, the differences between the two platforms are negligible. That said, it is important to note that Dimensions has slightly more material for most years between 1665 and 1968; even considering only journal articles, Dimensions covers 9.7 million versus 8.3 million in 1findr.

[Figure 3: Longitudinal coverage since 1665, 1findr vs. Dimensions]

Final word

The 1findr and Dimensions platforms are very interesting new tools for research discovery and analytics. First of all, the sheer volume of material covered is much larger than that of the traditional databases. The coverage is much broader in recent years, but it also reaches much further back. The coverage of scholarly material may be less biased towards English, and probably does a better job of giving even-handed consideration to different cultures and traditions. However, there's still a lot of ground to make up there.

It’s also nice that these platforms have a freely available version. I look forward to seeing new features emerge and seeing how these new types of platforms shape the way that we conduct and assess research in the future.

Congratulations to my friends at 1science for putting together the 1findr platform. I hope that users find it valuable, and I'm sure that we're all looking forward to the next round of updates when new features come online.

Note: All views expressed are those of the individual author and are not necessarily those of Science-Metrix or 1science.


About the author

Brooke Struck

Brooke Struck is the Senior Policy Officer at Science-Metrix in Montreal, where he puts his background in philosophy of science to good use in helping policy types and technical types to understand each other a little better every day. He also takes gleeful pleasure in unearthing our shared but buried assumptions, and generally gadfly-ing everyone in his proximity. He is interested in policy for science as well as science for policy (i.e., evidence-based decision-making), and is progressively integrating himself into the development of new bibliometric indicators at Science-Metrix to address emerging policy priorities. Before working at Science-Metrix, Brooke worked for the Canadian Federal Government. He holds a PhD in philosophy from the University of Guelph and a BA with honours in philosophy from McGill University.

Related items

/ You may check these items as well

Rationalizing the extremes: introducing the citation distribution index

The distribution of citations among the scientific...

Read more

Positional analysis: from boring tables to sweet visuals

At Science-Metrix we are obviously very focused on...

Read more

Mapping science: a guide to our Twitter series

Over the course of 2018, we’ll be publishing a s...

Read more

There are 2 comments

  • Thanks for this thoughtful and insightful review, Brooke. We’re excited to see platforms like 1findr emerge and think conversations like this can really drive forward a more open approach to changing things for the better.

    A comparison of the number of overall records is an interesting aspect, for sure, but we also need to delve into the data in a bit more depth to get a true sense of what is covered.

    A search for the term ‘CRISPR’ in both platforms reveals some of the differences in our content coverage – Dimensions returns 53,496 articles in total (25,529 OA), whilst 1findr reports 9,731 (5,363 OA).

    For 60 million of the 92 million publication records in Dimensions, we're able to index the full text. Across these and the remaining 32 million, we have, as you say, mined over 870 million references, with the aim of supporting users in their discovery workflow.

    Crucially, the data driving Dimensions is not just publications alone (although this is what is most immediately visible in the free version). Where possible, we also link from those publications to associated grants, patents, clinical trials and Altmetric data. You can see an example of this on the page for this publication (and this too can be accessed within the free version).

    In order to provide a more robust research insights tool for institutional administrators, we also offer a license to 'Dimensions Plus', which enables users to search across all of the different content types, as well as providing further filters and analytical and reporting functionality.

    To put this into the context of our CRISPR search, this means that Dimensions is able to show:

    – 25,831 of the 53,496 publications relevant to the 'CRISPR' search acknowledge funding from 1,349 funders
    – the total funding amount of the 3,170 grants mentioning CRISPR is $3.3 billion, of which $1.9 billion is going to be spent in the coming years
    – there are already 3,789 patents mentioning 'CRISPR' that reference publications
    – and 12 clinical trials, 6 of which are starting in 2018

    Creating a richer view, bringing different content types together and breaking down the barriers that researchers and institutions face was the reason we built Dimensions. We could not have done this without the support of the research community, and whilst it is not perfect, it is a huge starting point and we are making progress by the day.

    We very much welcome your comments and the open approach that you have taken here, and would suggest a more multi-faceted discussion directly between us to inform the community about the data (also beyond publications), its use cases and the potential opportunities that can most benefit the work of researchers.

  • […] Roger C. Schonfeld is thinking about how to create a truly seamless platform for academic publishing and what this means for publisher platforms whilst the 1findr service from 1science is trying to create this discovery platform using harvesters that automatically collect, parse, complete and validate metadata from information online. […]