Data mining: Exploring the connection between innovation, growth and prosperity
September 13, 2017

In the most recent post in our ongoing data mining blog series, we explored the effect on innovation of research collaboration across disciplinary and sectoral boundaries. That topic was worth exploring because beliefs that such collaborations are effective levers to promote innovation are foundational to many policy choices, and there is scant evidence available to determine whether these levers work or not (and how powerful they are). The present post will take that line of exploration one step further: we usually promote innovation as a way to drive social and/or economic prosperity, creating “jobs and growth,” often with some qualification about these developments being “inclusive,” “smart,” or “sustainable,” or helping out “the middle class.” Such approaches have been particularly emphasized since the Financial Crisis a decade ago. The purpose of this blog post—and the case study on which it is based—is to explore the relationship between innovation and growth, especially for small and fast-growing firms.

Measuring innovation and growth

Economic policies emphasize the role of innovation as the driver of growth, and innovation policies emphasize the role of private firms as the “dynamos of innovation.” Accordingly, we decided that firms would constitute the population of our study. This choice is particularly well suited to big data mining, which makes it feasible to drill into small units of aggregation such as individual firms (rather than examining dynamics at the regional level, say).

Are firms with high innovation capacity outperforming those with lower innovation capacity, especially when controlling for size and age? Innovation capacity in the present context was defined as the output of research publications and of patent applications to the European Patent Office (EPO), as well as participation in projects funded under the EU's Sixth and Seventh Framework Programmes (FP6 and FP7). These indicators were all computed on a yearly basis to track the evolution of the firms over time. Data on scholarly publications were extracted from the Web of Science, patent data were extracted from PATSTAT, and data on participation in the Framework Programmes were extracted from the CORDIS database. (An experimental innovation indicator was also developed, based on web scraping, but ultimately delivered no usable results for this case study.)

In order to control for potentially confounding signals across thematic areas, European firms working in the pharmaceutical sector were selected as the population for the study. Firms were also coded by the country in which they were headquartered, in order to control for differences of national policy context. Age of firms was computed based on their incorporation date. Size of firms was defined based on the number of employees. These data, as well as financial information, were purchased from the Capital IQ database from S&P Global. Data across sources were inter-linked via company name.
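
The linkage step can be illustrated with a minimal sketch. The text says only that records were inter-linked via company name, so the normalization rules below (lowercasing, dropping punctuation and a short list of legal-form suffixes), the function names and the sample records are all assumptions made for illustration, not the procedure actually used in the study.

```python
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"ag", "gmbh", "inc", "ltd", "plc", "sa", "nv", "ab"}


def normalize_name(name):
    """Lowercase a name, drop punctuation and trailing legal-form suffixes.

    This is deliberately crude: accented characters, for instance, are
    simply treated as separators.
    """
    cleaned = name.lower().replace(".", "")  # "A.G." becomes "ag"
    tokens = re.sub(r"[^a-z0-9 ]", " ", cleaned).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def link_by_name(financial_records, innovation_records):
    """Attach patent counts to financial records sharing a normalized name."""
    by_name = {normalize_name(r["name"]): r for r in innovation_records}
    linked = []
    for record in financial_records:
        match = by_name.get(normalize_name(record["name"]))
        if match is not None:
            linked.append({**record, "patents": match["patents"]})
    return linked


# Invented sample records, for illustration only.
financials = [{"name": "Example Pharma AG", "employees": 120}]
patents = [{"name": "EXAMPLE PHARMA A.G.", "patents": 4}]
print(link_by_name(financials, patents))
```

Exact matching on a normalized name is the simplest possible design. It will miss renamed or merged firms and can conflate distinct firms with similar names, which is one reason name-based linkage tends to introduce noise.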

Growth was measured as year-over-year changes in revenue and (separately) as year-over-year changes in number of employees. In this way, outcomes could be tracked along two dimensions that are often emphasized in the political discourse commonly used to frame discussions of innovation: innovation as generating new wealth, and innovation as driving job creation and employment—both highlighted issues in an economy with a lot of slack.

Innovation activities undertaken in a given year may not have immediate effects, and so the innovation indicators (papers, patent applications, Framework Programme participation) were measured against growth outcomes after one year and after two years, to account for delays between innovation and the benefits achieved thereby. Innovation activities were also mapped against two-year averages for growth outcomes, to mitigate the noise introduced by any normal year-over-year variance.
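
The growth measures and lags described above can be sketched as follows. This is an illustration under stated assumptions: the function names and revenue figures are invented, "after one year" is read as the growth rate recorded in the year following the innovation activity, and the two-year average is read as the mean of the two following years' growth rates. The study may have defined these differently.

```python
def yoy_growth(series):
    """Return {year: fractional change versus the previous year}.

    `series` maps a year to a value such as revenue or employee count.
    """
    growth = {}
    for year in sorted(series):
        prev = series.get(year - 1)
        if prev:  # skip years with a missing or zero baseline
            growth[year] = (series[year] - prev) / prev
    return growth


def lagged_outcomes(growth, year):
    """Growth outcomes matched to innovation activity undertaken in `year`."""
    g1 = growth.get(year + 1)
    g2 = growth.get(year + 2)
    both_known = g1 is not None and g2 is not None
    avg = (g1 + g2) / 2 if both_known else None
    return {"lag_1": g1, "lag_2": g2, "two_year_avg": avg}


# Invented revenue figures for one hypothetical firm.
revenue = {2010: 100.0, 2011: 110.0, 2012: 121.0, 2013: 145.2}
growth = yoy_growth(revenue)
print(lagged_outcomes(growth, 2011))
```

The same two functions apply unchanged to employee counts, which gives the second growth dimension tracked in the study.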

The population also had some known biases—for example, 30% of firms covered were private firms, whereas in fact about 90% of pharmaceutical firms in Europe are privately owned. Furthermore, this unbalanced coverage probably has relevant interactions with firm size, as small firms are much more likely than large ones to be among those missing from the data set. The size bias is corroborated (though not confirmed) by the data on firm size in the chemical sector across Europe: only 1% of firms in the data set had fewer than 10 employees, whereas such firms should represent about half the firms in the data set if the pharmaceutical sector follows the broader European trend of chemical manufacturing in general.

Effects of innovation on growth

Patent applications showed a positive and significant correlation with growth, in terms of both employment and revenue. These effects were even more pronounced for fast-growing firms; that is to say, firms that were increasing their workforce quickly (i.e., high growth) saw higher rates of increase if they were also patenting. As for the magnitude, a 1% increase in patent applications was associated with a 0.02% increase in the employment growth rate and a 0.05% increase in the revenue growth rate.

These may seem like very small effects, but considering the typical numbers of patent applications, the small numbers here can be misleading. For instance, among the fastest-growing firms, moving from 4 to 6 patent applications annually represents a 50% increase in patenting activity, which would then be associated with a 1% increase in employment growth, and a 2.5% increase in revenue growth, numbers that are perhaps more enticing from an economic perspective. If an increase of a few percent for growth still seems small, consider that there are many factors beyond innovation that also affect growth (and have not been mapped here), so expectations around the magnitude of effect should be considered in that context.
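
The arithmetic behind that example can be checked with a short sketch. The two coefficients are the ones reported above; the function names are illustrative, and scaling the per-1% association linearly up to a 50% change is the simplifying assumption the example itself makes.

```python
# Reported associations per 1% increase in patent applications.
EMPLOYMENT_PER_1PCT = 0.02  # % change in employment growth rate
REVENUE_PER_1PCT = 0.05     # % change in revenue growth rate


def percent_change(old, new):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100.0


def associated_change(patent_change_pct, per_1pct):
    """Scale a per-1% association linearly (a simplifying assumption)."""
    return patent_change_pct * per_1pct


patent_change = percent_change(4, 6)  # moving from 4 to 6 applications
employment_change = associated_change(patent_change, EMPLOYMENT_PER_1PCT)
revenue_change = associated_change(patent_change, REVENUE_PER_1PCT)
print(patent_change, employment_change, revenue_change)
```

Because firms typically file only a handful of applications per year, one or two additional filings amount to a large percentage change, which is why the per-1% figures understate the practical size of the association.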

An interesting finding was that at the other end of the spectrum, among those firms that are shrinking, patenting activity had a negative association with future growth. This finding might suggest that while innovation done well contributes to growth, innovation gone wrong contributes to decline. For instance, investing in product development pays dividends only when the product is successful; when a product ends up going nowhere, the resources invested in it are simply lost. This double-edged sword of innovation points to an issue that is seldom raised in policy discussions: innovation is only good for companies when they succeed at it. Furthermore, we want policies to support innovation, but preferably only those innovations that will ultimately be successful. How can good innovations be identified prospectively? How do we feel about investing public dollars into innovation knowing that some of it will yield nothing? What are our responsibilities towards those who are not winning the innovation game?

Publication of papers shows a pattern similar to that of patenting activity, but the effects are shifted downward. That is, the benefits of publication for high-growth firms are not as intense (within the same time frame), while the detriments of publication for firms that are stumbling are even more acute than those associated with patenting. These findings may suggest that publishing scholarly articles is part of an innovation strategy that plays out over a longer term and comes with higher risk in the short term.

Participation in Framework Programme projects has a slightly negative association with growth, raising several interesting questions. If participation were coded as a binary (i.e., yes/no) variable, would the same results be obtained, or would it have a positive impact? If binary participation is positive but the scalar variable is negative, this might suggest that firms are challenged by over-extending themselves with too many projects. As with publishing papers, the question arises about whether the time scales examined here are the appropriate ones for the innovation strategy in question.

Putting these innovation activities aside momentarily, the strongest predictor of future growth discovered here is actually past performance (that is, current growth), though this finding applies less to small and medium-sized enterprises (SMEs) and more to large companies. The effect of this parameter is larger than any effect detected for innovation. This finding challenges the view that innovation is a strong force of economic (and social) renewal. A firm’s current status and the direction in which it is trending have much more influence than innovation does on whether that firm flourishes or flounders, although innovation seems to have more influence on the fate of SMEs. The flip side of this interpretation is that large firms are entrenched and hard to displace: they do not seem to operate in a competitive world where innovation is the primary avenue to ensure that one is not left behind by evolving rivals.

Bringing it all together

The previous two case studies, brought together with the present one, form a nice little narrative. Collaborative research that crosses disciplinary and sectoral boundaries is meant to promote innovation, which in turn is meant to drive economic and social prosperity. In reality, very little research seems to be taken up into innovation, and the formation of multidisciplinary and public–private research teams seems to bring about only a modest increase in that uptake. Innovation does seem to make a positive contribution to the employment and revenue growth of companies—though the study conducted here was extremely limited, both in its scope and in the quality of data available. However, innovation only makes this contribution when done well, and how to separate promising innovation from the rest is not at all clear.

Furthermore, large firms seem to continue along their same paths, regardless (to a degree) of their innovation activities—those that have performed well continue to do so, with innovation playing a far less important role in determining their outcomes. By contrast, innovation has a stronger effect for small and medium-sized enterprises, for whom past performance has a weaker influence on future outcomes.

But have these measures really gotten to the heart of the problem for which innovation was meant to provide a solution? Growth in revenue and employment is important, but presumably we hope for that prosperity to be shared communally, especially if it is driven by publicly supported innovation. Do regional or national levels of employment or wealth increase with innovation? How are job opportunities and wealth distributed amongst the individuals within the society? Disruptive innovations can create a new class of jobs to drive future employment, just as innovating firms can put the rest of their region, or entire existing professions, out of work.

The gains through innovation measured here have all been within the firms studied, but those gains in one firm may be offset (or more than offset) by losses in other firms and the region more generally. These concerns call for critical reflection on social narratives around innovation, which has the potential to bring about considerable change—some but not all of it for the better.



Science-Metrix’s final report for this data mining project is available from the Publications Office of the European Union.

Data Mining. Knowledge and technology flows in priority domains within the private sector and between the public and private sectors. (2017). Prepared by Science-Metrix for the European Commission. ISBN 978-92-79-68029-8; DOI 10.2777/089


All views expressed are those of the individual author and are not necessarily those of Science-Metrix or 1science.


About the author

Brooke Struck

Brooke Struck works as a policy analyst at Science-Metrix in Montreal, where he puts his background in philosophy of science to good use in helping policy types and technical types to understand each other a little better every day. He also takes gleeful pleasure in unearthing our shared but buried assumptions, and generally gadfly-ing everyone in his proximity. He is interested in policy for science as well as science for policy (i.e., evidence-based decision-making), and is progressively integrating himself into the development of new bibliometric indicators at Science-Metrix to address emerging policy priorities. Before working at Science-Metrix, Brooke worked for the Canadian Federal Government. He holds a PhD in philosophy from the University of Guelph and a BA with honours in philosophy from McGill University.
