It appears that the research & innovation policy community is not the only one struggling to demonstrate societal benefit. In recent months, the federal Liberal government unveiled two online initiatives intended to increase transparency by sharing information about government activities and outcomes. The challenges these two platforms face amply demonstrate the difficulty of impact assessment. They are the same challenges facing the science policy community, and this post explains how the shortcomings of these online platforms might help to elucidate some potential solutions.
Shiny new websites…
First off, what are the online platforms the federal government launched, and what are they for? The more publicly visible of the two is the Mandate Letter Tracker, launched in December with the very promising URL www.canada.ca/Results. The tracker takes all of the commitments from the mandate letters of federal cabinet ministers and uses indicators to determine which have been fulfilled (in either original or modified form), which are on track, which are bogged down in the weeds, and which are completely dead. The stated purpose of the tracker is to provide information transparently to citizens so that they can hold elected officials to account more effectively.
The second website, which was launched with far less public fanfare, is an online database that collates information from the Departmental Results Reports (also known as Departmental Performance Reports, or DPRs) from federal bodies. Such documents have actually been public for ages, but were buried deep in the bowels of government websites and never aggregated year over year or across departments. All in all, this new website is a serious improvement for anyone looking to undertake comparative work to understand the functioning of government.
…but with a marred finish
The common challenge facing each of these initiatives—and facing the science policy community as well—is that they just don’t make a compelling case that the work undertaken has had any real impact on the world. Examples help to illustrate the problem.
- On the DPR website, one objective of the Treasury Board Secretariat is to “promote value for money and results for Canadians in programs and operations,” a noble and worthy cause indeed, and one that I think a lot of Canadians would stand behind. What Canadians are less likely to stand behind, however, is the indicator used to track it: the share of federal bodies that agree that “the Secretariat provides an effective challenge function.” A cynical reading would say that the objective is to convince at least 70% of federal organizations (which depend on the Secretariat for their money to arrive) that the Treasury Board is doing a good job.
- On the Mandate Letter website, one of the commitments from the Minister of International Trade is to “improve Canadian competitiveness, create jobs and generate economic growth,” again something that many Canadians probably feel is important. The indicator to track this is that “Trade and investment boosts Canadian economic growth,” which apparently it has: jobs, GDP and exports are all up since last year. Furthermore, the government has launched several plans and strategies in this area. But are any of the government’s activities actually connected to the good news story about improving economic conditions, or is there just a favourable wind that would be blowing regardless of what the government gets up to?
Two major flaws
These examples typify the two major categories of shortcomings that run throughout these impact assessment exercises, which I have found routinely unsatisfying in basically every government context I’ve seen. In almost all cases, the impact that the program or activity targets is something truly valuable, something that people really do care about and would appreciate seeing improved. The first category of shortcoming is a woefully inadequate operationalization of that target into an indicator. We want our government to operate efficiently and deliver results when it spends our money, but we don’t believe for a second that one set of bureaucrats patting another set of bureaucrats on the back is a good way to determine whether government funds were spent wisely.
The second category is slightly different. Here the targeted impact is a valuable one, and the indicator chosen to track it is probably more widely acceptable. The problem is that no plausible pathway is outlined to explain how the activities (or spending, or planning) of the government actually contributed to these outcomes, positively or negatively. This tension between the two categories is a common one, and it mirrors the challenge faced in the science policy world: there are things that are easy to measure, and then there are things that we care about, and often a wide gulf in between.
How might we polish them?
Maybe there’s a different solution to this problem. At the 2017 Canadian Science Policy Conference, one of the more inspiring panels about research impact pointed out that impact assessment is the alter ego of impact planning—a point that seems so intuitively obvious but which hides in plain sight until someone points it out. What if we used impact indicators to inform operational decisions rather than just to tell good stories about our successes? Or, to adopt the lingo of my evaluation friends, what if we used impact evaluation for instrumental and process purposes, rather than just for accountability?
Such a shift is exactly what Eric Ries discusses in his famous book, The Lean Startup. He differentiates between “actionable metrics” and “vanity metrics.” Actionable metrics are always tied to a decision, selecting a course of action from among a set of options. Vanity metrics often track the outcomes that we want to achieve, but they tell us nothing about what to do.
For instance, the profitability of a company can be achieved in many ways, notably by firing a bunch of staff, which effectively cuts current expenses at the cost of future viability. Profitability on its own also says nothing about why you were able to achieve this outcome. By contrast, the engagement rate of users with one product version versus another (A/B testing) gives a clear indication of which version of the two should be rolled out at scale. The achievement here is less sexy, because ultimately investors are probably (sadly) more interested in profit than in engaging customers, but nonetheless the actionable information is much more valuable for deciding what to do.
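Ries’s actionable/vanity distinction can be made concrete with a small sketch. The function below is purely illustrative (the names, sample numbers, and 95% confidence threshold are my assumptions, not from Ries): it compares the engagement rates of two product versions with a standard two-proportion z-test and returns a decision, which is the hallmark of an actionable metric.

```python
import math

def ab_decision(engaged_a, n_a, engaged_b, n_b, z_threshold=1.96):
    """Illustrative A/B decision rule (hypothetical example).

    Compares engagement rates of versions A and B using a
    two-proportion z-test. Returns "A" or "B" for the version to
    roll out, or None if the difference is not distinguishable at
    roughly 95% confidence (z_threshold=1.96 is an assumption).
    """
    p_a = engaged_a / n_a
    p_b = engaged_b / n_b
    # Pooled engagement rate under the null hypothesis of no difference
    p_pool = (engaged_a + engaged_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return None  # no variation at all: nothing to decide on
    z = (p_b - p_a) / se
    if abs(z) < z_threshold:
        return None  # inconclusive: keep testing
    return "B" if z > 0 else "A"

# Hypothetical usage: 120/1000 users engaged with A, 160/1000 with B
print(ab_decision(120, 1000, 160, 1000))
```

The point of the sketch is not the statistics but the return value: the metric is tied directly to a course of action, whereas a vanity metric like total profit would end at a number.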
That’s the connection between Ries’s point and the challenge governments and scientists face in demonstrating impact: your indicators need to be connected to your activities. When they aren’t, the indicators can neither guide your decision-making nor tell a compelling story about your successes. So perhaps we should shift our perspective: start from the decisions in front of us and how they contribute to our goals, and work from there to indicators, rather than starting from the epic narratives we wish to tell about our heroic achievements. That shift requires that we understand what impacts we intend to have and that we have a reasonable idea of how our activities contribute to them.
If part of our objective is to demonstrate our impact (not just manage our work better to improve impacts), we would be wise to engage the people we are trying to convince. Engaging citizens in discussions about societal impact and impact assessment would surely help overcome the challenges outlined above: indicators selected to track impacts that just aren’t meaningful, and stories connecting those indicators to activities that just aren’t credible. Some of Ries’s more recent work suggests that people are actually quite responsive to this more engaged approach: explaining the indicators and the reasoning behind decisions can itself instill confidence, so we need not focus exclusively on the outcomes achieved.
Note: All views expressed are those of the individual author and are not necessarily those of Science-Metrix or 1science.