Impact Evaluation: We have cool methods! But we need to work better with others.

We were privileged to have Jack Colford of UC Berkeley deliver a 4-hour (!) session on impact evaluation in the Neuroscience building yesterday. All in all, Dr Colford's insight was even more impressive than the list of letters next to his name. He ran through applied examples of evaluating the impact of global health interventions in California, Mexico, Kenya, Bangladesh and India, to name a few ('where hasn't he run a study?' was one of my unanswered questions).

If you missed it (or forgot a pen), some highlights of his insight included:

1. Stepped wedge randomized controlled trials: In this design, communities are randomized to receive the intervention at successive time intervals. At time x0, no community has it; at x1, some number of communities get 'treated'; at x2, some more communities get it, and so on. In this way, all communities eventually receive the intervention. Further, every community serves as both a control and a treated group at different time points (sound familiar to a case-crossover?!), enabling some particularly interesting analyses (such as separating the effect in treatment compliers from communities who would have seen the effect regardless of treatment!)
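To make the design concrete, here is a minimal sketch of how a stepped wedge schedule might be generated. The function name and wave-splitting rule are my own illustration, not anything from Dr Colford's talk: communities are shuffled, split into roughly equal waves, and each community records the step at which it crosses over from control to intervention.

```python
import random

def stepped_wedge_schedule(communities, n_steps, seed=0):
    """Randomly assign communities to treatment waves.

    Returns a dict mapping community -> the step (1..n_steps) at which
    it crosses over from control to intervention. Before its assigned
    step, each community contributes control observations; from that
    step onward, treated observations.
    """
    rng = random.Random(seed)
    order = communities[:]
    rng.shuffle(order)
    # Split the shuffled list into roughly equal waves, one per step.
    waves = [order[i::n_steps] for i in range(n_steps)]
    return {c: step + 1 for step, wave in enumerate(waves) for c in wave}

schedule = stepped_wedge_schedule(["A", "B", "C", "D", "E", "F"], n_steps=3)
```

By the final step every community is treated, which is exactly what makes the design attractive when withholding the intervention entirely would be unacceptable.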

2. Different options for randomization when you know baseline characteristics are dissimilar:
-Match similar pairs, then randomize 1 partner to receive the intervention, and 1 to be control
-'Big stick' randomization: let a computer create tens of thousands of randomization sequences, and flag the sequences that meet your pre-specified 'acceptable' criteria for similar distributions of characteristics. Of the resulting possible randomizations, you randomly choose 1.
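The 'big stick' procedure can be sketched in a few lines. This is an illustrative toy, not code from the session: the acceptability rule here (arm means of one baseline covariate within a tolerance) is an assumption standing in for whatever balance criteria a real trial would pre-specify.

```python
import random
from statistics import mean

def big_stick_randomize(baseline, n_candidates=10_000, tol=0.5, seed=0):
    """Restricted ('big stick') randomization sketch.

    baseline: dict of unit -> baseline covariate value.
    Generates many random 1:1 splits, keeps only those whose arm means
    differ by less than `tol`, then randomly picks one acceptable split.
    """
    rng = random.Random(seed)
    units = list(baseline)
    half = len(units) // 2
    acceptable = []
    for _ in range(n_candidates):
        rng.shuffle(units)
        treat, control = units[:half], units[half:]
        if abs(mean(baseline[u] for u in treat) -
               mean(baseline[u] for u in control)) < tol:
            acceptable.append((sorted(treat), sorted(control)))
    if not acceptable:
        raise ValueError("no split met the balance criteria; loosen tol")
    return rng.choice(acceptable)

treat, control = big_stick_randomize({"A": 1, "B": 2, "C": 3, "D": 4})
```

The final random choice among the acceptable sequences is what preserves the randomization: you constrain which allocations are possible, not which one you get.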

3. Retrospective impact evaluation: Sometimes (most of the time!), the program is already in place, so it is not possible to implement it randomly or via stepped wedge. In this case, communities can be assigned propensity scores, i.e. the probability of having received the intervention (based on a set of covariates you derive from various sources). These communities are compared against controls with similar propensity scores who happened to miss out on the intervention. So you're essentially 'mimicking' randomization in hindsight.
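A simple way to picture the comparison step is greedy nearest-neighbour matching on the scores. This sketch is my own illustration (the function, caliper value, and unit names are hypothetical); it assumes the propensity scores have already been estimated, e.g. from a logistic regression of intervention status on baseline covariates.

```python
def match_on_propensity(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    treated, controls: dicts of unit -> propensity score.
    Returns (treated_unit, control_unit) pairs whose scores differ
    by at most `caliper`; each control is used at most once.
    """
    available = dict(controls)
    pairs = []
    for t, pt in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c = min(available, key=lambda u: abs(available[u] - pt))
        if abs(available[c] - pt) <= caliper:
            pairs.append((t, c))
            del available[c]
    return pairs

pairs = match_on_propensity(
    treated={"T1": 0.30, "T2": 0.70},
    controls={"C1": 0.32, "C2": 0.68, "C3": 0.10},
)
```

Controls whose scores fall outside the caliper of every treated unit are simply left unmatched, which is the price of 'mimicking' randomization after the fact.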

4. Collaboration is essential but flawed: Partners (NGOs, funders, governments and otherwise) are highly necessary, but it seems that differing reward systems hinder these collaborations' potential:
-People at the World Bank are more highly rewarded for getting something done than for actually doing something.
-People in NGOs are more highly rewarded for doing something than for evaluating whether it was done right.
-We, researchers, are more highly rewarded for publishing whether something was done right than for disseminating this vital information to stakeholders.
In other words, the system is a series of contra-indications (let me guess – as if the whole world wasn't enough, it's on us to fix this one too).

For more on Jack Colford’s work and methods:

Jack Colford
-Paper illustrating stepped wedge RCT design
-Methodological paper on estimating Complier Average Causal Effect in stepped wedge design
-Paper illustrating matching when intervention already exists
-Methodological paper on matching when intervention already exists

Investigative Academia

The voice on the other end of a phone call said he was a nice guy. ‘Nice’ is rarely good enough to prompt international negotiations with enemies to plot a hostage rescue. But it was enough in this case because the lawyer on the other end of the phone saw glimpses of himself in the kidnapped young American. Fortunately, he was also in a position to snowball contacts within the legal system, the FBI and terrorist organizations themselves to find Peter Kassig. Unfortunately, the negotiations failed.*

Investigative journalism seems exciting. The story might start with just an idea. But then imagine travelling to remote (and dangerous) places to chase this idea. People tell you about people, who tell you about other people. You learn more and more about the context, the culture, and the answers to the infinite number of ‘whys’. Even if the result is only 1 good interview, a story emerges.

The fast-paced, chase-the-story world of (investigative) journalism is comparable to the slow-paced, prove-the-story world of (quantitative) epidemiology. They both cater to curious people who can't stop asking 'Why?'. But they answer their questions in different ways. A conversation between a journalist (J) and an academic (A) might go like this:

Question: Why did Ebola spread in Liberia but not in the U.S.?

A: What is the hypothesis? We need some results to (dis)prove it.
J: All we know is that it did spread faster in Liberia. We need to find the story to explain this.
A: If we run a regression of Ebola rates on healthcare expenditure and other confounders, the results could show what drives the disease spread.
J: If we interview key people in Liberia and the U.S., we can piece together the story of how the disease spread in each country.
A: It will be hard to compare data collected in Liberia versus the U.S. The results will be biased.
J: Key people's words cannot always be taken at face value, so hard-to-find informants are required to get all sides of the story.
A: How would we know these peoples’ accounts are accurate?
J: We don’t. But we fact-check. How do we know if regression results make sense?
A: We don’t. But we review all the existing literature on the topic and use rigorous methods.

Both have problems. But each lacks something that can be found in the other. Journalistic thinking in academic research, and vice versa, could only lead to work that does a better job of finding the truth. The head of the Social Policy department at LSE described such a collaboration with the Guardian in 2013:

Perhaps a report on an attempted hostage rescue is outside an academic's realm, while tracing Patient 0 is outside a journalist's. But the possibilities of drawing on each other's thinking are intriguing.

*Full Story: