Although this guest post for the Global Anticorruption Blog is not formally part 4 in a series on corruption among development NGOs, conceptually it certainly belongs to it. For the background of this series, see part 1. The original of this post can be found here (the below is a slightly edited version). The full reports underlying the series can be accessed on my publications page. Parts two and three of the series can be accessed here and here.
Why anticorruption practitioners should scrutinize and challenge research methodology
In a previous post, I described a survey used to estimate the incidence of fraud and associated problems within the Cambodian NGO sector. The response to the results of that survey has so far been somewhat disheartening—not so much because the research has had little influence on action (the fate of most such research), but rather because those who have been told about the study’s results have all taken the results for granted, questioning neither their meaningfulness nor how they were generated. Such at-face-value uptake is, paradoxically, a huge risk to the longer-term public acceptance of the evidence produced by social science research. I am relieved that methodological considerations (issues of publication bias, replicability, p-hacking, and others) are finally getting some traction within the social science community, but it is evident that the decades-long neglect of these problems dovetails with a public opinion climate that doubts and disparages social science expertise.
Lack of attention to the methodological underpinnings of “interesting” conclusions is hardly a remarkable fate for corruption research results, nor is it specific to corruption research. But the anticorruption community has a lot to lose from distrust in research, and thus a lot to gain by ensuring that the findings it uses to build its cases pass basic quality checks. For the remainder of this post I’ll examine some basic questions that the Cambodia NGO corruption survey’s results should have triggered before being accepted as credible and meaningful:
- The point of using a survey (it should probably go without saying) is to make a credible estimate of some social fact; the survey data are input for an estimation process, not the estimate itself. Survey data are vulnerable to manifold well-known biases. While some of these can be ameliorated if the number of respondents is large enough, in most cases survey data must be combined with other information and interpreted through an explicit framework (partially evidence-based, partially theoretical) to derive estimates from them. Without some kind of triangulation and explicit, challengeable reasoning, data don’t become information. Executive summaries have little choice other than to concentrate on the results (rather than on all this “back-office” work). But the distinguishing characteristic of research-based estimates is the back-office work, and I wasn’t prepared for the total lack of interest my policy and practice audiences had in these methodological issues. I fear that policy-makers and practitioners (and the public) too often look to research not for evidence (that might change their minds), but for confirmation of their pre-existing opinions and agendas (and that they will ignore research that challenges those opinions and agendas).
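To make the “data are input, not the estimate” point concrete, here is a minimal sketch of one standard piece of back-office work: correcting a raw survey proportion for misclassification (e.g., underreporting of fraud) using the Rogan–Gladen estimator. All numbers below are hypothetical illustrations, not figures from the Cambodia survey; the assumed sensitivity and specificity are exactly the kind of explicit, challengeable inputs the framework should state.

```python
def corrected_rate(observed, sensitivity, specificity):
    """Rogan-Gladen correction: back out a 'true' incidence rate from an
    observed survey proportion, given assumed reporting error rates.
    sensitivity = P(respondent reports fraud | fraud occurred)
    specificity = P(respondent reports no fraud | no fraud occurred)
    """
    est = (observed + specificity - 1) / (sensitivity + specificity - 1)
    return min(max(est, 0.0), 1.0)  # clamp to a valid proportion

# Hypothetical: 25% of NGOs report fraud, but we assume only 70% of
# affected NGOs disclose it, while 95% of unaffected NGOs correctly
# report none. The corrected estimate is noticeably higher than 25%.
print(round(corrected_rate(0.25, sensitivity=0.70, specificity=0.95), 3))
```

The point of the sketch is not the particular formula but that the assumptions doing the work (here, sensitivity and specificity) are written down where a reader can challenge them.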
- Even when a survey produces a reasonably credible estimate of the incidence of some phenomenon, this still doesn’t mean very much by itself: It requires comparison to some other setting or benchmark. In the Cambodia survey, my variable of interest was the fraction of local development NGOs that had been affected by some measure of fraud during the preceding two years. Let’s say we find, in a survey like this, that the rate of fraud in development NGOs in country X is 30%. The meaning of that percentage is very different if fraud incidence in country X’s private sector is 20%, as compared to when we know it is 50%. On top of that, to make these comparisons meaningful, we have to take into account other possible differences between these settings (for example, differences in the effectiveness of the organizational and institutional checks and balances). If there are significant differences on other dimensions, the comparisons become more complicated. The Cambodia survey project did not establish clear baselines or plausible comparison groups. This is not an unusual problem; indeed, the same problem applies to much of the corruption-related survey research out there. Again, what bothered me most was not the problem itself, but the apparent total lack of interest from my audience in the issue.
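Even before worrying about differences between settings, a comparison of two survey percentages has to survive basic sampling uncertainty. The sketch below, with purely hypothetical sample sizes, shows why a 30%-vs-20% gap can be indistinguishable from noise at the sample sizes typical of NGO surveys (a simple Wald interval is used for brevity; Wilson intervals are preferable for small samples in real work).

```python
import math

def wald_ci(p, n, z=1.96):
    """Approximate 95% Wald confidence interval for a proportion."""
    se = math.sqrt(p * (1 - p) / n)
    return (max(p - z * se, 0.0), min(p + z * se, 1.0))

# Hypothetical numbers: 30% fraud incidence among 80 surveyed NGOs
# versus a 20% benchmark from 100 private-sector firms.
ngo_lo, ngo_hi = wald_ci(0.30, 80)
biz_lo, biz_hi = wald_ci(0.20, 100)
print((round(ngo_lo, 3), round(ngo_hi, 3)))
print((round(biz_lo, 3), round(biz_hi, 3)))
# The two intervals overlap substantially, so at these sample sizes the
# apparent 10-point gap could easily be sampling noise.
```

A benchmark number without an uncertainty range invites exactly the at-face-value uptake criticized above.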
The (in itself accurate) perception that these issues are “methodological” probably explains the lack of attention from research users, who are more interested in substance than in method. Unfortunately, unless research users act as critical customers, the substantive results they are served are far less trustworthy than they could and should be. Most corruption research is not of the purely academic variety; it explicitly aims for policy and practice relevance, is funded by policy and practice interests, and is thus shaped by their agendas. In a market like this, unless research consumers also look at “production standards,” agenda-fit is bound to trump rigor. Research consumers’ attention to basic methodological quality is a sine qua non for research to be the best it can be.
So where from here? What kind of interest would I hope for from corruption research users? What “demand” would make a difference to both methodological rigor and substantive output? By way of example: although there is plenty of survey data on corruption and fraud, we lack transparent translations of what we know (through systematic reviews of the evidence) into explicit models of corrupt systems, with probability estimates for how each explanatory factor influences the other factors it interacts with. Because we always use some kind of model—usually an implicit one—to translate data into conclusions, why not make those models as explicit as we can and then judge their usefulness against whatever new evidence comes our way? Models are tools for making debates more productive: they invite the translation of different understandings into a common language, and they force us to be specific enough to be practical.
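As a toy illustration of what “making the model explicit” can look like, here is a minimal Bayesian sketch: a prior belief about fraud incidence, stated as a Beta distribution, updated with (entirely hypothetical) survey counts. Nothing here is from the Cambodia study; the point is only that prior, evidence, and updating rule are all on the table for others to challenge.

```python
def update_beta(prior_a, prior_b, affected, total):
    """Conjugate Beta-Binomial update: combine a Beta(a, b) prior belief
    about an incidence rate with binomial survey evidence."""
    return prior_a + affected, prior_b + (total - affected)

# Hypothetical: a prior of Beta(2, 8) (mean 0.20, i.e. a weakly held
# belief that ~20% of NGOs are affected), updated with a survey finding
# 24 affected NGOs out of 80.
a, b = update_beta(2, 8, affected=24, total=80)
posterior_mean = a / (a + b)
print(round(posterior_mean, 3))
```

Whether or not one likes this particular formalism, writing the model down forces the debate onto its assumptions rather than onto bare percentages.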
As is customary, some sound candy to reward you for scrolling down: