The Case Against Data – Part II

Guest post by Toby Norman.

The “Case Against Data: Part I” laid out the argument that much of the data used by NGOs is meaningless. Numeric comparisons like “benchmarks” or “trends”, used to make sense of numbers in business, frequently do not work well in the nonprofit sector. However, we also argued that qualitative data is not enough on its own; at many levels we need quantitative data for effective decision-making. The question, then, is how can we use quantitative data better? How can we leverage numbers effectively to provide real insights? Part II of this blog post presents three important tools for managers on this front: segmentation, matrices, and counterfactuals.

Internal data is one of the most readily accessible sources of information NGOs have. Yet while many NGOs collect internal data from clients and programs, very few use it to its full potential. As noted, it’s tempting to look for trends within internal data, for example “did school enrollments go up or down this month?” But these trends are often obscured by seasonal patterns, wider economic changes, or shifting demographics. A better strategy in these cases is to segment the data. Segmentation means taking a batch of data points, e.g. school enrollments, and dividing them into a number of equal subgroups, typically quartiles (25%) or quintiles (20%). For example, imagine we have data on monthly school enrollments from 100 different branch offices of the same NGO. We then group the branch offices into quartiles of performance: those with the top 25% of enrollment figures are the first quartile, and those with the lowest 25% are the fourth. Now when we compare enrollment figures over time we’re not just comparing the aggregate data, we’re comparing quartiles against each other. How do the top performers compare to the bottom performers? Is it roughly the same branch offices in each quartile month after month, or do they fluctuate every round? And critically: what characteristics do the consistent top performers seem to share?
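As a rough illustration of the mechanics, here is a minimal sketch in Python (pandas) using made-up enrollment figures; the branch names and numbers below are purely hypothetical.

```python
import numpy as np
import pandas as pd

# Illustrative sketch only: synthetic monthly enrollment figures for 100
# branch offices (the numbers are invented purely to show the mechanics).
rng = np.random.default_rng(0)
branches = pd.DataFrame({
    "branch": [f"branch_{i:03d}" for i in range(100)],
    "enrollments": rng.normal(120, 20, size=100).round(1),
})

# Split branches into four equal-sized groups by enrollment:
# "Q1" = top 25% of enrollments, "Q4" = bottom 25%.
branches["quartile"] = pd.qcut(branches["enrollments"], q=4,
                               labels=["Q4", "Q3", "Q2", "Q1"])

# Compare segments against each other instead of tracking one aggregate trend.
print(branches.groupby("quartile", observed=True)["enrollments"]
              .agg(["count", "mean", "min", "max"]))
```

Repeating the same grouping each month makes it easy to see whether the same branches keep landing in the top quartile or whether membership churns.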

What’s powerful about segmentation is that the data is far less affected by external trends. If TB cases are on the decline, all of your health workers will be reporting fewer case finds, but your best health workers will still be finding more cases than your weakest performers. This allows managers to accurately gauge what’s possible and spend their energy strategizing how to get the rest of the organization to perform like the top quartile. And managers should not underestimate the magnitude of potential differences between segments. The well-documented “Pareto” principle, or 80/20 rule, applies just as much to NGOs as it does to businesses. Drawn from Vilfredo Pareto’s work, it describes the curious tendency for 80% of an outcome to come from 20% of the causes. For example, research has shown that 80% of a business’s sales are generated by 20% of its sales force, 80% of a country’s wealth is controlled by 20% of its population, and even 80% of agricultural output comes from 20% of the plants. If you can figure out what makes your top 20% of workers successful, you can leverage these insights to drive massive improvements amongst the rest.
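For managers who want to test how concentrated their own results are, a quick back-of-the-envelope check like the sketch below is usually enough; the case-find numbers here are invented for illustration.

```python
import numpy as np

# Quick "80/20" check on invented data: what share of total case finds
# comes from the top 20% of health workers?
rng = np.random.default_rng(1)
case_finds = rng.lognormal(mean=2.0, sigma=1.0, size=50)  # skewed toy data

sorted_finds = np.sort(case_finds)[::-1]                  # best performers first
top_20_percent = sorted_finds[: int(0.2 * len(sorted_finds))]
share = top_20_percent.sum() / sorted_finds.sum()
print(f"Top 20% of workers account for {share:.0%} of all case finds")
```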

However, segmentation has its limitations. Many NGOs are seeking to fulfill multiple objectives, so a single-minded focus on measures like “more microfinance clients” or “more medicines distributed” can be counterproductive if it is pursued to the detriment of quality. In these cases matrices become a powerful tool in the manager’s arsenal. A matrix is typically a 2×2 table that allows you to compare two measures (x and y) against each other. In business the most famous of these is probably the BCG growth-share matrix, but variations abound and matrices can be highly effective in the nonprofit world. Let’s say, for instance, that a manager wants both more and higher-quality microfinance clients. By plotting new enrollments on one axis and default rates on the other, he can quickly see where the “sweet spot” is: the place where quality meets quantity. From here basic segmentation applies: who are the top 20% of microfinance officers achieving these results, and how can we get more like them? Simplicity is key here: while matrices can quickly become three-dimensional with as many segments as you want, never underestimate the power of a quick 2×2 analysis to see what the key, blunt trends are.
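One simple way to build such a 2×2 in practice is to split officers at the median on each measure and cross-tabulate, as in this illustrative sketch; all of the officer data below is synthetic.

```python
import numpy as np
import pandas as pd

# A rough 2x2 sketch (all officer data is synthetic): classify microfinance
# officers as above/below the median on new enrollments and on default rate,
# then count who lands in each cell.
rng = np.random.default_rng(2)
officers = pd.DataFrame({
    "new_enrollments": rng.normal(40, 10, size=60).round(),
    "default_rate": rng.normal(0.08, 0.03, size=60).clip(0, 1),
})

officers["volume"] = np.where(
    officers["new_enrollments"] >= officers["new_enrollments"].median(),
    "high enrollments", "low enrollments")
officers["quality"] = np.where(
    officers["default_rate"] <= officers["default_rate"].median(),
    "low defaults", "high defaults")

# The "sweet spot" is the high-enrollments / low-defaults cell.
print(pd.crosstab(officers["volume"], officers["quality"]))
```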

Finally, one area of analysis where segmentation and matrices both fall short is analyzing overall program impact. What would have happened to a given outcome if our program never existed? Managers can often hazard a guess by comparing outcomes for clients served by their most effective workers versus their least effective ones, but such comparisons rarely satisfy donors or policy makers. And here trends and benchmarks face all of the same weaknesses discussed before. Instead, what managers need to do is search for “counterfactuals”: outcomes in places where a program didn’t happen that can be compared to outcomes in places where it did. There is an extensive literature on impact evaluation, most famously the use of randomized controlled trials (RCTs) to evaluate aid programs, and it isn’t necessary to rehash it here.

Instead, two practical notes for managers looking for counterfactuals. First, RCTs are powerful but time-consuming and expensive tools that are best used sparingly to answer truly fundamental questions. As BRAC authors have noted, the “pre-research” phase of focus groups, data analysis, and pilots before an RCT will typically tell practitioners what they need to know about a program’s effectiveness; the full experimental roll-out is more important for generating data to publish in academic journals, or to satisfy mega-donors. Second, there is a range of under-utilized tools that can construct credible counterfactuals without the cost and time of a full field experiment, ranging from propensity score matching (sketched below) to regression discontinuity methods. The World Bank’s Handbook on Impact Evaluation is a good primer for those interested in this area, and bringing academic researchers on board may be a good idea for more complex analyses. However, even without sophisticated research techniques, thoughtfully using segmentation, matrices, and counterfactuals can powerfully upgrade a manager’s analytical arsenal and extract far more value from existing data, the kind of value that can drive meaningful differences.
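For readers curious what propensity score matching looks like in code, here is a minimal sketch on synthetic data. Every variable name and number below is invented for illustration; a real analysis would need careful covariate selection and balance checks.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Minimal propensity score matching sketch on synthetic data (all variables
# and numbers are hypothetical).
rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({
    "household_income": rng.normal(100, 20, n),
    "distance_to_clinic": rng.normal(5, 2, n).clip(0.1),
})

# Program placement is deliberately non-random: it depends on the covariates.
logit = 0.02 * df["household_income"] - 0.3 * df["distance_to_clinic"]
df["in_program"] = rng.binomial(1, 1 / (1 + np.exp(-logit.to_numpy())))

# Outcome with a "true" program effect of 3 built into this toy data.
df["outcome"] = (0.5 * df["household_income"] + 3.0 * df["in_program"]
                 + rng.normal(0, 5, n))

# Step 1: estimate each unit's propensity to be in the program.
X = df[["household_income", "distance_to_clinic"]]
df["pscore"] = LogisticRegression().fit(X, df["in_program"]).predict_proba(X)[:, 1]

treated = df[df["in_program"] == 1]
control = df[df["in_program"] == 0]

# Step 2: match each treated unit to the control unit with the closest score.
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_controls = control.iloc[idx.ravel()]

# Step 3: compare mean outcomes across the matched groups.
effect = treated["outcome"].mean() - matched_controls["outcome"].mean()
print(f"Estimated program effect: {effect:.2f}")
```

The estimate here should land near the effect of 3 built into the toy data; whether the same logic holds on real data depends entirely on whether the measured covariates capture how the program was actually targeted.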

Toby Norman is a PhD candidate in Management Studies at Judge Business School, University of Cambridge.
