Can machine learning predict poverty?

Sophie Ochmann

Artificial intelligence (AI) and machine learning (ML) are buzzwords popping up everywhere: in the media, at technology firms (think of AI beating the world champion in Go), in public policy, in academia and, most importantly, in cat videos.

But apart from getting superhumanly good at board games, discovering the concept of “cat”, and improving advertising targeting, can these new technologies add anything to development? One big constraint for researchers and policy-makers, for instance, is the lack of reliable data on poverty and wealth in developing countries. National censuses are tedious, expensive and controversial, which is why many low- and middle-income countries have administered nationwide surveys only sparsely in the past. The graph below illustrates this: most African countries have had at most one poverty survey in the past 10 years!

Alternatively, check out Figure 1 of Jean et al. (2016) for a picture of censuses on the African continent. For further reading, Morten Jerven has published an entire book about the incomplete and unreliable statistics in the African context.

That means researchers and policy-makers have had to rely on economic modelling to predict poverty, so accuracy depended heavily on the models’ assumptions. And this is where machine learning promises to be the next big thing, thanks to two recent developments. First, the amount of data from developing countries has increased dramatically thanks to cheap high-resolution satellite imagery and growing mobile phone penetration. Second, computing power has massively increased, making it possible to mine incredibly large datasets for patterns. Together, these advances have allowed researchers to start using big data and machine learning to predict income levels in developing countries.

The algorithm by Jean et al. (2016), for instance, learned to recognize features such as metal roofs or paved roads and associate these with higher incomes. As another example, Burke and Lobell’s (2017) algorithm managed to recognize and assess the yields of maize fields in western Kenya. Blumenstock et al. (2015) used mobile phone records to accurately predict wealth and asset distributions in Rwanda.

But how exactly did these algorithms learn that certain features in the data are good predictors of poverty or agricultural productivity?

The answer lies in a second, “ground truth” dataset that the algorithm treats as the reality it is trying to reconstruct from satellite images or mobile phone records alone. Jean et al. (2016) fed their algorithm night-time satellite images (nocturnal light is a known good predictor of income), so it could look for features in daytime images that correlate with more light at night, and hence with relative affluence. Burke and Lobell (2017) and Blumenstock et al. (2015) had a more tedious task: both studies administered on-the-ground surveys asking households about their farm yields and wealth, respectively. These survey answers were then passed to the algorithms as the quantity to be predicted using only satellite photos of maize fields or mobile phone records. All three papers predicted poverty or agricultural yields remarkably well, and further applications of machine learning to measuring socioeconomic outcomes are springing up everywhere.
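To make the two-step idea concrete, here is a toy sketch. Everything in it is invented for illustration: the data is synthetic random numbers, and the convolutional feature learning used in the actual papers is replaced by a simple fixed projection. Only the structure mirrors the method described above: image-like features stand in for daytime imagery, night-time light intensity plays the role of the cheap proxy label, and a ridge regression maps the learned features to a survey-style wealth measure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_villages, n_raw, n_feat = 200, 50, 40

# Stand-in for daytime satellite imagery: one feature vector per village.
daytime = rng.normal(size=(n_villages, n_raw))

# Night-time light intensity, assumed here to depend linearly on the
# daytime features (the proxy label that is available everywhere).
true_w = rng.normal(size=n_raw)
nightlights = daytime @ true_w + rng.normal(scale=0.1, size=n_villages)

# Step 1 (stand-in for CNN feature learning): compress the daytime data.
# A real pipeline learns this representation by predicting night-time
# lights from daytime images; here it is just a random projection.
projection = rng.normal(size=(n_raw, n_feat))
features = daytime @ projection

# Step 2: ridge regression from the learned features to a survey-measured
# wealth index. The synthetic "wealth" is correlated with night lights.
wealth = nightlights + rng.normal(scale=0.2, size=n_villages)

lam = 1.0  # ridge penalty
beta = np.linalg.solve(features.T @ features + lam * np.eye(n_feat),
                       features.T @ wealth)
predicted = features @ beta

# In-sample fit: share of wealth variance explained by the image features.
r2 = 1 - ((wealth - predicted) ** 2).sum() / ((wealth - wealth.mean()) ** 2).sum()
```

The point of the sketch is the division of labour: the expensive survey measure (`wealth`) is only needed for the final regression step, while the bulk of the learning can lean on cheap, abundant proxies like night-time lights.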

But that doesn’t mean policymakers can put away the census. Three major limitations to using machine learning to predict socioeconomic outcomes arise, all of which relate to the training data.

Concern number one: Intertemporal external validity

Because algorithms derive their predictions from what they have learned from the training dataset, they will always perform best if the state of the world remains similar to that dataset. For instance, what if an algorithm is trained on satellite images from 2010, but in 2011 a new type of crop appears? The algorithm would only be able to correctly label the types of crops that existed in 2010. In other words, as farming practices change, the visual markers correlated with affluence may change too. Without regular recalibration, a static algorithm would simply miss any divergence from the world as captured in the 2010 training data.
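This kind of decay can be sketched in a few lines. The example below is entirely synthetic (the years, coefficients and "visual markers" are invented): a linear model is fit on 2010-style data, then the relationship between markers and wealth is changed to mimic 2011, and the same frozen model is scored on both.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 300, 5

def r2(X, y, b):
    """Share of variance in y explained by the fixed model b."""
    resid = y - X @ b
    return 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()

# "2010": visual markers (e.g. metal roofs, paved roads) relate to wealth
# through one set of coefficients.
X_2010 = rng.normal(size=(n, d))
w_2010 = np.array([2.0, -1.0, 0.5, 0.0, 0.0])
y_2010 = X_2010 @ w_2010 + rng.normal(scale=0.1, size=n)

# Fit once on 2010 data, then freeze the model (no recalibration).
beta, *_ = np.linalg.lstsq(X_2010, y_2010, rcond=None)

# "2011": farming practices shift, so the marker-wealth relationship moves.
X_2011 = rng.normal(size=(n, d))
w_2011 = np.array([0.5, -1.0, 2.0, 1.0, 0.0])
y_2011 = X_2011 @ w_2011 + rng.normal(scale=0.1, size=n)

r2_2010 = r2(X_2010, y_2010, beta)  # near-perfect on the world it was trained on
r2_2011 = r2(X_2011, y_2011, beta)  # much worse once the world has moved on
```

Nothing about the model signals that it has gone stale; only fresh ground-truth data reveals the drop, which is exactly why periodic recalibration is needed.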

Concern number two: Interspatial external validity

Another cause for concern is an ML algorithm’s interspatial external validity. Returning to our examples, these algorithms were trained on datasets from particular African countries. Their applicability to culturally or geographically distinct areas is limited because crop types (and standards of living) vary by world region. For instance, an algorithm trained on maize fields in Kenya, like Burke and Lobell’s (2017), would fail to accurately predict yields for rice farmers in Vietnam.

Joshua Blumenstock, Assistant Professor at the University of California, Berkeley, believes that AI applications for economic development are still in their infancy, or at what he calls a “proof of concept” stage. Before ML-derived poverty maps are handed to policymakers, methods have to be developed to recalibrate the algorithms regularly at relatively low cost, feeding them independent measures of the “ground truth” to ensure their predictions stay tethered to reality.

Concern number three: The high cost of obtaining training data

Ask any researcher administering a randomised controlled trial (RCT) and they can tell you about the high cost of surveys. The 2011 UK national census was estimated to have cost £482 million. Administering national censuses, or even just surveys, in developing countries will likely be costlier still due to language barriers, logistical constraints, conflict, or simply a lack of capital and trained enumerators. The high cost of obtaining good training data can thus limit the usefulness of AI for producing better and more detailed poverty measures.
