Poverty Mapping Literature

	A	B	C	D	E	F
1	Title	Year	People	URL	Abstract	Countries

2	Estimating global economic well-being with unlit settlements.	2022	McCallum, I., Kyba, C.C.M., Bayas, J.C.L., Moltchanova, E., Cooper, M., Cuaresma, J.C., Pachauri, S., See, L., Danylo, O., Moorthy, I. and Lesiv, M.	https://www.nature.com/articles/s41467-022-30099-9	It is well established that nighttime radiance, measured from satellites, correlates with economic prosperity across the globe. In developing countries, areas with low levels of detected radiance generally indicate limited development – with unlit areas typically being disregarded. Here we combine satellite nighttime lights and the world settlement footprint for the year 2015 to show that 19% of the total settlement footprint of the planet had no detectable artificial radiance associated with it. The majority of unlit settlement footprints are found in Africa (39%), rising to 65% if we consider only rural settlement areas, along with numerous countries in the Middle East and Asia. Significant areas of unlit settlements are also located in some developed countries. For 49 countries spread across Africa, Asia and the Americas we are able to predict and map the wealth class obtained from ~2,400,000 geo-located households based upon the percent of unlit settlements, with an overall accuracy of 87%.
3	Combining Twitter and Earth Observation Data for Local Poverty Mapping	2020	Lukas Kondmann, Matthias Haeberle and Xiao Xiang Zhu	https://elib.dlr.de/137109/1/ML4D_NeurIPS_2020_Camera_Ready.pdf	Accurate and timely data on economic development is essential for policy-makers in low- and middle-income countries where such data is often unavailable. To fill this gap, existing approaches have used alternative data sources to proxy for levels of local development such as satellite imagery or mobile phone data. In this paper, we underline the power of an underrated data source for poverty mapping: geolocated tweets. We show that the number of tweets in a region as singular input can already explain 55% of the variation in local wealth in Sub-Saharan Africa with a simple Random Forest model. When nighttime light and Twitter usage information are combined as inputs to a Random Forest model they already explain 65% of the variation in local wealth which is in the range of state-of-the-art neural network architectures based on satellite images. Our results show that the naive combination of these data sources in a random forest is already competitive in performance and more elaborate fusion approaches are a promising direction to advance the accuracy of poverty mapping.
4	Poverty Prediction of Nigeria by using Convolutional Neural Network with Combination of Satellite Image	2020	Mustapha Ibrahim Danbirni, Ren Dongxiao	https://www.researchgate.net/profile/Mustapha_Danbirni/publication/346520238_Poverty_Prediction_of_Nigeria_by_using_Convolutional_Neural_Network_with_Combination_of_Satellite_Image/links/5fc5f37f92851c3012996752/Poverty-Prediction-of-Nigeria-by-using-Convolutional-Neural-Network-with-Combination-of-Satellite-Image.pdf	While it is important that local poverty has been targeted to help prediction and its policies have been established in emerging economies, this paper assesses the potential of features of local satellite images of high resolution that accurately presents poverty and economic well-being, with a combination of convolutional neural network (CNN). The properties of items and clothing are disbursed from Nigerian satellite images, which are used to assess poverty rates in Abuja and other regions. It includes the properties and density of buildings, shadow areas, which are the type of building height, spread, number of cars, density and length of roads, agricultural land and roofing materials. Buildings, shadows and road features have a strong relationship with poverty. Both application examples estimate estimates of the adjoining areas and estimate poverty in local areas using an artificially low census, confirming sample prediction capabilities other than. We have indicated that high resolution local images have the ability to change the estimate of poverty in small spaces, which have the potential for better design of surveys, and use of acquired features presents significant benefits for the methods that use satellite images to predict poverty. In this article, convolutional neural networks (CNN) directly assess poverty in high- and medium-resolution satellite images. We have come to the conclusion that CNN's estimated poverty can be created end-to-end in satellite images, but much work needs to be done to understand how the educational process affects audit patterns.
5	Generating Interpretable Poverty Maps using Object Detection in Satellite Images	2020	Ayush, K., Uzkent, B., Burke, M., Lobell, D., & Ermon, S.	https://arxiv.org/pdf/2002.01612.pdf	Accurate local-level poverty measurement is an essential task for governments and humanitarian organizations to track the progress towards improving livelihoods and distribute scarce resources. Recent computer vision advances in using satellite imagery to predict poverty have shown increasing accuracy, but they do not generate features that are interpretable to policymakers, inhibiting adoption by practitioners. Here we demonstrate an interpretable computational framework to accurately predict poverty at a local level by applying object detectors to high resolution (30cm) satellite images. Using the weighted counts of objects as features, we achieve 0.539 Pearson’s r^2 in predicting village level poverty in Uganda, a 31% improvement over existing (and less interpretable) benchmarks. Feature importance and ablation analysis reveal intuitive relationships between object counts and poverty predictions. Our results suggest that interpretability does not have to come at the cost of performance, at least in this important domain.
6	Targeting Humanitarian Aid with Machine Learning and Mobile Phone Data: Evidence from an Anti-Poverty Intervention in Afghanistan	2020	Emily L. Aiken, Guadalupe Bedoya, Aidan Coville, Joshua E. Blumenstock	http://cega.berkeley.edu/wp-content/uploads/2020/04/Aiken_MeasureDev2020_paper.pdf	Recent papers demonstrate that non-traditional data, from mobile phones and other digital sensors, can be used to roughly estimate the wealth of individual subscribers. This paper asks a question more directly relevant to development policy: Can non-traditional data be used to more efficiently target development aid? By combining rich survey data from a “big push” anti-poverty program in Afghanistan with detailed mobile phone logs from program beneficiaries, we study the extent to which machine learning methods can accurately differentiate ultra-poor households eligible for program benefits from other poor households deemed ineligible. We show that supervised learning methods leveraging mobile phone data can identify ultra-poor households as accurately as standard survey-based measures of poverty, including consumption and wealth; and that combining survey-based measures with mobile phone data produces classifications more accurate than those based on a single data source. We discuss the implications and limitations of these methods for targeting extreme poverty in marginalized populations.
7	Using publicly available satellite imagery and deep learning to understand economic well-being in Africa	2020	Christopher Yeh, Anthony Perez, Anne Driscoll, George Azzari, Zhongyi Tang, David Lobell, Stefano Ermon & Marshall Burke	https://www.nature.com/articles/s41467-020-16185-w	Accurate and comprehensive measurements of economic well-being are fundamental inputs into both research and policy, but such measures are unavailable at a local level in many parts of the world. Here we train deep learning models to predict survey-based estimates of asset wealth across ~ 20,000 African villages from publicly-available multispectral satellite imagery. Models can explain 70% of the variation in ground-measured village wealth in countries where the model was not trained, outperforming previous benchmarks from high-resolution imagery, and comparison with independent wealth measurements from censuses suggests that errors in satellite estimates are comparable to errors in existing ground data. Satellite-based estimates can also explain up to 50% of the variation in district-aggregated changes in wealth over time, with daytime imagery particularly useful in this task. We demonstrate the utility of satellite-based estimates for research and policy, and demonstrate their scalability by creating a wealth map for Africa’s most populous country.
8	Tracking poverty using satellite imagery and big data	2019	Van Dijk, M., I. Moorthy, B. Nguyen, L. See, and S. Fritz	https://pure.iiasa.ac.at/id/eprint/16240/1/WP-19-014.pdf
9	Lightweight and Robust Representation of Economic Scales from Satellite Imagery	2019	Sungwon Han, Donghyun Ahn, Hyunji Cha, Jeasurk Yang, Sungwon Park, Meeyoung Cha	https://arxiv.org/abs/1912.08197	Satellite imagery has long been an attractive data source that provides a wealth of information on human-inhabited areas. While super resolution satellite images are rapidly becoming available, little study has focused on how to extract meaningful information about human habitation patterns and economic scales from such data. We present READ, a new approach for obtaining essential spatial representation for any given district from high-resolution satellite imagery based on deep neural networks. Our method combines transfer learning and embedded statistics to efficiently learn critical spatial characteristics of arbitrary size areas and represent them into a fixed-length vector with minimal information loss. Even with a small set of labels, READ can distinguish subtle differences between rural and urban areas and infer the degree of urbanization. An extensive evaluation demonstrates the model outperforms the state-of-the-art in predicting economic scales, such as population density for South Korea (R^2=0.9617), and shows a high potential use for developing countries where district-level economic scales are not known.
10	Socioecologically informed use of remote sensing data to predict rural household poverty	2019	Watmough, Gary R and Marcinko, Charlotte LJ and Sullivan, Clare and Tschirhart, Kevin and Mutuo, Patrick K and Palm, Cheryl A and Svenning, Jens-Christian	https://www.pnas.org/content/116/4/1213.short	Tracking the progress of the Sustainable Development Goals (SDGs) and targeting interventions requires frequent, up-to-date data on social, economic, and ecosystem conditions. Monitoring socioeconomic targets using household survey data would require census enumeration combined with annual sample surveys on consumption and socioeconomic trends. Such surveys could cost up to $253 billion globally during the lifetime of the SDGs, almost double the global development assistance budget for 2013. We examine the role that satellite data could have in monitoring progress toward reducing poverty in rural areas by asking two questions: (i) Can household wealth be predicted from satellite data? (ii) Can a socioecologically informed multilevel treatment of the satellite data increase the ability to explain variance in household wealth? We found that satellite data explained up to 62% of the variation in household level wealth in a rural area of western Kenya when using a multilevel approach. This was a 10% increase compared with previously used single-level methods, which do not consider details of spatial landscape use. The size of buildings within a family compound (homestead), amount of bare agricultural land surrounding a homestead, amount of bare ground inside the homestead, and the length of growing season were important predictor variables. Our results show that a multilevel approach linking satellite and household data allows improved mapping of homestead characteristics, local land uses, and agricultural productivity, illustrating that satellite data can support the data revolution required for monitoring SDGs, especially those related to poverty and leaving no one behind.
11	Predicting Economic Development using Geolocated Wikipedia Articles	2019	Sheehan, Evan and Meng, Chenlin and Tan, Matthew and Uzkent, Burak and Jean, Neal and Lobell, David and Burke, Marshall and Ermon, Stefano	https://arxiv.org/pdf/1905.01627.pdf	Progress on the UN Sustainable Development Goals (SDGs) is hampered by a persistent lack of data regarding key social, environmental, and economic indicators, particularly in developing countries. For example, data on poverty, the first of seventeen SDGs, is both spatially sparse and infrequently collected in Sub-Saharan Africa due to the high cost of surveys. Here we propose a novel method for estimating socioeconomic indicators using open-source, geolocated textual information from Wikipedia articles. We demonstrate that modern NLP techniques can be used to predict community-level asset wealth and education outcomes using nearby geolocated Wikipedia articles. When paired with nightlights satellite imagery, our method outperforms all previously published benchmarks for this prediction task, indicating the potential of Wikipedia to inform both research in the social sciences and future policy decisions.
12	Learning to Interpret Satellite Images in Global Scale Using Wikipedia	2019	Uzkent, Burak and Sheehan, Evan and Meng, Chenlin and Tang, Zhongyi and Burke, Marshall and Lobell, David and Ermon, Stefano	https://arxiv.org/pdf/1905.02506.pdf	Despite recent progress in computer vision, fine-grained interpretation of satellite images remains challenging because of a lack of labeled training data. To overcome this limitation, we construct a novel dataset called WikiSatNet by pairing georeferenced Wikipedia articles with satellite imagery of their corresponding locations. We then propose two strategies to learn representations of satellite images by predicting properties of the corresponding articles from the images. Leveraging this new multi-modal dataset, we can drastically reduce the quantity of human-annotated labels and time required for downstream tasks. On the recently released fMoW dataset, our pre-training strategies can boost the performance of a model pre-trained on ImageNet by up to 4.5% in F1 score.
13	Measuring social, environmental and health inequalities using deep learning and street imagery	2019	Esra Suel, John W. Polak, James E. Bennett & Majid Ezzati	https://www.nature.com/articles/s41598-019-42036-w	Cities are home to an increasing majority of the world’s population. Currently, it is difficult to track social, economic, environmental and health outcomes in cities with high spatial and temporal resolution, needed to evaluate policies regarding urban inequalities. We applied a deep learning approach to street images for measuring spatial distributions of income, education, unemployment, housing, living environment, health and crime. Our model predicts different outcomes directly from raw images without extracting intermediate user-defined features. To evaluate the performance of the approach, we first trained neural networks on a subset of images from London using ground truth data at high spatial resolution from official statistics. We then compared how trained networks separated the best-off from worst-off deciles for different outcomes in images not used in training. The best performance was achieved for quality of the living environment and mean income. Allocation was least successful for crime and self-reported health (but not objectively measured health). We also evaluated how networks trained in London predict outcomes in three other major cities in the UK: Birmingham, Manchester, and Leeds. The transferability analysis showed that networks trained in London, fine-tuned with only 1% of images in other cities, achieved performances similar to ones trained on data from target cities themselves. Our findings demonstrate that street imagery has the potential to complement traditional survey-based and administrative data sources for high-resolution urban surveillance to measure inequalities and monitor the impacts of policies that aim to address them.
14	Computational Socioeconomics	2019	Jian Gao, Yi-Cheng Zhang, Tao Zhou	https://arxiv.org/pdf/1905.06166.pdf	Uncovering the structure of socioeconomic systems and timely estimation of socioeconomic status are significant for economic development. The understanding of socioeconomic processes provides foundations to quantify global economic development, to map regional industrial structure, and to infer individual socioeconomic status. In this review, we will make a brief manifesto about a new interdisciplinary research field named Computational Socioeconomics, followed by detailed introduction about data resources, computational tools, data-driven methods, theoretical models and novel applications at multiple resolutions, including the quantification of global economic inequality and complexity, the map of regional industrial structure and urban perception, the estimation of individual socioeconomic status and demographic, and the real-time monitoring of emergent events. This review, together with pioneering works we have highlighted, will draw increasing interdisciplinary attentions and induce a methodological shift in future socioeconomic studies.
15	Can tracking people through phone-call data improve lives?	2019	Amy Maxmen	https://www.nature.com/articles/d41586-019-01679-5	Researchers have analysed anonymized phone records of tens of millions of people in low-income countries. Critics question whether the benefits outweigh the risks.
16	Estimating Poverty in a Fragile Context	2019	Pape, Utz Johann and Parisotto, Luca	https://openknowledge.worldbank.org/bitstream/handle/10986/31190/WPS8722.pdf?sequence=1	The High Frequency South Sudan Survey, implemented by the South Sudan National Bureau of Statistics in collaboration with the World Bank, conducted several waves of representative surveys across seven of the ten former states between 2015 and 2017. These surveys provided a long overdue update to poverty numbers in South Sudan, with the previous national poverty estimates dating as far back as 2009. The escalation and expansion of the civil conflict posed severe challenges to the planning and implementation of fieldwork. The surveys therefore capitalized on several technological and methodological innovations to establish a reliable system of data collection and obtain valid poverty estimates. Focusing on the 2016 urban-rural wave, this paper describes the design and analysis of the survey to arrive at reliable poverty estimates for South Sudan, utilizing the Rapid Consumption Methodology combined with geo-spatial data for inaccessible survey areas.
17	Big data and big cities: The promises and limitations of improved measures of urban life	2018	Glaeser, E. L.; Kominers, S. D.; Luca, M. & Naik, N.	https://onlinelibrary.wiley.com/doi/full/10.1111/ecin.12364	New, “big data” sources allow measurement of city characteristics and outcome variables at higher collection frequencies and more granular geographic scales than ever before. However, big data will not solve large urban social science questions on its own. Big urban data has the most value for the study of cities when it allows measurement of the previously opaque, or when it can be coupled with exogenous shocks to people or place. We describe a number of new urban data sources and illustrate how they can be used to improve the study and function of cities. We first show how Google Street View images can be used to predict income in New York City, suggesting that similar imagery data can be used to map wealth and poverty in previously unmeasured areas of the developing world. We then discuss how survey techniques can be improved to better measure willingness to pay for urban amenities. Finally, we explain how Internet data is being used to improve the quality of city services. (JEL R1, C8, C18)
18	Don’t forget people in the use of big data for development	2018	Blumenstock, J.	https://www.nature.com/articles/d41586-018-06215-5/
19	Estimating Economic Characteristics with Phone Data	2018	Joshua E. Blumenstock	https://www.aeaweb.org/articles?id=10.1257/pandp.20181033	Historically, economists have relied heavily on survey-based data collection to measure social and economic well-being. Here, we investigate the extent to which the "digital footprints" of an individual can be used to infer his or her socioeconomic characteristics. Using two different datasets from Afghanistan and Rwanda, we show that phone data can be used to estimate the wealth of individuals in two very different economic environments. However, we find that such models are relatively brittle, and that a model trained in one country cannot be used to estimate characteristics in another. These results suggest several promising applications and directions for future work.
20	Gender and multidimensional poverty in Nicaragua: An individual based approach	2018	Espinoza-Delgado, J. & Klasen, S.	https://www.sciencedirect.com/science/article/abs/pii/S0305750X18302079	Most existing multidimensional poverty measures, such as the global-MPI and the MPI-LA, use the household as the unit of analysis, which means that the multidimensional poverty condition of the household is equated with the multidimensional poverty condition of all its members; accordingly, these measures ignore the intra-household inequalities and are gender-insensitive. Gender equality is, however, at the center of the sustainable development, as emphasized by Goal 5 of the SDGs; therefore, individual-based measures are indispensable to track progress in reaching this Goal. We contribute to the literature on multidimensional poverty and gender inequality by proposing an individual-based multidimensional poverty measure for Nicaragua and estimate the gender gaps in the three I’s of multidimensional poverty (incidence, intensity, and inequality). Overall, we find that in Nicaragua, the gender gaps in multidimensional poverty are lower than 5%, and poverty does not seem to be feminized. However, the inequality among the multidimensionally poor is clearly feminized, especially among adults, and women are living in very intense poverty when compared to men. We also find that adding a dimension (employment, domestic work, and social protection) under which women face higher deprivation into the analysis leads to larger estimates of the incidence, intensity, and inequality of women’s poverty. Finally, we find evidence that supports earlier studies that challenge the notion that female-headed households are worse off than those led by males in terms of poverty.
21	Human mobility and socioeconomic status: Analysis of Singapore and Boston	2018	Xu, Y.; Belyi, A.; Bojic, I. & Ratti, C.	https://www.sciencedirect.com/science/article/pii/S0198971517304179	Recently, some studies have shown that human movement patterns are strongly associated with regional socioeconomic indicators such as per capita income and poverty rate. These studies, however, are limited in numbers and they have not reached a consensus on what indicators or how effectively they can possibly be used to reflect the socioeconomic characteristics of the underlying populations. In this study, we propose an analytical framework, by coupling large scale mobile phone and urban socioeconomic datasets, to better understand human mobility patterns and their relationships with travelers' socioeconomic status (SES). Six mobility indicators, which include radius of gyration, number of activity locations, activity entropy, travel diversity, k-radius of gyration, and unicity, are derived to quantify important aspects of mobile phone users' mobility characteristics. A data fusion approach is proposed to approximate, at an aggregate level, the SES of mobile phone users. Using Singapore and Boston as case studies, we compare the statistical properties of the six mobility indicators in the two cities and analyze how they vary across socioeconomic classes. The results provide a multifaceted view of the relationships between mobility and SES. Specifically, it is found that phone user groups that are generally richer tend to travel shorter in Singapore but longer in Boston. One of the potential reasons, as suggested by our analysis, is that the rich neighborhoods in the two cities are respectively central and peripheral. For three other mobility indicators that reflect the diversity of individual travel and activity patterns (i.e., number of activity locations, activity entropy, and travel diversity), we find that for both cities, phone users across different socioeconomic classes exhibit very similar characteristics. This indicates that wealth level, at least in Singapore and Boston, is not a factor that restricts how people travel around in the city. In sum, our comparative analysis suggests that the relationship between mobility and SES could vary among cities, and such relationship is influenced by the spatial arrangement of housing, employment opportunities, and human activities.
22	Mobile phone indicators and their relation to the socioeconomic organisation of cities	2018	Cottineau, C. & Vanhoof, M.	https://www.mdpi.com/2220-9964/8/1/19	Thanks to the use of geolocated big data in computational social science research, the spatial and temporal heterogeneity of human activities is increasingly being revealed. Paired with smaller and more traditional data, this opens new ways of understanding how people act and move, and how these movements crystallise into the structural patterns observed by censuses. In this article we explore the convergence between mobile phone data and more traditional socioeconomic data from the national census in French cities. We extract mobile phone indicators from six months worth of Call Detail Records (CDR) data, while census and administrative data are used to characterize the socioeconomic organisation of French cities. We address various definitions of cities and investigate how they impact the statistical relationships between mobile phone indicators, such as the number of calls or the entropy of visited cell towers, and measures of economic organisation based on census data, such as the level of deprivation, inequality and segregation. Our findings show that some mobile phone indicators relate significantly with different socioeconomic organisation of cities. However, we show that relations are sensitive to the way cities are defined and delineated. In several cases, changing the city delineation rule can change the significance and even the sign of the correlation. In general, cities delineated in a restricted way (central cores only) exhibit traces of human activity which are less related to their socioeconomic organisation than cities delineated as metropolitan areas and dispersed urban regions.
23	Predicting financial trouble using call data—On social capital, phone logs, and financial trouble	2018	Agarwal, R. R.; Lin, C.-C.; Chen, K.-T. & Singh, V. K.	https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0191863	An ability to understand and predict financial wellbeing for individuals is of interest to economists, policy designers, financial institutions, and the individuals themselves. According to the Nilson reports, there were more than 3 billion credit cards in use in 2013, accounting for purchases exceeding US$ 2.2 trillion, and according to the Federal Reserve report, 39% of American households were carrying credit card debt from month to month. Prior literature has connected individual financial wellbeing with social capital. However, as yet, there is limited empirical evidence connecting social interaction behavior with financial outcomes. This work reports results from one of the largest known studies connecting financial outcomes and phone-based social behavior (180,000 individuals; 2 years’ time frame; 82.2 million monthly bills, and 350 million call logs). Our methodology tackles a highly imbalanced dataset, which is a pertinent problem with modelling credit risk behavior, and offers a novel hybrid method that yields improvements over both a traditional transaction data only approach and an approach that uses only call data. The results pave the way for better financial modelling of billions of unbanked and underbanked customers using non-traditional metrics like phone-based credit scoring.
24	Predicting population-level socio-economic characteristics using Call Detail Records (CDRs) in Sri Lanka	2018	Fernando, L.; Surendra, A.; Lokanathan, S. & Gomez, T.	https://dl.acm.org/citation.cfm?doid=3220547.3220549	Prior work has shown that mobile network big data can be used as a high-frequency alternative data source to derive proxy measures that have strong predictive capacity to estimate census and poverty data in developing countries. Given that the observations from these studies can be dependent on local context and regional characteristics, we replicate this work targeting two regions in Sri Lanka. We focus on Northern Province, a post-conflict region with a highly vulnerable population and Western Province, an urban region that has been relatively untouched by the conflict. We analyze the relationship between aggregate features related to consumption, social and mobility behaviors derived from pseudonymized mobile phone CDRs and census data associated with population-level socio-economic characteristics. We show that Northern Province exhibits different social and mobility patterns when compared to populations with similar socio-economic characteristics in Western Province, which highlights the importance of replicating prior research studies under different local contexts. We go on to develop predictive models that estimate the census features using the derived CDR features. Our results confirm the applicability of this methodology in a Sri Lankan, post-conflict setting, and highlight potential areas that need to be addressed in order to improve the accuracy of our prediction models.
25	Refining Coarse-grained Spatial Data using Auxiliary Spatial Data Sets with Various Granularities	2018	Tanaka, Y.; Iwata, T.; Tanaka, T.; Kurashima, T.; Okawa, M. & Toda, H.
26	The Silence of the Cantons: Estimating Villages Socioeconomic Status Through Mobile Phones Data	2018	Castillo, G.; Layedra, F.; Guaranda, M.-B.; Lara, P. & Vaca, C.	https://ieeexplore.ieee.org/abstract/document/8372308	The use of cellphones has deeply influenced the way people communicate and live every day. Because of the ubiquity of mobile phones, the geolocated information recorded by every activity carried out with them has been used in numerous studies on topics related to human mobility and its relation to socioeconomic indicators. Socioeconomic indicators like health, education and poverty provide insights about the welfare of a region. Subsequently, such geolocated records with their inherent fine granularity are key for a local government in order to take decisions over a region and promote its development. Analysis of CDRs to approximate these indicators has been mainly done over developed and emerging countries like India and Brazil, but there is still a lack of studies over developing countries. In the present study, we propose a method to predict three socioeconomic indices at a high granularity, in the context of a developing country. Our study uses the volume of mobile phone calls and SMS (Short Message Service) messages located in a province of Ecuador over different periods of time. Our results demonstrate that activities from mobile phones are an effective and accessible input for determining the economic status of a developing country's canton. We show that a high frequency of mobile phone activity is linked to a population with higher incomes and education level.
27	The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions	2018	Goulding, J.; Smith, G. & Engelmann, G.	https://ieeexplore.ieee.org/abstract/document/8622268/	Emerging economies around the world are often characterized by governments and institutions struggling to keep key demographic data streams up to date. A demographic of interest particularly linked to social vulnerability is that of poverty and socio-economic status. The combination of mass call detail records (CDR) data with machine learning has recently been proposed as a way to obtain this data without the expense required by traditional census and household survey methods. Based on a sample of 330k mobile phone subscribers resident in Dar es Salaam, Tanzania (7.6m M-Money records, 450.2m call and SMS event logs) this paper demonstrates the improvements that can be made via an alternate data stream: M-Money transaction records. An alternative to traditional banking services, particularly utilized by citizens unable to obtain a bank account, M-Money transactions provide a currently unexplored but potentially more powerful data set held by the same telecommunication companies. Comparing directly to CDR as used in prior work the results show that M-Money provides an increase in socio-demographic classification accuracy (average F1 score) from 65.9% (0.63) to 71.3% (0.7) at much finer-grained spatial regions than previously examined. Notably, the combined use of M-Money and CDR data only increases prediction accuracy (average F1 score) from 71.3% (0.7) to 72.3% (0.71), providing evidence that M-Money is informationally subsuming CDR data. The reasons for this and the importance/contributions of individual features are subsequently investigated.
28	Will the Sustainable Development Goals be fulfilled? Assessing present and future global poverty	2018	Cuaresma, J. C.; Fengler, W.; Kharas, H.; Bekhtiar, K.; Brottrager, M. & Hofer, M.	https://www.nature.com/articles/s41599-018-0083-y	Monitoring progress towards the fulfillment of the Sustainable Development Goals (SDGs) requires the assessment of potential future trends in poverty. This paper presents an econometric tool that provides a methodological framework to carry out projections of poverty rates worldwide and aims at assessing absolute poverty changes at the global level under different scenarios. The model combines country-specific historical estimates of the distribution of income, using Beta–Lorenz curves, with projections of population changes by age and education attainment level, as well as GDP projections to provide the first set of internally consistent poverty projections for all countries of the world. Making use of demographic and economic projections developed in the context of the Intergovernmental Panel on Climate Change’s Shared Socioeconomic Pathways, we create poverty paths by country up to the year 2030. The differences implied by different global scenarios span worldwide poverty rates ranging from 4.5% (around 375 million persons) to almost 6% (over 500 million persons) by the end of our projection period. The largest differences in poverty headcount and poverty rates across scenarios appear for Sub-Saharan Africa, where the projections for the most optimistic scenario imply over 300 million individuals living in extreme poverty in 2030. The results of the comparison of poverty scenarios point towards the difficulty of fulfilling the first goal of the SDGs unless further development policy efforts are enacted.
29	Poverty and shared prosperity 2018: Piecing together the Poverty puzzle	2018	World Bank Group	http://www.worldbank.org/en/publication/poverty-and-shared-prosperity	The Poverty and Shared Prosperity series provides a global audience with the latest and most accurate estimates on trends in global poverty and shared prosperity. The 2018 edition, Piecing Together the Poverty Puzzle, broadens the ways we define and measure poverty. It presents a new measure of societal poverty, integrating the absolute concept of extreme poverty and a notion of relative poverty reflecting differences in needs across countries. It introduces a multi-dimensional poverty measure that is anchored on household consumption and the international poverty line of $1.90 per person per day but broadens the measure by including information on access to education and basic infrastructure. Finally, it investigates differences in poverty within households, including by age and gender.
30	Can Human Development be Measured with Satellite Imagery?	2017	Head, A.; Manguin, Mé.; Tran, N. & Blumenstock, J. E.	https://dl.acm.org/citation.cfm?id=3136576	In many developing country environments, it is difficult or impossible to obtain recent, reliable estimates of human development. Nationally representative household surveys, which are the standard instrument for determining development policy and priorities, are typically too expensive to collect with any regularity. Recently, however, researchers have shown the potential for remote sensing technologies to provide a possible solution to this data constraint. In particular, recent work indicates that satellite imagery can be processed with deep neural networks to accurately estimate the sub-regional distribution of wealth in sub-Saharan Africa. In this paper, we explore the extent to which the same approach (using convolutional neural networks to process satellite imagery) can be used to measure a broader set of human development indicators, in a broader range of geographic contexts. Our analysis produces three main results: First, we successfully replicate prior work showing that satellite images can accurately infer a wealth-based index of poverty in sub-Saharan Africa. Second, we show that this approach can generalize to predicting poverty in other countries and continents, but that the performance is sensitive to the hyperparameters used to tune the learning algorithm. Finally, we find that this approach does not trivially generalize to predicting other measures of development such as educational attainment, access to drinking water, and a variety of health-related indicators. We discuss in detail whether these findings represent a fundamental limitation of this approach, or could be fixed through more concerted adaptations of the machine learning environment.
31	Combining disparate data sources for improved poverty prediction and mapping	2017	Neeti Pokhriyal, Damien Christophe Jacques	https://www.pnas.org/content/114/46/E9783.short	More than 330 million people are still living in extreme poverty in Africa. Timely, accurate, and spatially fine-grained baseline data are essential to determining policy in favor of reducing poverty. The potential of “Big Data” to estimate socioeconomic factors in Africa has been proven. However, most current studies are limited to using a single data source. We propose a computational framework to accurately predict the Global Multidimensional Poverty Index (MPI) at a finest spatial granularity and coverage of 552 communes in Senegal using environmental data (related to food security, economic activity, and accessibility to facilities) and call data records (capturing individualistic, spatial, and temporal aspects of people). Our framework is based on Gaussian Process regression, a Bayesian learning technique, providing uncertainty associated with predictions. We perform model selection using elastic net regularization to prevent overfitting. Our results empirically prove the superior accuracy when using disparate data (Pearson correlation of 0.91). Our approach is used to accurately predict important dimensions of poverty: health, education, and standard of living (Pearson correlation of 0.84–0.86). All predictions are validated using deprivations calculated from census. Our approach can be used to generate poverty maps frequently, and its diagnostic nature is, likely, to assist policy makers in designing better interventions for poverty eradication.
32	Constructing spatiotemporal poverty indices from big data	2017	Christopher Njuguna, Patrick McSharry	https://www.sciencedirect.com/science/article/abs/pii/S0148296316304921	Big data offers the potential to calculate timely estimates of the socioeconomic development of a region. Mobile telephone activity provides an enormous wealth of information that can be utilized alongside household surveys. Estimates of poverty and wealth rely on the calculation of features from call detail records (CDRs), however, mobile network operators are reluctant to provide access to CDRs due to commercial and privacy concerns. As a compromise, this study shows that a sparse CDR dataset combined with other publicly available datasets based on satellite imagery can yield competitive results. In particular, a model is built using two CDR-based features, mobile ownership per capita and call volume per phone, combined with normalized satellite nightlight data and population density, to estimate the multi-dimensional poverty index (MPI) at the sector level in Rwanda. This model accurately estimates the MPI for sectors in Rwanda that contain mobile phone cell towers (cross-validated correlation of 0.88).
33	Data Gaps, Data Incomparability, and Data Imputation: A Review of Poverty Measurement Methods for Data-Scarce Environments	2017	Hai-Anh Dang, Dean Jolliffe, Calogero Carletto	https://elibrary.worldbank.org/doi/abs/10.1596/1813-9450-8282	This paper reviews methods that have been employed to estimate poverty in contexts where household consumption data are unavailable or missing. These contexts range from completely missing and partially missing consumption data in cross-sectional household surveys, to missing panel household data. The paper focuses on methods that aim to compare trends and dynamic patterns of poverty outcomes over time. It presents the various methods under a common framework, with pedagogical discussion on the intuition. Empirical illustrations are provided using several rounds of household survey data from Vietnam. Furthermore, the paper provides a practical guide with detailed instructions on computer programs that can be used to implement the reviewed techniques.
34	Estimating Poverty Using Cell Phone Data: Evidence from Guatemala	2017	Marco Hernandez, Lingzi Hong, Vanessa Frias-Martinez, Enrique Frias-Martinez	https://elibrary.worldbank.org/doi/abs/10.1596/1813-9450-7969	The dramatic expansion of mobile phone use in developing countries has given rise to a rich and largely untapped source of information about the characteristics of communities and regions. Call Detail Records (CDRs) obtained from cellular phones provide highly granular real-time data that can be used to assess socio-economic behavior including consumption, mobility, and social patterns. This paper examines the results of a CDR analysis focused on five administrative departments in the south west region of Guatemala, which used mobile phone data to predict observed poverty rates. Its findings indicate that CDR-based research methods have the potential to replicate the poverty estimates obtained from traditional forms of data collection, like household surveys or censuses, at a fraction of the cost. In particular, CDRs predicted urban and total poverty in Guatemala more accurately than rural poverty. Moreover, although the poverty estimates produced by CDR analysis do not perfectly match those generated by surveys and censuses, the results show that more comprehensive data could greatly enhance their predictive power. CDR analysis has especially promising applications in Guatemala and other developing countries, which suffer from high rates of poverty and inequality, and where limited fiscal and budgetary resources complicate the task of data collection and underscore the importance of precisely targeting public expenditures to achieve their maximum antipoverty impact.
35	Gender and poverty: what we know, don't know, and need to know for Agenda 2030	2017	Sarah Bradshaw, Sylvia Chant, Brian Linneker	https://www.tandfonline.com/doi/abs/10.1080/0966369X.2017.1395821?journalCode=cgpc20	Drawing on historical debates on gender, poverty, and the ‘feminisation of poverty’, this paper reflects on current evidence, methods and analysis of gendered poverty. It focuses on initiatives by UN Women, including the Progress of the World’s Women 2015–16. Our analysis of the data compiled by UN Women raises questions about what might account for the over-representation of women among the poor in official accounts of poverty, and how this is plausibly changing (or not) over time. The paper highlights that analysis of what is measured and how needs to be understood in relation to who is the focus of measurement. The lack of available data which is fit for purpose questions the extent to which gender poverty differences are ‘real’ or statistical. There is a continued reliance on comparing female with male headed households, and we argue the move by UN Women to adopt the notion of Female Only Households reflects available data driving conceptual understandings of women’s poverty, rather than conceptual advances driving the search for better data. Wider UN processes highlight that while sensitivity to differences among women and their subjectivities are paramount in understanding the multiple processes accounting for gender bias in poverty burdens, they are still accorded little priority. To monitor advances in Agenda 2030 will require more and better statistics. Our review suggests that we are still far from having a set of tools able to adequately measure and monitor gendered poverty.
36	Inferring personal economic status from social network location	2017	Luo, S.; Morone, F.; Sarraute, C.; Travizano, M. & Makse, H. A.	https://www.nature.com/articles/ncomms15227	It is commonly believed that patterns of social ties affect individuals’ economic status. Here we translate this concept into an operational definition at the network level, which allows us to infer the economic well-being of individuals through a measure of their location and influence in the social network. We analyse two large-scale sources: telecommunications and financial data of a whole country’s population. Our results show that an individual’s location, measured as the optimal collective influence to the structural integrity of the social network, is highly correlated with personal economic status. The observed social network patterns of influence mimic the patterns of economic inequality. For pragmatic use and validation, we carry out a marketing campaign that shows a threefold increase in response rate by targeting individuals identified by our social network metrics as compared to random targeting. Our strategy can also be useful in maximizing the effects of large-scale economic stimulus policies.
37	Mapping Big Data Solutions for the Sustainable Development Goals	2017	Lokanathan, S.; Gomez, T. & Zuhyle, S.
38	Mapping poverty using mobile phone and satellite data	2017	Jessica E. Steele, Pål Roe Sundsøy, Carla Pezzulo, Victor A. Alegana, Tomas J. Bird, Joshua Blumenstock, Johannes Bjelland, Kenth Engø-Monsen, Yves-Alexandre de Montjoye, Asif M. Iqbal, Khandakar N. Hadiuzzaman, Xin Lu, Erik Wetter, Andrew J. Tatem, Linus Bengtsson	https://royalsocietypublishing.org/doi/full/10.1098/rsif.2016.0690	Poverty is one of the most important determinants of adverse health outcomes globally, a major cause of societal instability and one of the largest causes of lost human potential. Traditional approaches to measuring and targeting poverty rely heavily on census data, which in most low- and middle-income countries (LMICs) are unavailable or out-of-date. Alternate measures are needed to complement and update estimates between censuses. This study demonstrates how public and private data sources that are commonly available for LMICs can be used to provide novel insight into the spatial distribution of poverty. We evaluate the relative value of modelling three traditional poverty measures using aggregate data from mobile operators and widely available geospatial data. Taken together, models combining these data sources provide the best predictive power (highest r² = 0.78) and lowest error, but generally models employing mobile data only yield comparable results, offering the potential to measure poverty more frequently and at finer granularity. Stratifying models into urban and rural areas highlights the advantage of using mobile data in urban areas and different data in different contexts. The findings indicate the possibility to estimate and continually monitor poverty rates at high spatial resolution in countries with limited capacity to support traditional methods of data collection.
39	Measuring economic activity in China with mobile big data	2017	Dong, L.; Chen, S.; Cheng, Y.; Wu, Z.; Li, C. & Wu, H.	https://link.springer.com/article/10.1140/epjds/s13688-017-0125-5	Emerging trends in the use of smartphones, online mapping applications, and social media, in addition to the geo-located data they generate, provide opportunities to trace users’ socio-economic activities in an unprecedentedly granular and direct fashion and have triggered a revolution in empirical research. These vast mobile data offer new perspectives and approaches to measure economic dynamics, and they are broadening the social science and economics fields. In this paper, we explore the potential for using mobile data to measure economic activity in China from a bottom-up view. First, we build indices for gauging employment and consumer trends based on billions of geo-positioning data. Second, we advance the estimation of offline store foot traffic via location search data derived from Baidu Maps, which is then applied to predict Apple’s revenues in China and to accurately detect box-office fraud. Third, we construct consumption indicators to track trends in various service sector industries and verify them with several existing indicators. To the best of our knowledge, this is the first study to measure the world’s second-largest economy by mining such unprecedentedly large-scale and fine-granular spatial-temporal data. In this way, our research provides new approaches and insights into measuring economic activity.
40	Measuring the impact of economic well being in commuting networks: A case study of Bogota, Colombia	2017	Florez, M.; Jiang, S.; Li, R.; Mojica, C. H.; Transmilenio, S.; Rios, R. A. & González, M. C.	https://trid.trb.org/view/1438491	Big data such as call detail records (CDRs) from mobile phones are novel resources for travel demand models. An important open question is how to use them to extract practical information in relation to urban mobility, socioeconomic development, and well-being. Can we study individual mobility characteristics by income group through the lens of Big Data? In this paper, the authors present a data analysis framework that uses urban mobility extracted from CDRs, to study various characteristics of the commuting network of Bogota, Colombia, relating them to income groups by their residential location. They show that the diversity of commuting trips, defined in terms of entropy of the trips, increases with the income of the population. Further, they show that vehicle travel times during commuting hours from lower income groups clearly suffer longer congested travel times. The authors' results detail a method to use passively generated mobile phone data as a low cost alternative for transportation policies that can benefit from economic well-being measures for population with different income levels.
41	Night-time lights: A global, long term look at links to socio-economic trends	2017	Proville, J.; Zavala-Araiza, D. & Wagner, G.	https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174610	We use a parallelized spatial analytics platform to process the twenty-one year totality of the longest-running time series of night-time lights data—the Defense Meteorological Satellite Program (DMSP) dataset—surpassing the narrower scope of prior studies to assess changes in area lit of countries globally. Doing so allows a retrospective look at the global, long-term relationships between night-time lights and a series of socio-economic indicators. We find the strongest correlations with electricity consumption, CO2 emissions, and GDP, followed by population, CH4 emissions, N2O emissions, poverty (inverse) and F-gas emissions. Relating area lit to electricity consumption shows that while a basic linear model provides a good statistical fit, regional and temporal trends are found to have a significant impact.
42	Poverty from Space: Using High-Resolution Satellite Imagery for Estimating Economic Well-Being	2017	Ryan Engstrom, Jonathan Hersh, David Newhouse	https://elibrary.worldbank.org/doi/abs/10.1596/1813-9450-8284	Can features extracted from high spatial resolution satellite imagery accurately estimate poverty and economic well-being? This paper investigates this question by extracting object and texture features from satellite images of Sri Lanka, which are used to estimate poverty rates and average log consumption for 1,291 administrative units (Grama Niladhari divisions). The features that were extracted include the number and density of buildings, prevalence of shadows, number of cars, density and length of roads, type of agriculture, roof material, and a suite of texture and spectral features calculated using a nonoverlapping box approach. A simple linear regression model, using only these inputs as explanatory variables, explains nearly 60 percent of poverty headcount rates and average log consumption. In comparison, models built using night-time lights explain only 15 percent of the variation in poverty or income. The predictions remain accurate when restricting the sample to poorer Gram Niladhari divisions. Two sample applications, extrapolating predictions into adjacent areas and estimating local area poverty using an artificially reduced census, confirm the out-of-sample predictive capabilities.
43	Poverty Prediction with Public Landsat 7 Satellite Imagery and Machine Learning	2017	Anthony Perez, Christopher Yeh, George Azzari, Marshall Burke, David Lobell, Stefano Ermon	https://arxiv.org/abs/1711.03654	Obtaining detailed and reliable data about local economic livelihoods in developing countries is expensive, and data are consequently scarce. Previous work has shown that it is possible to measure local-level economic livelihoods using high-resolution satellite imagery. However, such imagery is relatively expensive to acquire, often not updated frequently, and is mainly available for recent years. We train CNN models on free and publicly available multispectral daytime satellite images of the African continent from the Landsat 7 satellite, which has collected imagery with global coverage for almost two decades. We show that despite these images' lower resolution, we can achieve accuracies that exceed previous benchmarks.
44	Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States	2017	Gebru, T.; Krause, J.; Wang, Y.; Chen, D.; Deng, J.; Aiden, E. L. & Fei-Fei, L.	https://www.pnas.org/content/114/50/13108	The United States spends more than $250 million each year on the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors. Although a comprehensive source of data, the lag between demographic changes and their appearance in the ACS can exceed several years. As digital imagery becomes ubiquitous and machine vision techniques improve, automated data analysis may become an increasingly practical supplement to the ACS. Here, we present a method that estimates socioeconomic characteristics of regions spanning 200 US cities by using 50 million images of street scenes gathered with Google Street View cars. Using deep learning-based computer vision techniques, we determined the make, model, and year of all motor vehicles encountered in particular neighborhoods. Data from this census of motor vehicles, which enumerated 22 million automobiles in total (8% of all automobiles in the United States), were used to accurately estimate income, race, education, and voting patterns at the zip code and precinct level. (The average US precinct contains ∼1,000 people.) The resulting associations are surprisingly simple and powerful. For instance, if the number of sedans encountered during a drive through a city is higher than the number of pickup trucks, the city is likely to vote for a Democrat during the next presidential election (88% chance); otherwise, it is likely to vote Republican (82%). Our results suggest that automated systems for monitoring demographics may effectively complement labor-intensive approaches, with the potential to measure demographics with fine spatial resolution, in close to real time.
45	Beyond the baseline: Establishing the value in mobile phone based poverty estimates	2016	Smith-Clarke, C. & Capra, L.	https://dl.acm.org/citation.cfm?id=2872427.2883076	Within the remit of 'Data for Development' there have been a number of promising recent works that investigate the use of mobile phone Call Detail Records (CDRs) to estimate the spatial distribution of poverty or socio-economic status. The methods being developed have the potential to offer immense value to organisations and agencies who currently struggle to identify the poorest parts of a country, due to the lack of reliable and up to date survey data in certain parts of the world. However, the results of this research have thus far only been presented in isolation rather than in comparison to any alternative approach or benchmark. Consequently, the true practical value of these methods remains unknown. Here, we seek to allay this shortcoming, by proposing two baseline poverty estimators grounded on concrete usage scenarios: one that exploits correlation with population density only, to be used when no poverty data exists at all; and one that also exploits spatial autocorrelation, to be used when poverty data has been collected for a few regions within a country. We then compare the predictive performance of these baseline models with models that also include features derived from CDRs, so to establish their real added value. We present extensive analysis of the performance of all these models on data acquired for two developing countries: Senegal and Ivory Coast. Our results reveal that CDR-based models do provide more accurate estimates in most cases; however, the improvement is modest and more significant when estimating (extreme) poverty intensity rates rather than mean wealth.
46	Combining satellite imagery and machine learning to predict poverty	2016	Neal Jean, Marshall Burke, Michael Xie, W. Matthew Davis, David B. Lobell, Stefano Ermon	https://science.sciencemag.org/content/353/6301/790	Reliable data on economic livelihoods remain scarce in the developing world, hampering efforts to study these outcomes and to design policies that improve them. Here we demonstrate an accurate, inexpensive, and scalable method for estimating consumption expenditure and asset wealth from high-resolution satellite imagery. Using survey and satellite data from five African countries—Nigeria, Tanzania, Uganda, Malawi, and Rwanda—we show how a convolutional neural network can be trained to identify image features that can explain up to 75% of the variation in local-level economic outcomes. Our method, which requires only publicly available data, could transform efforts to track and target poverty in developing countries. It also demonstrates how powerful machine learning techniques can be applied in a setting with limited training data, suggesting broad potential application across many scientific domains.
47	Fighting poverty with data	2016	Joshua Evan Blumenstock	https://science.sciencemag.org/content/353/6301/753	Policy-makers in the world's poorest countries are often forced to make decisions based on limited data. Consider Angola, which recently conducted its first postcolonial census. In the 44 years that elapsed between the prior census and the recent one, the country's population grew from 5.6 million to 24.3 million, and the country experienced a protracted civil war that displaced millions of citizens. In situations where reliable survey data are missing or out of date, a novel line of research offers promising alternatives. On page 790 of this issue, Jean et al. (1) apply recent advances in machine learning to high-resolution satellite imagery to accurately measure regional poverty in Africa.
48	Topic Models to Infer Socio-Economic Maps	2016	Hong, L.; Frias-Martinez, E. & Frias-Martinez, V.	https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11757	Socio-economic maps contain important information regarding the population of a country. Computing these maps is critical given that policy makers often make important decisions based upon such information. However, the compilation of socio-economic maps requires extensive resources and becomes highly expensive. On the other hand, the ubiquitous presence of cell phones is generating large amounts of spatiotemporal data that can reveal human behavioral traits related to specific socio-economic characteristics. Traditional inference approaches have taken advantage of these datasets to infer regional socio-economic characteristics. In this paper, we propose a novel approach whereby topic models are used to infer socio-economic levels from large-scale spatio-temporal data. Instead of using a pre-determined set of features, we use Latent Dirichlet Allocation (LDA) to extract latent recurring patterns of co-occurring behaviors across regions, which are then used in the prediction of socio-economic levels. We show that our approach improves state of the art prediction results by 9%.
49	Understanding the Evidence Base for Poverty–Environment Relationships using Remotely Sensed Satellite Data: An Example from Assam, India	2016	Watmough, Gary R and Atkinson, Peter M and Saikia, Arupjyoti and Hutton, Craig W	https://www.sciencedirect.com/science/article/abs/pii/S0305750X15002533	This article presents results from an investigation of the relationships between welfare and geographic metrics from over 14,000 villages in Assam, India. Geographic metrics accounted for 61% of the variation in the lowest welfare quintile and 57% in the highest welfare quintile. Travel time to market towns, percentage of a village covered with woodland, and percentage of a village covered with winter crop were significantly related to welfare. These results support findings in the literature across a range of different developing countries. Model accuracy is unprecedented considering that the majority of geographic metrics were derived from remotely sensed data.
50	Estimating Local Poverty Measures Using Satellite Images: A Pilot Application to Central America	2015	Ben Klemens, Andrea Coppola, Max Shron	https://elibrary.worldbank.org/doi/abs/10.1596/1813-9450-7329	Several studies have used satellite measures of human activity to complement measures of economic production. This paper builds on those studies by considering satellite measures for improving poverty measures. The paper uses local-scale census and survey data from Guatemala to test at how fine a scale satellite measures are useful. Results show that supplementing survey data with satellite data leads to improvements in the estimates.
51	Household surveys in crisis	2015	Meyer, B. D.; Mok, W. K. & Sullivan, J. X.	https://ideas.repec.org/a/aea/jecper/v29y2015i4p199-226.html	Household surveys, one of the main innovations in social science research of the last century, are threatened by declining accuracy due to reduced cooperation of respondents. While many indicators of survey quality have steadily declined in recent decades, the literature has largely emphasized rising nonresponse rates rather than other potentially more important dimensions to the problem. We divide the problem into rising rates of nonresponse, imputation, and measurement error, documenting the rise in each of these threats to survey quality over the past three decades. A fundamental problem in assessing biases due to these problems in surveys is the lack of a benchmark or measure of truth, leading us to focus on the accuracy of the reporting of government transfers. We provide evidence from aggregate measures of transfer reporting as well as linked microdata. We discuss the relative importance of misreporting of program receipt and conditional amounts of benefits received, as well as some of the conjectured reasons for declining cooperation and for survey errors. We end by discussing ways to reduce the impact of the problem including the increased use of administrative data and the possibilities for combining administrative and survey data.
52	Mapping slums using spatial features in Accra, Ghana	2015	Ryan Engstrom, Avery Sandborn, Qin Yu, Jason Burgdorfer, Douglas Stow, John Weeks, Jordan Graesser	https://ieeexplore.ieee.org/abstract/document/7120494	In order to map the spatial extent and location of slum settlements multiple methodologies have been devised including remote sensing based methods and field based methods using surveys and census data. In this study we utilize spatial, structural, and contextual features (e.g., PanTex, Histogram of Oriented Gradients, Line Support Regions, Hough transforms and others) calculated at multiple spatial scales from high spatial resolution satellite data to map slum areas and compare these estimates to three field based slum maps: one from the UN Habitat/Accra Metropolitan Assembly (UNAMA) and two census data derived maps based on the UN Habitat definition of a slum, a simple slum/non-slum dichotomy map, and a slum index map. When comparing the remotely sensed derived slum areas to the UNAMA slum definition results indicate an overall accuracy of 94.3% and a Kappa of 0.91. When compared to the dichotomous, census derived slum maps the results are not as accurate. This reduced accuracy is due to the substantial overprediction of slums, especially if only one criterion was missing, using the census data. In relation to the slum index, the remote sensing estimates of slums were significantly correlated with an r² of 0.45 and when population density was taken into account, the correlation increased to an r² of 0.78. Overall, the remote sensing methodology provides a reasonable estimate of slum areas and variations within the city.
53	Mobile phone call data as a regional socio-economic proxy indicator	2015	Šćepanović, S.; Mishkovski, I.; Hui, P.; Nurminen, J. K. & Ylä-Jääski, A.	https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0124160	The advent of publishing anonymized call detail records opens the door for temporal and spatial human dynamics studies. Such studies, besides being useful for creating universal models for mobility patterns, could be also used for creating new socio-economic proxy indicators that will not rely only on the local or state institutions. In this paper, from the frequency of calls at different times of the day, in different small regional units (sub-prefectures) in Côte d'Ivoire, we infer users' home and work sub-prefectures. This division of users enables us to analyze different mobility and calling patterns for the different regions. We then compare how those patterns correlate to the data from other sources, such as: news for particular events in the given period, census data, economic activity, poverty index, power plants and energy grid data. Our results show high correlation in many of the cases revealing the diversity of socio-economic insights that can be inferred using only mobile phone call data. The methods and the results may be particularly relevant to policy-makers engaged in poverty reduction initiatives as they can provide an affordable tool in the context of resource-constrained developing economies, such as Côte d'Ivoire's.
54	Night-Time Light Data: A Good Proxy Measure for Economic Activity?	2015	Charlotta Mellander, José Lobo, Kevin Stolarick, Zara Matheson	https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0139779	Much research has suggested that night-time light (NTL) can be used as a proxy for a number of variables, including urbanization, density, and economic growth. As governments around the world either collect census data infrequently or are scaling back the amount of detail collected, alternate sources of population and economic information like NTL are being considered. But, just how close is the statistical relationship between NTL and economic activity at a fine-grained geographical level? This paper uses a combination of correlation analysis and geographically weighted regressions in order to examine if light can function as a proxy for economic activities at a finer level. We use a fine-grained geo-coded residential and industrial full sample micro-data set for Sweden, and match it with both radiance and saturated light emissions. We find that the correlation between NTL and economic activity is strong enough to make it a relatively good proxy for population and establishment density, but the correlation is weaker in relation to wages. In general, we find a stronger relation between light and density values, than with light and total values. We also find a closer connection between radiance light and economic activity, than with saturated light. Further, we find the link between light and economic activity, especially estimated by wages, to be slightly overestimated in large urban areas and underestimated in rural areas.
55	Predicting poverty and wealth from mobile phone metadata	2015	Joshua Blumenstock, Gabriel Cadamuro, Robert On	https://science.sciencemag.org/content/350/6264/1073	Accurate and timely estimates of population characteristics are a critical input to social and economic research and policy. In industrialized economies, novel sources of data are enabling new approaches to demographic profiling, but in developing countries, fewer sources of big data exist. We show that an individual’s past history of mobile phone use can be used to infer his or her socioeconomic status. Furthermore, we demonstrate that the predicted attributes of millions of individuals can, in turn, accurately reconstruct the distribution of wealth of an entire nation or to infer the asset distribution of microregions composed of just a few households. In resource-constrained environments where censuses and household surveys are rare, this approach creates an option for gathering localized and timely information at a fraction of the cost of traditional methods.
56	Small area model-based estimators using big data sources	2015	Marchetti, S.; Giusti, C.; Pratesi, M.; Salvati, N.; Giannotti, F.; Pedreschi, D.; Rinzivillo, S.; Pappalardo, L. & Gabrielli, L.	https://content.sciendo.com/view/journals/jos/31/2/article-p263.xml	The timely, accurate monitoring of social indicators, such as poverty or inequality, on a fine-grained spatial and temporal scale is a crucial tool for understanding social phenomena and policymaking, but poses a great challenge to official statistics. This article argues that an interdisciplinary approach, combining the body of statistical research in small area estimation with the body of research in social data mining based on Big Data, can provide novel means to tackle this problem successfully. Big Data derived from the digital crumbs that humans leave behind in their daily activities are in fact providing ever more accurate proxies of social life. Social data mining from these data, coupled with advanced model-based techniques for fine-grained estimates, have the potential to provide a novel microscope through which to view and understand social complexity. This article suggests three ways to use Big Data together with small area estimation techniques, and shows how Big Data has the potential to mirror aspects of well-being and other socioeconomic phenomena.
57	Using big data to study the link between human mobility and socio-economic development	2015	Pappalardo, L.; Pedreschi, D.; Smoreda, Z. & Giannotti, F.	https://ieeexplore.ieee.org/document/7363835/	Big Data offer nowadays the potential capability of creating a digital nervous system of our society, enabling the measurement, monitoring and prediction of relevant aspects of socio-economic phenomena in quasi real time. This potential has fueled, in the last few years, a growing interest around the usage of Big Data to support official statistics in the measurement of individual and collective economic well-being. In this work we study the relations between human mobility patterns and socioeconomic development. Starting from nation-wide mobile phone data we extract a measure of mobility volume and a measure of mobility diversity for each individual. We then aggregate the mobility measures at municipality level and investigate the correlations with external socio-economic indicators independently surveyed by an official statistics institute. We find three main results. First, aggregated human mobility patterns are correlated with these socio-economic indicators. Second, the diversity of mobility, defined in terms of entropy of the individual users' trajectories, exhibits the strongest correlation with the external socio-economic indicators. Third, the volume of mobility and the diversity of mobility show opposite correlations with the socioeconomic indicators. Our results, validated against a null model, open an interesting perspective to study human behavior through Big Data by means of new statistical indicators that quantify and possibly "nowcast" the socio-economic development of our society.
58	Poverty on the cheap: estimating poverty maps using aggregated mobile communication networks	2014	Christopher Smith-Clarke, Afra Mashhadi, Licia Capra	https://dl.acm.org/citation.cfm?id=2557358	Governments and other organisations often rely on data collected by household surveys and censuses to identify areas in most need of regeneration and development projects. However, due to the high cost associated with the data collection process, many developing countries conduct such surveys very infrequently and include only a rather small sample of the population, thus failing to accurately capture the current socio-economic status of the country's population. In this paper, we address this problem by means of a methodology that relies on an alternative source of data from which to derive up to date poverty indicators, at a very fine level of spatio-temporal granularity. Taking two developing countries as examples, we show how to analyse the aggregated call detail records of mobile phone subscribers and extract features that are strongly correlated with poverty indexes currently derived from census data.
59	Targeting direct cash transfers to the extremely poor	2014	Abelson, B.; Varshney, K. R. & Sun, J.	https://dl.acm.org/citation.cfm?id=2623335	Unconditional cash transfers to the extreme poor via mobile telephony represent a radical, new approach to giving. GiveDirectly is a non-governmental organization (NGO) at the vanguard of delivering this proven and effective approach to reducing poverty. In this work, we streamline an important step in the operations of the NGO by developing and deploying a data-driven system for locating villages with extreme poverty in Kenya and Uganda. Using the type of roof of a home, thatched or metal, as a proxy for poverty, we develop a new remote sensing approach for selecting extremely poor villages to target for cash transfers. We develop an analytics algorithm that estimates housing quality and density in patches of publicly-available satellite imagery by learning a predictive model with sieves of template matching results combined with color histograms as features. We develop and deploy a crowdsourcing interface to obtain labeled training data. We deploy the predictive model to construct a fine-scale heat map of poverty and integrate this discovered knowledge into the processes of GiveDirectly's operations. Aggregating estimates at the village level, we produce a ranked list from which top villages are included in GiveDirectly's planned distribution of cash transfers. The automated approach increases village selection efficiency significantly.
60	Can cell phone traces measure social development?	2013	Frias-Martinez, V.; Soto, V.; Virseda, J. & Frias-Martinez, E.
61	Forecasting socioeconomic trends with cell phone records	2013	Frias-Martinez, V.; Soguero-Ruiz, C.; Frias-Martinez, E. & Josephidou, M.	https://dl.acm.org/citation.cfm?id=2442902	National Statistical Institutes typically hire large numbers of enumerators to carry out periodic surveys regarding the socioeconomic status of a society. Such an approach suffers from two drawbacks: (i) the survey process is expensive, especially for emerging countries that struggle with their budgets and (ii) the socioeconomic indicators are computed ex post, i.e., after socioeconomic changes have already happened. We propose the use of human behavioral patterns computed from calling records to predict future values of socioeconomic indicators. Our objective is to help institutions be able to forecast socioeconomic changes before they happen while reducing the number of surveys they need to compute. For that purpose, we explore a battery of different predictive approaches for time series and show that multivariate time-series models yield R-square values of up to 0.65 for certain socioeconomic indicators.
62	Mobile communications reveal the regional economy in Côte d’Ivoire	2013	Mao, H.; Shuai, X.; Ahn, Y.-Y. & Bollen, J.
63	Social Capital for Economic Development: Application of Time Series Cluster Analysis on Personal Network Structures	2013	Doran, D.; Klabjan, D.; Lim, B.; Mendiratta, V. & Rodriguez, M.
64	Ubiquitous sensing for mapping poverty in developing countries	2013	Smith-Clarke, C.; Mashhadi, A. & Capra, L.
65	Using Nighttime Satellite Imagery as a Proxy Measure of Human Well-Being	2013	Tilottama Ghosh, Sharolyn J. Anderson, Christopher D. Elvidge, and Paul C. Sutton	https://www.mdpi.com/2071-1050/5/12/4988	Improving human well-being is increasingly recognized as essential for movement toward a sustainable and desirable future. Estimates of different aspects of human well-being, such as Gross Domestic Product, or percentage of population with access to electric power, or measuring the distribution of income in society are often fraught with problems. There are few standardized methods of data collection; in addition, the required data is not obtained in a reliable manner and on a repetitive basis in many parts of the world. Consequently, inter-comparability of the data that does exist becomes problematic. Data derived from nighttime satellite imagery has helped develop various globally consistent proxy measures of human well-being at the gridded, sub-national, and national level. We review several ways in which nighttime satellite imagery has been used to measure the human well-being within nations.
66	Computing cost-effective census maps from cell phone traces	2012	Frias-Martinez, V.; Soto, V.; Virseda, J. & Frias-Martinez, E.
67	Prediction of socioeconomic levels using cell phone records	2011	Soto, V.; Frias-Martinez, V.; Virseda, J. & Frias-Martinez, E.	https://link.springer.com/chapter/10.1007/978-3-642-22362-4_35	The socioeconomic status of a population or an individual provides an understanding of its access to housing, education, health or basic services like water and electricity. In itself, it is also an indirect indicator of the purchasing power and as such a key element when personalizing the interaction with a customer, especially for marketing campaigns or offers of new products. In this paper we study if the information derived from the aggregated use of cell phone records can be used to identify the socioeconomic levels of a population. We present predictive models constructed with SVMs and Random Forests that use the aggregated behavioral variables of the communication antennas to predict socioeconomic levels. Our results show correct prediction rates of over 80% for an urban population of around 500,000 citizens.
68	Using luminosity data as a proxy for economic statistics	2011	Chen, X. & Nordhaus, W. D.	https://www.pnas.org/content/108/21/8589	A pervasive issue in social and environmental research has been how to improve the quality of socioeconomic data in developing countries. Given the shortcomings of standard sources, the present study examines luminosity (measures of nighttime lights visible from space) as a proxy for standard measures of output (gross domestic product). We compare output and luminosity at the country level and at the 1° latitude × 1° longitude grid-cell level for the period 1992–2008. We find that luminosity has informational value for countries with low-quality statistical systems, particularly for those countries with no recent population or economic censuses.
69	A method for estimating the relationship between phone use and wealth	2010	Blumenstock, J.; Shen, Y. & Eagle, N.
70	Acute multidimensional poverty: A new index for developing countries	2010	Alkire, S. & Santos, M. E.	https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1815243	This paper presents a new Multidimensional Poverty Index (MPI) for 104 developing countries. It is the first time multidimensional poverty is estimated using micro datasets (household surveys) for such a large number of countries which cover about 78 percent of the world's population. The MPI has the mathematical structure of one of the Alkire and Foster multidimensional poverty measures and it is composed of ten indicators corresponding to the same three dimensions as the Human Development Index: Education, Health and Standard of Living. The MPI captures a set of direct deprivations that batter a person at the same time. This tool could be used to target the poorest, track the Millennium Development Goals, and design policies that directly address the interlocking deprivations poor people experience. This paper presents the methodology and components in the MPI, describes main results, and shares basic robustness tests.
71	Mobile divides: gender, socioeconomic status, and mobile phone use in Rwanda	2010	Joshua Blumenstock, Nathan Eagle	https://dl.acm.org/citation.cfm?id=2369225	We combine data from a field survey with transaction log data from a mobile phone operator to provide new insight into daily patterns of mobile phone use in Rwanda. The analysis is divided into three parts. First, we present a statistical comparison of the general Rwandan population to the population of mobile phone owners in Rwanda. We find that phone owners are considerably wealthier, better educated, and more predominantly male than the general population. Second, we analyze patterns of phone use and access, based on self-reported survey data. We note statistically significant differences by gender; for instance, women are more likely to use shared phones than men. Third, we perform a quantitative analysis of calling patterns and social network structure using mobile operator billing logs. By these measures, the differences between men and women are more modest, but we observe vast differences in utilization between the relatively rich and the relatively poor. Taken together, the evidence in this paper suggests that phones are disproportionately owned and used by the privileged strata of Rwandan society.