1
NoTopicCountry/AreaInstitute/DeptData SourcesContact InformationProject DescriptionObjectiveStatistics AreaPartnershipsSDG IndicatorsData AccessData CoverageData QualityMethodologyTechnologiesOthers
2
DivisionNameEmailData ProvidersOther partnersPartnerships CommentsGoalsCommentsRelevanceData Access RightsData Access CommentsIntermediaryIntermediary CommentsCoverage PeriodData CoverageFrequency CommentsCoverage Geo PopCoverage Geo CommentsCost ImplicationCost Implication CommentsValidation with Training DataValidation CommentsQuality FrameworkQuality Framework CommentsData Quality ConcernsData Quality Concern CommentsQuality Aspects EvaluatedMethod UsedDeveloped New MethodEstimation Methodological Framework CommentsMethods CommentsTechnologies usedTechnologies Comments
3
1Feasibility study on geo-localization: using geographical data from web services for geocoding static objectsBelgiumBelgium - Statistics BelgiumSatellite imagery or aerial imagery dataCoordination administrative & big dataMarc Debusscheremarc.debusschere@economie.fgov.beStudy the feasibility of using geographical data from web services, either open (e.g. Nominatim, OpenStreetMap) or proprietary (e.g. Google Maps) for the geocoding of static objects not covered by other sources (such as Registry Office or Population Register). The objective is improved geographical localization of statistical units (for linking) and maximally detailed geographical breakdowns in a wide range of statistical domains.Exploration,
Pilot intended to go to production to supplement existing data
Transportation, statistics
Geo-spatial statistics
-Still to be established (if no open datasets like OpenStreetMap can be used).----To be established; data access is not an issue in the preferred solution of using open datasets.NoNot applicable.All available dataWhole country / high % of marketTo be determined; preference for free and non-proprietary open data.Not yet relevant, but benchmarking against 'ground truth' data is certainly desirable. Part of the feasibility study will consist of trying to find existing validation data sets against which to test.Not yet relevant.Not yet relevant.1. Privacy and Security
2. Completeness, Usability, Time Factors
3. Accessibility, Relevance
Not yet relevant.Not yet relevant.GISIssue still largely open at this stage.
4
2Big Data for Freight Transport and Logistics Policy MakingIndonesia, Brazil, MoroccoWorld Bank Group-T&CCordula Rastogi (Sr Transport Economist)crastogi@worldbank.orgThis project introduces substantive technical developments upon current proof-of-concept applications to human mobility. The CDR trace of road freight (urban delivery truck, long distance, port drayage) is distinct from that of a pedestrian or a taxi driver, and can be automatically classified (combining algorithmic and field knowledge on transportation). Hence such critical project and reform information as freight O-D matrices can be estimated from big data instead of from costly field surveys that cannot be replicated on a regular basis. The team will pilot the concept in countries where we have a strong engagement in the policy areas (transportation infrastructure, logistics, urban transport), and where we have arrangements for, or a likelihood of obtaining, the data: Indonesia (Jakarta), Brazil (Rio) and potentially Morocco.--09 - Industry, Innovation & Infrastructure
11 - Sustainable Cities & Communities
9.1, 11.2, 11aYes-----------
5
3Using mobile phone data for national, sub-national and geo-coded average prices
Nigeria, Brazil, Indonesia
World Bank GroupMobile phone dataDevelopment Data GroupNada Hamadehnhamadeh@worldbank.org---11 - Sustainable Cities & Communities 11.aYes-----------
6
4Using Big Data to Predict Student Achievement in Low-Income School SettingsVietnam, IndonesiaWorld Bank Group-EDUMichael F. CrawfordMcrawford1@worldbank.orgAccurately predicting student performance early allows mitigating interventions to be effectively designed and applied. Prediction of student achievement is therefore highly valuable to policymakers. This proposal seeks to test whether existing Learning Outcome Predicting Artificial Neural Networks (LOPANNs) can perform with the same degrees of accuracy in lower-income settings as in higher-income settings. Using large data sets from Vietnam and Indonesia, it would determine whether LOPANNs could reproduce the accuracy they have achieved in the US, Belgium, and Argentina.--04 - Quality Education
08 - Decent Work & Economic Growth
4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4a, 4b, 4c, 8.6Yes-----------
7
5Understanding Public Perceptions of Immunization Using Social MediaIndonesiaUN - Global Pulsehttp://www.unglobalpulse.org/immunisation-parent-perceptions-This project extracted and analyzed tweets related to vaccines and immunization in Indonesia. Findings included the identification of perception trends including concerns around religious issues, disease outbreaks, side effects and the launch of a new vaccine. This project was done in collaboration with the Ministry of Development Planning and the Ministry of Health in Indonesia, UNICEF, and WHO. (Project webpage) [ PDF ]--03 - Good Health & Well-Being
09 - Industry, Innovation & Infrastructure
17 - Partnerships for the Goals
3.8, 9c, 17.8Yes-----------
8
6Mining Citizen Feedback Data for Enhanced Local Government Decision-makingIndonesiaUN - Global Pulsehttp://www.unglobalpulse.org/projects/citizen-feedback-data-local-government-decision-making-This project deployed data analysis and visualization tools to structure and combine data from the Indonesian national citizen reporting complaint system and a local SMS based feedback system (representing active citizen complaints), together with public Twitter posts (representing passive opinions). This project was done in collaboration with the NTB Provincial Government in Indonesia.--08 - Decent Work & Economic Growth
09 - Industry, Innovation & Infrastructure
10 - Reduced Inequalities
13 - Climate Action
16 - Peace, Justice & Strong Institutions
17 - Partnerships for the Goals
8.10, 9c, 10.5, 10.6, 13.3, 16.6, 16.8, 16a, 17.8Yes-----------
9
7Using scanner data for compilation of CPIDenmarkDenmark - Statistics DenmarkScanner dataNiels Plougnpl@dst.dkThe purpose of this project is to test whether scanner data from the two major supermarket chains in Denmark can be used in the production of the CPI.Pilot intended to go to production to replace existing dataPrice statistics---Only for this project-No2014/2015All available dataWhole country / high % of marketFreeYesWe are using usual CPI data collected by 'price inspectors' to validate the data.Quality of source/inputYesAccuracy, including selectivity
Coherence, including linkability to other sources
Validity
Traditional statistical methodsOther
10
8Possible improvements of Household Budget Survey using scanner and credit card dataSwedenSweden - Statistics SwedenCredit card data Scanner dataResearch & developmentIngegerd Janssoningegerd.jansson@scb.seStarting up work to explore what sources and methods can be used to improve HBSExplorationDemographic and social statistics---Broader access rights-No2012 (to start with)-Part of country / high % of market---YesCompleteness, Usability, Time Factors
Accuracy, including selectivity
Accessibility, Relevance
Traditional statistical methods
Other methods
NoNot yetDon't know yet
11
9Tracking Light from the Sky Version 2.0 or Monitoring Rural Electrification from SpaceIndiaWorld Bank Group-EEXKwawu Mensan Gaba
KGaba@worldbank.org
We propose a novel data-intensive strategy to improve the monitoring of electricity service provision to rural areas in India and across the developing world. We collect and analyze a unique historical archive of nighttime satellite imagery to track the supply and stability of electricity service at the local level spanning nearly 8,000 nights since 1993. Drawing upon this massive dataset and using computationally-intensive methods, our project is developing the ability to identify regional instability in power supply, increases in the frequency or incidence of power cuts, and other signatures that indicate problems with electrical service delivery, particularly in rural and remote regions where traditional monitoring is difficult.--07 - Affordable & Clean Energy
17 - Partnerships for the Goals
7.1, 7b, 17.18, 17.19
Yes-----------
12
10Using data from cell phone networks to measure rainfall in data scarce contextColombia, HaitiWorld Bank Group-Urban, Rural, and Social DevelopmentGaetano Vivo
gvivo@worldbank.org
Recent research in Europe and Africa has shown that the attenuation of the electromagnetic signal between cell phone towers that is caused by rainfall can be used to measure precipitation and is especially useful in areas where few or no radars or rain gauges are available. The objective of this work is to modify the algorithm elaborated in other countries in order to make it applicable to the Caribbean context and integrate the obtained information with the measured data from the weather stations. This data shall then be used in the lending operations for designing risk reduction measures such as early warning systems, flood mitigation measures, design of bridges, culverts and drainage systems etc.--02 - Zero Hunger
11 - Sustainable Cities & Communities
13 - Climate Action
2.4, 11b, 13.2, 13.3, 13bYes-----------
13
11Big Data "Just in time analytics" in disaster risk management activitiesColombia, ChileWorld Bank Group-Urban, Rural, and Social DevelopmentNiels B. Holm-Nielsennholmnielsen@worldbank.orgJust-in-time analysis in disaster risk management (DRM) uses the technological platform of big data and analytics to address the constantly changing data and "reality". This project aims to create a pilot platform based on recommendations of an earlier study which provided a needs assessment and high-level system architecture design on using big data for just-in-time analysis. The proposed pilot countries are Colombia and Chile (to be confirmed).--01 - No Poverty
02 - Zero Hunger
11 - Sustainable Cities & Communities
13 - Climate Action
1.5, 2.4, 11.5, 11b, 13.1
Yes-----------
14
12Big Data solutions for enhancing tax compliance
Chile, Colombia, Guatemala
World Bank Group-MFMAnne Brockmeyer and Marco Hernandezabrockmeyer@worldbank.org, marcohernandez@worldbank.orgThis project proposes a Big Data solution to increase the income tax base and boost public revenue. The idea is to leverage consumption data obtained from credit card transactions, ATM withdrawals and online purchases to construct an income proxy, and compare it to reported income from administrative tax records. Consumption data will be obtained through collaboration with the largest credit-card supplier and banking regulator for at least one country. The project will design an algorithm for mapping consumption measures into income proxies, using non-parametric estimation and statistical machine learning. We propose to randomly notify taxpayers of discrepancies between proxied and reported taxable income to estimate the causal effect of the program on tax payments.--17 - Partnerships for the Goals
08 - Decent Work & Economic Growth
09 - Industry, Innovation & Infrastructure
10 - Reduced Inequalities
17.1, 8.10, 9a, 10.5
Yes-----------
15
13Feasibility study on creating indicators using web scrapingEcuadorEcuador - National Institute of Statistics and CensusesWeb scraping dataDirectorate of Administrative RecordsCesar Vicunacesar_vicuna@inec.gob.ecBuild on prices published on websites, in order to make various technological and methodological exercises that can generate different types of analysis and development of indicators or indices, such as the consumer price index, based on information posted on the web.ExplorationEconomic and financial statistics---Only for this project-No2016-Part of country / high % of marketFreeYes
Official consumer price index by surveying.
Quality of output statisticsYesAccuracy, including selectivity
Validity
Accessibility, Relevance
Traditional statistical methodsYesStarting to build the methodology to generate a consumer price index onlineHadoop Clusters
Other
JAVA
16
14Use of satellite imagery to obtain geographical informationMexicoMexico - National Institute of Statistics and GeographySatellite imagery or aerial imagery dataIntegration, Analysis, and ResearchEnrique Ordazenrique.ordaz@inegi.org.mxToday in Mexico we use different types of satellite imagery to produce several kinds of data: topographical, geological, land use and geostatistical cartographyPilot intended to go to production to improve timeliness
Pilot intended to go to production to replace existing data
Agricultural statistics---Broader access rightsLicense is for all uses but only for INEGINoAll data is acquired directly from Mexican distributor2013 -2018Only a portion of all dataData for a specific area and timePart of country / high % of marketCoverage of imagery is representative for crop type.CommercialYesClassification results are verified in the field.Quality of processing/throughputEvaluation is only qualitative.YesInvestigation ongoingPrivacy and Security
Completeness, Usability, Time Factors
Accuracy, including selectivity
Coherence, including linkability to other sources
Validity
Accessibility, Relevance
Data visualization methods
Traditional statistical methods
NoData visualization tools
17
15Using scanner data for price and economic statisticsRomania
Romania - National Statistics Institute
Scanner dataBogdan Oancea
bogdan.oancea@insse.ro
We intend to use scanner data for improvement of price statistics and other economic statistics indicators. The project is in the conception phase and the results will be used for developing new statistical techniques, monitoring new products, and making comparisons between different regions.Exploration
Scientific / research
Economic and financial statistics
Price statistics
--------------
18
16Monitoring SDG 16 on peace and justice through Big DataTunisiaTunisia - National Institute of StatisticsWeb scraping dataDissemination, IT and CoordinationKamel ABDELLAOUIabdellaoui.kamel@ins.tnUnlock the potential of Big Data to strengthen the monitoring of at least one SDG indicator. We will start using social media and web content.Exploration
Pilot intended to go to production to improve timeliness
Pilot intended to go to production to supplement existing data
Governance statisticsSocial media provider
Intermediary big data provider
Government institute---Broader access rights-NoSo far, our team has been in charge of the project. We plan to involve academia to create a common working group.last 5 yearsAll available dataWhole country / high % of marketCommercialWe are trying to access data through a big data provider that offers some tools.YesWe already did a survey related to this topic.---Machine learning (Random forest, etc.)
Data visualization methods
NoWe have not reached the data validation phase.No detail providedWe didn't have information about technologies used in the tools
19
17Log Analysis: becoming more familiar with Big Data technologiesTurkey
Turkey - Statistical Institute
OtherInformation Technologies DepartmentIlker Guvenilker.guven@tuik.gov.trWe are currently analyzing some logs being produced by different resources by using big data technologies such as Hadoop. In doing this, our goal is to become more familiar with big data technologies to get ready for possible prospective big data scenarios.Exploration
Scientific / research
Other
------No---FreeNo--No-NoHadoop Clusters
Relational database
Other
20
18How Good Are CDR-Derived Measures of Income and Inequality, and Can Governments Systematically Use Them?ColombiaWorld Bank Group-DECDGTariq Afzal Khokhar (Data Scientist)tkhokhar@worldbank.orgOfficial measures of poverty and inequality are currently produced with a multi-year time lag and have varying levels of coverage across countries. This project aims to evaluate techniques that use Call Detail Records (CDRs) to offer more timely and complete estimates of these variables. With the support of this innovation grant, the project will then explore if and how these new techniques can be incorporated into the routine work of agencies in a client country government, in this case, that of Colombia. More timely and disaggregated socio-economic measures are vital to responsive policy design and implementation.--10 - Reduced Inequalities
17 - Partnerships for the Goals
10.1, 17.18Yes-----------
21
19Using Big Data Analytics to Discover Patterns of Medical Insurance Utilization for Medical Cost Monitoring in ChinaChinaWorld Bank Group-SPLChanging Sun (Sr Economist)
csun1@worldbank.org
We propose to implement a big data analytics prototype platform that specializes in analyzing medical insurance data to monitor the insurance cost and potentially detect insurance fraud. The prototype platform includes a) a medical insurance metadata repository resulting from the data integration and categorization, b) tailored predictive modeling algorithm software, c) big data visualization tools. Through the proof-of-concept platform, we will demonstrate the methodologies to discover utilization patterns from insurance data that could lead to cost monitoring and control.--03 - Good Health & Well-Being
08 - Decent Work & Economic Growth
3.8, 8.10Yes-----------
22
20Predicting vulnerability to flooding and enhancing resilience using big dataBangladeshWorld Bank Group-Transport and ICTIsabelle Huynh (Sr Operations Officer)ihuynh@worldbank.orgThe proposed activity draws from modeling data readily available on the Google cloud platform, including elevation, satellite imagery, and census data to dynamically refine a surface of risk within a flood prediction zone produced by weather services. The research aims at better identifying the population most at risk in case of flood, based on geographical and socio-economic data, in order to better define emergency/DRM planning.--01 - No Poverty
02 - Zero Hunger
11 - Sustainable Cities & Communities
13 - Climate Action
1.5, 2.4, 11.5, 11b, 13.1
Yes-----------
23
21Investment Lending Operation - Jamaica Energy Security and Efficiency Enhancement ProjectJamaicaWorld Bank Group-EEXMark Lambridesmlambrides@worldbank.orgIncrease energy efficiency and security through the implementation of the Borrower's National Energy Policy. In relation to the component working on the reduction of non-fuel costs I'm leading a Bank team that is collaborating with the electricity utility in Jamaica. Together we're capturing smart meter data to design a dedicated algorithm analyzing electricity consumption patterns among large commercial and industry customers, as well as to train the model with machine learning properties in order to improve the detection rates of non-technical losses that contribute to the prohibitively high non-fuel costs of the electricity system in Jamaica.--
07 - Affordable & Clean Energy
7.1, 7.2, 7.3, 7bYes-----------
24
22Big Data for Financial Inclusion and Poverty MappingCongo - Democratic Republic of, Cote d'Ivoire, Ghana, Uganda, ZambiaWorld Bank Group-F&MSven HartenSHarten@ifc.orgLeveraging access to IFC clients, the project will collect data such as call detail records and (mobile) financial transaction data from Mobile Network Operators and Financial Institutions to understand customer profiles. Using sophisticated econometric techniques and cloud computing (e.g. Amazon Web Services), the project will crunch around 100,000 GB of data. The expected results will be used to determine which variables are significantly correlated to usage of (mobile) financial services to determine profiles of likely users and lists of concrete individuals who have a high likelihood score. This intelligence can then be used for product development and marketing by WBG partners to increase the supply of financial services to the previously unbanked. Through the integration of experimental evaluation techniques with several rounds of household surveys and big data collection, the team will measure the impact of using financial services on household expenditure as well as produce innovative poverty maps.--10 - Reduced Inequalities
01 - No Poverty
02 - Zero Hunger
05 - Gender Equality
08 - Decent Work & Economic Growth
10.2, 1.4, 2.3, 5a, 8.3, 8.1, 10c------------
25
23OpenRoads Philippines: Improved Real Time Decision making of Infrastructure Investments for the Philippines by linking geospatial road network data with rich geo-tagged social data collected through mobile phonesPhilippinesWorld Bank Group-GOVKai Kaiserkkaiser@worldbank.orgRoads lie very much at the heart of an effort to double public infrastructure spending to 5% of GDP by 2016. The national government intends to fully pave/cement the national network by 2016, with the objective of poverty reduction. The major challenges to this effort are that the road assessments of the national network need to be validated with independent data through third party monitoring, and that the data capture concerning sub-national road networks and investments needs to be rapidly improved through cost-effective means. The Governance & TICT global practice teams propose to link currently available sources of authoritative data with readily available crowd-sourced geo-coded video & image data from mobile devices to validate the state of the Philippines road network. The proposed work will show how the ballooning capture of geographically referenced image/video "big data" overlaid with Open Data through data.gov.ph can be used to close the loops for the more transparent and accountable delivery of public road infrastructure.--01 - No Poverty
09 - Industry, Innovation & Infrastructure
11 - Sustainable Cities & Communities
17 - Partnerships for the Goals
1.1, 1.2, 1a, 1b, 9.1, 11.2, 11a, 17.18Yes-----------
26
24Forecasting Poverty and Shared Prosperity Using Cell Phone DataGuatemalaWorld Bank Group-MFMMarco Antonio Hernandez Oremarcohernandez@worldbank.orgCellphones generate large datasets of "digital footprints" from a population, which can be analyzed using data mining and computer-learning techniques to reveal behavioral patterns that can then be used to estimate and forecast poverty and shared prosperity. This proposal presents an affordable, practical, and scalable solution for mapping poverty based on the aggregate behavioral patterns of cellphone users, which will be piloted by the Government of Guatemala. The World Bank has initiated an innovative partnership with Telefonica Research and the iSchool of the University of Maryland to develop computer algorithms based on anonymized cellphone call records gathered by Movistar Guatemala, the country's largest cellphone provider. This initiative aims to produce detailed and reliable information at a lower cost.--01 - No Poverty1.1, 1.2, 1a, 1bYes-----------
27
25Using High Resolution Satellite Imagery and Detection Algorithms to Better Track Poverty in PakistanPakistanWorld Bank Group-POVDavid Newhousednewhouse@worldbank.orgWe propose to use high resolution satellite imagery (< 1m) and detection algorithms to improve traditional poverty mapping techniques to better measure and monitor poverty in Pakistan. Specifically, we propose to acquire multi-spectral satellite data and develop an algorithm to identify the type of roofs used in local housing (Abelson et al., 2014). The main goal is to learn a) the extent to which high-resolution data improves the accuracy of poverty predictions, and b) the extent to which changes in poverty over time are captured by these satellite derived poverty predictors.--01 - No Poverty
17 - Partnerships for the Goals
1.1, 1.2, 1a, 1b, 17.18
Yes-----------
28
26Real Time Assessments of How Markets Are Working for the PoorNigeriaWorld Bank Group-T&CAlvaro S. Gonzalezagonzalez4@worldbank.orgGovernments can act to make markets work better for the poor. The first step is to identify the size and nature of the problem. Our proposal focuses on using largely unexploited, micro-level price data, to provide policymakers with near real time capacity to assess how well markets are working for the poor. Our proposal is to combine spatially detailed and disaggregated price data with publicly available satellite lights data to geographically pinpoint markets serving poorer regions. This combined data can be used to track trends and conditions in markets and can alert policymakers of changes that may negatively affect the poor.--17 - Partnerships for the Goals17.16, 17.18, 17.19Yes-----------
29
27Evaluation of Crime and Infrastructure using Bayesian Maximum Entropy and Risk Terrain Modeling approaches in Bogotá, Colombia, 2008 - 2013ColombiaWorld Bank Group-T&ICTCamila Rodriguezcarodriguez@worldbank.orgUsing rich and robust data we intend to quantify the association of crime with specific built environment characteristics measured through street audits as well as using existing infrastructure information. Using Bayesian Maximum Entropy (BME) and Risk Terrain Modeling (RTM) we will quantify with incidence rate ratios, what specific features of the environment are associated with 6 different crimes (two against persons and four against property), as well as describe the temporal and spatial features of areas with high crime and predict which areas of the city are more likely to experience crime in the future.--16 - Peace, Justice & Strong Institutions16.4, 16a, 16.1, 16.2Yes-----------
30
28Big Data for User-focused Identification of Road Infrastructure Condition and Safety ConcernsBelarusWorld Bank Group-T&ICTWei Winnie Wangwinniewang@worldbank.orgThis proposal aims to pilot a technical innovation that will allow the determination of user-focused road condition indicators and road safety concerns by extracting information from big data collected through crowdsourcing among drivers and other road users. This collaborative approach provides wider coverage of road networks at frequent intervals and collects uniform data to support strategic and network level asset management decision making.--03 - Good Health & Well-Being
11 - Sustainable Cities & Communities
3.6, 11.2Yes-----------
31
29The Sensors are Here! A High-Resolution Application on Understanding Individual Travel Patterns in African CitiesUnited Republic of TanzaniaWorld Bank Group-Urban, Rural, and Social DevelopmentNancy Lozano Gracianlozano@worldbank.orgThe objective of this project is to collect high-resolution and high-frequency data on intra-city movements of a randomly selected group of individuals that will be interviewed as part of a planned household survey in Dar es Salaam, Tanzania. The project will combine detailed socio-economic information solicited on individuals and households as part of the 3,000-household Measuring Living Standards within Dar es Salaam Survey (MLDS) with (i) follow-up phone interviews and (ii) sensor-embedded smartphone-based high-frequency data collection on time- and GPS-stamped intra-city movements of a randomly selected sub-sample of MLDS respondents.--10 - Reduced Inequalities
11 - Sustainable Cities & Communities
17 - Partnerships for the Goals
10.2, 10.4, 11a, 17.18, 17.19Yes-----------
32
30Real Time Forecasting of Skills Demand and Supply: Analytics of Big Data from Babajob in IndiaIndiaWorld Bank Group-EDUShinsaku Nomurasnomura@worldbank.orgIn India, there is an online job matching platform called Babajob, which connects job seekers and employers, including in the informal sector. Babajob holds a large volume of user data and job posting data, creating a big data source on labor demand and supply. By using this big data on skills demand and supply in India, the proposed project will provide real-time labor market data in a visual report format and aims to foster demand-driven skills development by better informing training providers, policy makers, employers, and job seekers.--08 - Decent Work & Economic Growth
12 - Responsible Consumption & Production
17 - Partnerships for the Goals
8.3, 8.9, 8b, 12b, 17.18, 17.19Yes-----------
33
31Big Data and the Cloud – Piloting "eHealth" for Community Reporting of Community Performance-Based Financing in GhanaGhanaWorld Bank Group-HNPFrancisca Ayodeji Akalafakala@worldbank.orgAs part of the Maternal Child Health Nutrition Improvement Project (P145792; MCHNP), the Government of Ghana will pilot a community performance-based financing (CPBF) project in 4 regions where MCHN outcomes are particularly poor. An accompanying impact evaluation (IE) will measure the effectiveness and cost-effectiveness of the project (P151684). CPBF is a novel approach whereby community health teams are incentivized to improve care seeking behavior and health outcomes of communities. Performance payments are based on monthly reported results on key MCHN indicators. Due to the inefficiency of paper-based reporting, android-based software survey tools for smartphones are proposed to report on performance directly from the community level. This Big Data platform would circumvent the time delay, capacity constraints, and data quality challenges associated with paper-based reporting. Based on this pilot, the innovation could be scaled up nationwide.--02 - Zero Hunger
03 - Good Health & Well-Being
05 - Gender Equality
2.2, 3.1, 3.2, 3.7, 3.8, 5.6Yes-----------
34
32Leveraging the private sector (e-hailing service providers) through innovative partnerships to access and utilize taxi GPS dataMexico, BrazilWorld Bank Group-T&ICTShomik Raj Mehndirattasmehndiratta@worldbank.orgOur proposal intends to use data from Easy Taxi, one of the largest e-hailing services (27 countries and 120 cities), for Mexico City, Sao Paulo and possibly Rio de Janeiro to feed an array of tools with data. We have already started negotiations with the founders, and a promising partnership is under way. We believe leveraging such difficult-to-acquire data from private service providers can be a cost-effective solution for cities in developing countries, benefiting from the possibility of easily scaling up to other cities.--11 - Sustainable Cities & Communities
17 - Partnerships for the Goals
11.3, 11a, 17.17Yes-----------
35
33Big Data for Transport Project, Morocco (BDT)MoroccoWorld Bank Group-T&ICTVickram Cuttareevcuttaree@worldbank.orgThe proposed project will address the shortfall of data on (i) transport congestion, (ii) commuting inefficiencies, and (iii) access to public transportation in disadvantaged neighborhoods by applying Big Data analysis to a rich dataset of cellphone and mobility data. In a novel approach, the analysis will be directly linked to survey data from commuters and from the unemployed. The transport constraints among non-commuters will be analyzed to quantify the potential benefit from transport investments.--11 - Sustainable Cities & Communities
08 - Decent Work & Economic Growth
11.2, 8.6, 8.8Yes-----------
36
34Enabling up-to-date and accurate authoritative country mapping with crowdsourced geospatial dataSri LankaWorld Bank Group-Urban, Rural, and Social DevelopmentMarc Fornimforni@worldbank.orgAs urban populations grow, managing urban growth in a way that fosters cities' resilience to natural hazards and the impacts of climate change requires detailed, up-to-date geographic data of the built environment. Crowdsourcing and big data such as OpenStreetMap offer an innovative, cost-effective solution. This grant will serve to equip the government of Sri Lanka with the required tools and processes to monitor new data collection in real-time and update national maps, systems that can be replicated in mapping agencies around the world.--17 - Partnerships for the Goals
11 - Sustainable Cities & Communities
13 - Climate Action
17.18, 17.19, 11.3, 11b, 13.2, 13bYes-----------
37
35Understanding Immunization Awareness and Sentiment through Analysis of Social Media and News ContentKenya, Nigeria, PakistanUN - Global Pulsehttp://www.unglobalpulse.org/sites/default/files/UNGP_ProjectSeries_Immunisation_Awareness_2015.pdfThis multicountry study analyzed perceptions about immunization from multiple social media channels and news sources in India, Kenya, Nigeria and Pakistan. The project shows how methods including sentiment analysis, topic classification and network analysis can be used to support public health workers and communication campaigns.--03 - Good Health & Well-Being
09 - Industry, Innovation & Infrastructure
17 - Partnerships for the Goals
3.8, 3b, 3d, 9c, 17.8Yes-----------
38
36Using Mobile Phone Activity For Disaster Management During FloodsMexicoUN - Global Pulsehttp://www.unglobalpulse.org/tabasco-floods-CDRsThis project combined the analysis of mobile phone activity data with remote sensing data during severe flooding in the Mexican state of Tabasco as a method to inform emergency management response. This project was done in collaboration with WFP, the Government of Mexico, the Universidad Politecnica de Madrid and Telefonica Research.--02 - Zero Hunger
15 - Life on Land
17 - Partnerships for the Goals
2.4, 15.3, 17.8Yes-----------
39
37Big Data tool to conduct automated roof counting to monitor povertyUgandaUN - Global Pulse--In the Northern Uganda region, as in many African countries where poverty levels are high and the majority of the population is rural, a proxy indicator of poverty is the type of roof on a household's dwelling. As the household economy improves, families often upgrade their dwelling places by changing the type of roof from the traditional grass thatch to iron sheets. A tool is under construction that uses image processing software to count roofs and identify the material they are made from. A user-friendly tool (online dashboard) that provides proxies for poverty monitoring based on household roof counts will be built. The indicators will be built on baseline data and will be automatically refreshed every week with data aggregated at the district/county level. The indicators will be calibrated with data on poverty levels from the Uganda Bureau of Statistics.--01 - No Poverty
17 - Partnerships for the Goals
1.2, 17.8Yes-----------
40
38Dynamic Census ProjectBangladesh, Sri Lanka, MozambiqueUniversity of TokyoMobile phone data Satellite imageryCenter for Spatial Information ScienceAyumi Araiarai@csis.u-tokyo.ac.jpDevelop a method to create a human mobility dataset by analyzing mobile phone data with various secondary data such as land use and transportation networks. Field surveys were conducted to collect training and validation data. Results obtained through the method are called the Dynamic Census: human trajectory data and gridded-map data representing the spatiotemporal distribution of both mobile phone users and non-users. The data are labeled with predicted demographic attributes. We believe it can be a good supplement to conventional population and housing census data, providing information on population movement at high granularity and high frequency. Considering that the structure of mobile phone data does not vary much by region or country, the developed method should be scalable to other parts of the world. We have just finished a pilot project in Bangladesh and are preparing for Sri Lanka and Mozambique.Scientific / research
Pilot intended to go to production to supplement existing data
For the production of statistics
Demographic and social statistics
Geo-spatial statistics
Mobility statistics
Tourism statistics
Transportation statistics
Mobile phone operator
Satellite or aerial imagery provider
Research or academic institute
Government institute
---Only for this project-Yes-Only a portion of all dataDepends on the country-YesWe are comparing with existing statistics.--Yes-Data cleaningInstitutional/Business Environment
Coherence, including linkability to other sources-
Machine learning (Random forest, etc.)
Data visualization methods
Traditional statistical methods
Hadoop Clusters
Relational database
GIS
Cloud services
41
39Using satellite imagery and Geo-spatial data for the census of agriculture and the census of building and housingMongoliaNational Statistical Office of MongoliaSatellite imagery or aerial imagery dataPopulation and Housing Census BureauLkhagvadulam Chimeddambalkhagvadulam@nso.mnNSO Mongolia plans to conduct its first agricultural by-census in 2017. This time, we plan to use satellite imagery to identify crop types and estimate production. In addition, we plan to pilot the Census of Building and Housing by integrating different geospatial databases with the administrative registration and population databases.Pilot intended to go to production to supplement existing data
For the production of statistics
Agricultural statistics
Demographic and social statistics
Geo-spatial statistics
Satellite or aerial imagery providerGovernment instituteDifferent government ministries and agencies are involved in the project of Census of Building and Housing11 - Sustainable Cities & Communities
12 - Responsible Consumption & Production
6 - 6.1; 6.2; 7 - 7.1; 9 - 9.1; 11- 11.1; 12- 12.aYesOnly for this project-No2017All available dataPart of country / high % of marketThe project integrating the by-Census of Agriculture and the Pilot Census of Building and Housing will be carried out in the main agricultural regions of the country.FreeThis is an activity organized as part of routine censuses. Therefore, a certain government budget is allocated.NoThe project is in its initial phase. Capacity building is strongly requested; however, the source and provider are uncertain.NoThe data have not been obtained yet.Privacy and Security
Completeness, Usability, Time Factors
Accuracy, including selectivity
Validity
Accessibility, Relevance
Data visualization methods
Traditional statistical methods
YesNew estimation method or methodological framework will be developed.Spreadsheet
GIS
42
40Capacity Building in using Big Data as sources for public statisticsCameroonCameroon - National Institute of StatisticsOtherCoordination and ResearchOkouda Barnabebarnabe_okouda@yahoo.frCapacity building to develop national skills in processing Big Data for official statistics. Share experience and benchmark with other countries.Pilot intended to go to production to improve timeliness
Pilot intended to go to production to supplement existing data
Pilot intended to go to production to replace existing data
Demographic and social statistics
Vital and civil registration statistics
Environmental statistics
Price statistics
--------------
43
41Capacity building for the use of Big Data for statistical purposesCameroonCameroon - National Institute of Statistics-Coordination and ResearchOkouda Barnabebarnabe_okouda@yahoo.frAs a developing country, Cameroon needs to build capacities and skills in this new domain of statistics. We are exploring these opportunities to learn and adapt methodologies and processing. We expect it to be less costly than classic surveys, but the challenges seem to be huge in terms of coverage and public willingness to collaborate.Pilot intended to go to production to improve timeliness
Pilot intended to go to production to supplement existing data
Pilot intended to go to production to replace existing data
Demographic and social statistics
Vital and civil registration statistics
Environmental statistics
Price statistics
Information society / ICT statistics
--------------
44
42Pilot project using social media to create a happiness indexEcuadorEcuador - National Institute of Statistics and CensusesSocial media dataDirectorate of Administrative RecordsCesar Vicunacesar_vicuna@inec.gob.ecDevelop a happiness index based on data from social networksOtherDemographic and social statisticsOther partnersInformation published on the web---Only for this project-No-Only a portion of all dataWhole country / low % of marketFreeThe only cost comes from hired staff.No--YesWe have not yet analyzed Big Data quality.Institutional/Business Environment
Privacy and Security
Completeness, Usability, Time Factors
Other methodsNoThe project is in its initial stage and at this moment we are only testing technology tools. We will use sentiment analysis algorithms.Hadoop Clusters
Data mining tools
45
43Improving social transparency in road safety and maintenance operations in Albania through ICT interfacesAlbaniaWorld Bank Group-T&ICTFiona Collinfcollin@worldbank.orgTo improve the transparency of information systems used to manage road safety and maintenance activities in Albania, which has the highest rate of road-related fatalities in SE Europe. We seek to create a two-way virtual data platform whereby citizens can submit feedback on the road network and also receive information on road conditions. This will enable decision-makers to assess the efficiency of past interventions and respond quickly to emerging road maintenance and safety issues. The result will be a cost-effective tool for improving transparency in road management operations and for informing and empowering citizens, while improving the condition and safety of the road network.--03 - Good Health & Well-Being
11 - Sustainable Cities & Communities
3.6, 11.2Yes-----------
46
44Assessing use of scanner data for compiling the Consumer Price IndexSouth AfricaSouth Africa - Statistics South AfricaScanner dataPrice StatisticsPatrick Kellypatrickke@statssa.gov.zaAssessing the transactional data of large retail chains to determine their suitability for transformation into data for the Consumer Price Index. They will also be assessed for suitability for generating sales values for statistics on the retail trade sector.Exploration
Scientific / research
Economic and financial statistics
Price statistics
--------------
47
45"We feel fine": Big Data Observations of Citizen Sentiment about State Institutions and Social InclusionBrazilWorld Bank Group-GOVVictoria L. Lemieuxvlemieux@worldbank.orgThis exploratory research project, which supports the strategic priorities of the Governance Practice, aims to gain new insights into the relationship among citizens' sentiment about governance institutions, trust in Government, and civil unrest. The approach will be to conduct sentiment analysis of social media feeds over a period of one year in one country (e.g., Brazil) looking at specific institutions (e.g., transport, police, health) to gain insights into how citizens are feeling about their governance institutions, how that translates into feelings about their Governments in the political sense, and how this corresponds to observed citizen behavior.--10 - Reduced Inequalities
16 - Peace, Justice & Strong Institutions
10.2, 10.6, 16.6, 16.8, 16aYes-----------
48
46Data Visualization & Interactive Mapping to Support Response to Disease OutbreakUgandaUN - Global Pulsehttp://www.unglobalpulse.org/mapping-infectious-diseases-This project developed interactive data visualization tools that were used during a typhoid outbreak in Uganda to analyze dynamic information about case data and risk factors in support of the national task force managing the outbreak. This project was done in collaboration with WHO and the Ugandan Ministry of Health. (Project webpage)--03 - Good Health & Well-Being
09 - Industry, Innovation & Infrastructure
12 - Responsible Consumption & Production
17 - Partnerships for the Goals
3.3, 3b, 3d, 9c, 12.4, 17.8Yes-----------
49
47Using web scraping price data for price index of e-commerceChinaChina - National Bureau of StatisticsWeb scraping dataResearch Institute of Statistical ScienceJiang Shu13911506021@163.comCrawl price data for particular cellphones with a crawler program and establish a daily price index as a reference for the monthly price data.ExplorationPrice statistics---Only for this projectNoFrom March 2014 to September 2014Only a portion of all dataWhole country / high % of marketFreeSelf-developed.No-Quality of source/inputStill being evaluated; not yet applied.NoAll figures are real transaction prices from suppliers.Institutional/Business Environment
Privacy and Security
Completeness, Usability, Time Factors
Accuracy, including selectivity
Coherence, including linkability to other sources
Validity
Accessibility, Relevance
Traditional statistical methodsNoThe data are used with current statistical methods.Column store database
Spreadsheet
Develop a small database for the calculations.
50
48Crop survey by farmland: using satellite and aerial remote sensing to help estimate agricultural statisticsChinaChina - National Bureau of StatisticsSatellite imagery or aerial imagery dataResearch Institute of Statistical ScienceJiang Shu13911506021@163.comBuild up the spatial sampling frame by using the data from land use surveys and agricultural census. Then update the sampling frame by satellite and aerial remote sensing. With the samples selected by spatial sampling method, we estimate the crop planting area and output every season.Pilot intended to go to production to replace existing dataAgricultural statisticsSatellite or aerial imagery providerGovernment institute---Broader access rights-YesSome fixed time intervals every yearOnly a portion of all dataPart of country / low % of marketCommercialYes-Quality of processing/throughputYesPrivacy and Security
Completeness, Usability, Time Factors
Coherence, including linkability to other sources
Accessibility, Relevance
Supervised learning
Decision Trees
Traditional statistical methods
YesRelational database
GIS
51
49The comparison between the data of interbank transactions and the retail sales: credit card data for use in verifying retail salesChinaChina - National Bureau of StatisticsCredit card dataResearch Institute of Statistical ScienceJiang Shu13911506021@163.comWe get the year-on-year data and the chain data of the credit card transaction amounts in different industries from the headquarters of UnionPay on a monthly basis. Then we use the data to verify the growth trends of the retail sales.Scientific / researchEconomic and financial statistics---Broader access rightsWe hope to explore more Big Data applications for different levels and different purposes by further cooperation.NoSince January 2014Only a portion of all dataWhole country / high % of marketFreeNo-Quality of output statisticsNoPrivacy and Security
Completeness, Usability, Time Factors
Accuracy, including selectivity
Coherence, including linkability to other sources
Validity
Traditional statistical methodsNoSpreadsheet
52
50The application of Big Data for highway and waterway transport statisticsChinaChina - National Bureau of StatisticsRoad sensor data Ships identification dataResearch Institute of Statistical ScienceJiang Shu13911506021@163.comIn 2014, the Joint Transport Ministry studied the networks of the toll highway system and marine visa system. It found a way to apply the massive administrative records of these two systems to highway and waterway transport statistics, and the method has now been applied in most of the country on a trial basis.Pilot intended to go to production to replace existing dataTransportation statisticsTechnology partner
Government institute
---Only for this projectNoMonthly since 2013Only a portion of all dataPart of country / high % of marketFreeNo-Quality of source/inputCompare the internal original data with the external data. Analyze the volatility of the aggregated data and establish review periods.NoWe designed a special audit platform according to the structural characteristics of the original data and aggregate processing data and set up the conditions for approval.Completeness, Usability, Time Factors
Accuracy, including selectivity
Coherence, including linkability to other sources
Validity
Accessibility, Relevance
Data visualization methods
Traditional statistical methods
NoNoSQL database
Column store database
Data mining tools
Data visualization tools