Ola Hall

Mixing household surveys, satellite imagery and machine learning in human development studies: Is it (finally) time for satellite imagery in social science research?

This project aims to enhance our understanding of the pace of agricultural and rural transformation in contemporary sub-Saharan Africa, its impacts on poverty and distribution, and its drivers. The project addresses a longstanding and still unresolved theoretical question in agrarian studies: whether agricultural transformation is poverty-driven, as proposed by neo-Marxist perspectives, or whether it enables inclusive growth, as propounded by advocates of the pro-poor agricultural growth model. Building on methodological advances in machine learning and artificial intelligence, we combine satellite imagery with training data from our existing survey-based database (Afrint), which covers around 3,000 households in six African countries, distributed over sixteen regions and 54 villages and spanning fifteen years of development. Drawing on this unique combination of data sources and methods, we will be able to provide new insights into the distributional effects of agricultural transformation, using a variety of established welfare indicators from the field of rural development studies while breaking new ground in development research by collecting them through remote sensing techniques. This innovative mixed-methods approach will also make a practical contribution to the problem of collecting statistics on the ground in developing countries that lack infrastructure or administrative resources.
Final report
Scientific Final Report MXM19-1104:1

Project: Mixing household surveys, satellite imagery and machine learning in human development studies: Is it (finally) time for satellite imagery in social science research?
Project period: 2020–2022
Principal Investigator: Ola Hall

1. Project Aim and Research Questions
The project aimed to enhance understanding of agricultural and rural transformation in contemporary sub-Saharan Africa, with particular emphasis on poverty reduction, distributional dynamics, and underlying drivers. It addressed a longstanding theoretical debate in agrarian studies: whether agricultural transformation primarily generates inclusive growth, as proposed in pro-poor agricultural growth models, or whether it drives differentiation and pauperization, as suggested by neo-Marxist perspectives.

The overarching aim was operationalized through four core research questions:

1. What are the distributional dynamics of changes in poverty and agricultural yields over time?
2. How is agricultural transformation related to commercialization, and what are the distributional consequences?
3. What are the consequences of pluriactivity (income diversification beyond farming)?
4. How do these processes vary spatially and how are they interconnected?

Methodologically, the project proposed a mixed-methods design integrating household survey data, high-resolution satellite imagery, nighttime lights data, vegetation indices, deep learning, transfer learning, and econometric modelling.

2. Methodological Development and Data Integration
The project implemented the analytical workflow proposed in the application (see Figure I, p. 13 in the application).

The analytical flow consisted of:
• Collation and harmonization of survey data (Afrint and other household datasets)
• Extraction of high-resolution daytime satellite imagery
• Integration of nighttime lights (DMSP-OLS and VIIRS)
• Use of vegetation indices (MODIS, aboard the Terra and Aqua satellites)
• Training of convolutional neural networks (CNNs)
• Transfer learning
• Econometric modelling with multilevel specifications
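
As an illustration of what a multilevel specification can look like in this setting, the following is a generic two-level random-intercept model for households nested in villages. It is a textbook sketch, not the specification estimated in the project; the symbols (outcome y, household covariate x, village-level satellite-derived indicator z) are assumptions introduced here for illustration only.

```latex
% Household i in village j (all symbols are illustrative assumptions)
y_{ij} = \beta_0 + \beta_1 x_{ij} + \gamma z_j + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here the village-level random intercept u_j absorbs unobserved differences between villages, so that the coefficient on z_j is identified from variation across villages while the coefficient on x_{ij} also draws on variation within them.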

Extensive datasets were assembled, including:
• Afrint panel data (approximately 3,000 households)
• Expanded survey datasets reaching up to 100,000 households across sub-Saharan Africa
• DHS village-level data
• Nighttime lights data covering the African continent
• Daytime satellite mosaics for multiple countries, including Tanzania

The project operationalized poverty, intensification, commercialization, and pluriactivity at village level, consistent with the original design. Deep learning methods were developed and evaluated for poverty prediction from satellite imagery. Empirical testing demonstrated that CNN-based models outperformed human experts in ranking poverty levels from satellite images (Sarmadi et al., 2024; Wahab & Hall, 2022). At the same time, interpretability research showed that models primarily captured large-scale geographic signals rather than fine-grained multidimensional poverty characteristics (Watmough et al., 2025; Hall et al., 2022). This prompted further work on explainable AI and domain bias.
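
One common way to compare how well a model and a human rater rank villages against a survey-based reference is a rank correlation. The sketch below computes Spearman's rank correlation in plain Python. It is a minimal illustration under stated assumptions: the cited studies may have used different metrics, and all numbers are made-up toy values (hypothetical names `survey_wealth`, `model_scores`, `human_scores`), constructed so that the model happens to order the villages perfectly. They say nothing about the project's actual results.

```python
# Illustrative only: all values below are made-up toy data, not project data.

def ranks(values):
    """Return 1-based average ranks (tied values share the mean of their positions)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical survey-based wealth index for six villages (reference ordering).
survey_wealth = [0.12, 0.35, 0.41, 0.58, 0.66, 0.90]
# Hypothetical scores from a model and from a human rater for the same villages.
model_scores = [0.10, 0.30, 0.45, 0.50, 0.70, 0.95]
human_scores = [0.20, 0.15, 0.60, 0.40, 0.55, 0.80]

model_rho = spearman(survey_wealth, model_scores)
human_rho = spearman(survey_wealth, human_scores)
print(f"model rho = {model_rho:.3f}, human rho = {human_rho:.3f}")
```

A rank-based measure is a natural fit when the task is ordering villages by poverty rather than predicting an exact index value, because it ignores the scale of the scores and only rewards getting the order right.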

3. Empirical Findings

3.1 Agricultural Transformation and Rural Differentiation
Longitudinal village-level analyses demonstrated broad-based improvements in rural living standards across several African contexts, but also increasing differentiation, particularly related to land and asset ownership (Andersson Djurfeldt, 2022). Food transfers and translocal livelihoods emerged as central mechanisms sustaining rural households across rural–urban spaces (Andersson Djurfeldt, 2022, Journal of Rural Studies).

Case-based research on the cassava industry in Ghana documented local-level agricultural transformation driven by commercialization and technological change (Andersson et al., 2024). Yield gap research showed that socio-economic and institutional factors are critical alongside agronomic variables (Hall et al., 2024; Wahab et al., 2020; Sussy et al., 2019).
These findings contribute to the debate outlined in the original application by demonstrating that agricultural transformation produces both inclusive improvements and differentiation processes, depending on context and scale.

3.2 Satellite Imagery, Poverty Mapping and AI

The project significantly advanced methodological understanding of poverty estimation from Earth Observation data:
• Review articles synthesized the state of the field (Hall et al., 2023; Hall et al., 2022).
• Empirical studies demonstrated CNN superiority relative to human raters (Sarmadi et al., 2024).
• Bias and interpretability issues were identified (Watmough et al., 2025).
• Experiments using large language models for poverty ranking were conducted (Sarmadi et al., 2025).
• Drone-based approaches were reviewed and positioned within spatial social sciences (Hall & Wahab, 2021).

The findings confirm the feasibility of estimating welfare proxies from satellite imagery, but also demonstrate limitations in causal interpretation and multidimensional poverty representation. These concerns were anticipated in the original theory-of-science discussion.

4. Theoretical and Methodological Contributions

The project contributes to:
1. Agrarian theory:
By empirically examining spatial and distributional dynamics, it provides evidence relevant to the debate between pro-poor growth and differentiation perspectives.
2. Mixed methods research:
It operationalizes a realist mixed-methods framework combining AI-derived indicators with econometric causal modelling, as originally proposed.
3. Remote sensing in social science:
It demonstrates both the promise and limits of replacing or complementing survey-based data with satellite-derived indicators.
4. Explainable AI in development research:
It advances understanding of interpretability, bias, and causal limitations in poverty mapping.

5. Policy Relevance and Broader Impact
The project directly addresses the problem of “data poverty” in sub-Saharan Africa described in the original application.

It demonstrates:
• The value of integrating satellite imagery with survey data.
• The vulnerability of relying on single institutional data sources (as illustrated in the VR report text).
• The resilience of globally distributed satellite infrastructures.
• The cost-efficiency of repurposing and enhancing existing datasets.

The research contributes to improved monitoring possibilities relevant to Sustainable Development Goals, consistent with the ambitions stated in the application.

Grant administrator: Lunds universitet
Reference number: MXM19-1104:1
Amount: SEK 5,139,000
Funding: Mixed methods
Subject: Human Geography
Year: 2019