New Model GPS Datasets Make for Easy Practice

Written by: The DHS Program

31 Jan, 2024

Have you ever wanted to explore the geospatial datasets produced by The DHS Program but did not know where to begin? Do you want the chance to practice before analyzing country-specific data? Have you ever wondered how The DHS Program displaces survey GPS data? Perhaps you have even used The DHS Program model datasets before and now want some geospatial context. If so, the new model GPS datasets are for you!

The DHS Program has released GPS datasets for over 200 surveys in 63 countries. Registered data users can link these GPS datasets with the standard survey recode datasets to explore the impact of location on health outcomes. To protect the privacy of respondents, The DHS Program displaces cluster-level GPS coordinates using a standard process described in Geographic Displacement Procedure and Georeferenced Data Release Policy for the Demographic and Health Surveys.

These model GPS datasets are designed to be linked to the existing model datasets. Like the model datasets, the model GPS datasets describe a fictional country: they contain GPS coordinates for 217 clusters across 4 regions. To protect respondents’ privacy, The DHS Program does not usually release un-displaced cluster coordinates for surveys. Since the model datasets do not represent real households or people, datasets of both the displaced and un-displaced coordinates are included. This allows data users to run analyses on both datasets and compare the results to better understand how displacement impacts data analysis.
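One simple way to compare the two coordinate sets is to measure how far each cluster moved. The following is a minimal Python sketch, not The DHS Program's own method. It assumes the attribute tables of the two shapefiles have been read into lists of rows, and that each row carries a cluster number and latitude/longitude fields named `DHSCLUST`, `LATNUM`, and `LONGNUM`. Those field names and the sample coordinates are assumptions for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def displacement_by_cluster(undisplaced_rows, displaced_rows):
    """Return {cluster number: distance moved in km} for clusters present in both inputs."""
    moved = {row["DHSCLUST"]: row for row in displaced_rows}
    distances = {}
    for row in undisplaced_rows:
        match = moved.get(row["DHSCLUST"])
        if match is None:
            continue  # cluster missing from the displaced set
        distances[row["DHSCLUST"]] = haversine_km(
            row["LATNUM"], row["LONGNUM"], match["LATNUM"], match["LONGNUM"]
        )
    return distances


# Made-up coordinates, purely for illustration (assumed field names).
undisplaced = [
    {"DHSCLUST": 1, "LATNUM": 0.100, "LONGNUM": 0.200},
    {"DHSCLUST": 2, "LATNUM": 0.300, "LONGNUM": 0.400},
]
displaced = [
    {"DHSCLUST": 1, "LATNUM": 0.110, "LONGNUM": 0.205},
    {"DHSCLUST": 2, "LATNUM": 0.280, "LONGNUM": 0.430},
]

for cluster, km in sorted(displacement_by_cluster(undisplaced, displaced).items()):
    print(f"cluster {cluster}: moved {km:.2f} km")
```

The resulting distances can then be summarized or mapped to see where displacement is largest.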

Data users can explore both displaced and un-displaced cluster coordinates using software such as QGIS.

The model GPS datasets include two additional geospatial datasets that are standardly provided for DHS Program surveys: survey boundaries and geospatial covariates.

Survey Boundaries

Shapefiles of survey boundaries can be viewed and downloaded from The DHS Program’s Spatial Data Repository. The model GPS datasets include shapefiles that represent regional boundaries and, unlike survey shapefiles, also include boundaries for the 27 sub-regions used in the displacement process. These sub-regional boundaries are similar to the administrative level 2 (“Admin 2”) boundaries that The DHS Program Geospatial team uses to validate and displace DHS Program survey GPS datasets.

Survey boundaries for the model GPS datasets can now be downloaded directly from The DHS Program’s Spatial Data Repository.

Geospatial Covariates

Geospatial covariate (GC) datasets provide estimates of population, climate, socioeconomic, and environmental variables at the cluster locations. GC datasets are available on The DHS Program website for previously released surveys and can be accessed from The DHS Program’s Spatial Data Repository.

Model GPS datasets are a great tool for teaching and research purposes, and can be used to:

  • Practice merging DHS Program GPS datasets with standard survey recode and geospatial covariate datasets
  • Map survey results at regional and cluster levels
  • Teach spatial analysis methods, and conduct spatial analyses of health outcomes
  • Test the impact of displacement on analyses, such as in the extraction of remotely-sensed data
  • Explore the difference between urban and rural cluster displacement

And, since model datasets do not represent actual respondents, they can be downloaded from The DHS Program website without registering!
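As an illustration of testing the impact of displacement, the Python sketch below compares a per-cluster value (for example, a covariate extracted at each location) between the un-displaced and displaced coordinates, grouped by urban and rural clusters. All numbers are made up, and the `"U"`/`"R"` labels and the function name are assumptions for this example rather than part of the model datasets.

```python
from collections import defaultdict
from statistics import mean


def mean_abs_shift_by_type(undisplaced, displaced, cluster_type):
    """Mean absolute change in a per-cluster value, grouped by cluster type.

    undisplaced, displaced: {cluster number: value at that location}
    cluster_type: {cluster number: label such as "U" or "R"}
    """
    shifts = defaultdict(list)
    for cluster, true_value in undisplaced.items():
        if cluster in displaced and cluster in cluster_type:
            shifts[cluster_type[cluster]].append(abs(displaced[cluster] - true_value))
    return {kind: mean(values) for kind, values in shifts.items()}


# Made-up values, purely for illustration.
undisplaced_values = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0}
displaced_values = {1: 10.5, 2: 19.0, 3: 33.0, 4: 35.0}
urban_rural = {1: "U", 2: "U", 3: "R", 4: "R"}

print(mean_abs_shift_by_type(undisplaced_values, displaced_values, urban_rural))
```

A larger average shift in one group would suggest that analyses of that variable are more sensitive to displacement for those clusters.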

There are three geospatial datasets available for download:

  1. Geographic Data: The zipped GE file (zzge61fl.zip) contains two shapefiles, one with the displaced cluster GPS coordinates (ZZGE61FL), and another with the un-displaced cluster GPS coordinates (ZZGE61FL_undisplaced).
  2. Geospatial Covariates: The zipped GC file (zzgc61fl.zip) contains .csv tables with the cluster number and covariate values for the displaced and un-displaced cluster locations, as well as the covariate manual describing the covariates and methods.
  3. Boundaries: Two zipped survey boundaries files include a regional boundaries shapefile (regional.zip) and the sub-regional boundaries used for displacement (subregional.zip).
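Because each of these files identifies clusters by number, they can be joined to survey records on that key. The Python sketch below shows one way to practice such a merge with the standard library only. It assumes the survey recode data and the GE attribute table have been exported to CSV, and the column names (`v001`, `DHSCLUST`, `LATNUM`, `LONGNUM`, `URBAN_RURA`, `example_covariate`) and sample rows are assumptions for illustration; check the actual files for the real field names.

```python
import csv
import io


def index_by_cluster(rows, key):
    """Index a list of row dicts by integer cluster number."""
    return {int(row[key]): row for row in rows}


def merge_on_cluster(survey_rows, gps_rows, covariate_rows,
                     survey_key="v001", geo_key="DHSCLUST"):
    """Attach GPS and covariate fields to each survey row with the same cluster number."""
    gps = index_by_cluster(gps_rows, geo_key)
    covariates = index_by_cluster(covariate_rows, geo_key)
    merged = []
    for row in survey_rows:
        cluster = int(row[survey_key])
        combined = dict(row)
        combined.update(gps.get(cluster, {}))         # no match leaves the row unchanged
        combined.update(covariates.get(cluster, {}))
        merged.append(combined)
    return merged


# Tiny in-memory stand-ins for the exported CSV files (made-up values).
survey_csv = "v001,outcome\n1,0\n1,1\n2,1\n"
gps_csv = "DHSCLUST,LATNUM,LONGNUM,URBAN_RURA\n1,0.10,0.20,U\n2,0.30,0.40,R\n"
covariate_csv = "DHSCLUST,example_covariate\n1,12.5\n2,7.0\n"


def read_rows(text):
    """Parse CSV text into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


merged = merge_on_cluster(read_rows(survey_csv), read_rows(gps_csv), read_rows(covariate_csv))
for row in merged:
    print(row)
```

With real files, the same functions could be fed rows read through `csv.DictReader` on an opened file instead of the in-memory strings.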

The model GE dataset and survey boundaries can be mapped in any GIS, including the free software QGIS. For an introduction to exploring DHS data using QGIS, download learning materials including a QGIS guide and exercise files from The DHS Program’s Spatial Data Repository.

We are excited to hear about how you use the model GPS datasets! If you have any questions about the model GPS datasets or want to share how you use them, email us at modeldatasets@dhsprogram.com.

Author

  • The Demographic and Health Surveys (DHS) Program has collected, analyzed, and disseminated accurate and representative data on population, health, HIV, and nutrition through more than 400 surveys in over 90 countries. The DHS Program is funded by the U.S. Agency for International Development (USAID). Contributions from other donors, as well as funds from participating countries, also support surveys. The project is implemented by ICF.
