In March 2020, The DHS Program released a call for applications for the 2020 DHS Data Processing Procedures – Data Tabulation and Data Finalization (DPPII) workshop, to be held in Accra, Ghana, in June. The DPPII workshop includes online pre-work and face-to-face instruction. DHS Program Data Processing staff members assist participants through one-on-one coaching, and participants gain proficiency through hands-on practice. Due to the COVID-19 pandemic, the in-person workshop was canceled.
The DHS Program’s Data Processing team worked with the Capacity Strengthening team to adapt the DPPII workshop to an online course focused on data tabulation. The course was delivered on The DHS Program Learning Hub and included self-paced modules with readings, videos, and activities, as well as updated CSPro manuals. These up-to-date materials will be used in future data processing courses and workshops, plus trainings for new Data Processing staff at The DHS Program.
The restructured DPPII course is semi-synchronous, including eLearning modules and assignments that participants work through independently. The course also includes four virtual instructor-led sessions, in which participants and DHS Program facilitators log in to the same virtual learning space to learn new content, watch presentations, ask and address questions, and receive feedback on assignments in real time. For their capstone assignment, participants recreate a standard DHS table using CSPro by defining their own variables and data.
The DPPII workshop targets staff from implementing agencies in countries with ongoing DHS surveys, because participants build the competencies required to process DHS data and produce the country-specific tables found in DHS final reports. For this first-ever virtual DPPII course, participants included five women and fourteen men from eight Anglophone countries that recently implemented a DHS survey: The Gambia, Ghana, Liberia, Nigeria, Pakistan, Rwanda, Uganda, and Zambia.
What Participants Say
“I’m glad to have been part of this training. [It gave me a] better understanding of the use of DHS data, generation of DHS recode and tables. I hope to practice my new skills with the country-specific tables.”
“Attending training and combining with other duties from work was not helpful but I will take time and continue reading and finish all as they are clear and useful.”
Converting face-to-face workshops to virtual learning sessions comes with challenges. It can be difficult for participants to balance coursework with work and other responsibilities, which is not an issue with in-person residential workshops. Throughout the virtual DPPII training, it became clear that more one-on-one instruction time was needed. To address this, facilitators began holding optional office hours. These and other lessons learned about virtual facilitation will be applied to future online courses, remote technical assistance, and webinars.
Interested in learning more about capacity strengthening opportunities at The DHS Program? The DHS Program periodically makes Workshop and Training Announcements for upcoming training opportunities.
This blog post is part of Luminare, our blog series exploring innovative solutions to data collection, quality assurance, biomarker measurement, data use, and further analysis.
The DHS Program recently published a Methodological Report providing a framework for estimating “level-weights” in DHS surveys – weights that correspond to each stage of sampling. These weights are required for multilevel modeling. While the audience for the framework itself is academic researchers, the challenge of protecting respondent confidentiality while supporting data analysis is of general interest.
We sat down with two of the authors, Mahmoud Elkasabi, Senior Sampling Statistician, and Tom Pullum, Senior Advisor for Research and Analysis, to learn more about this innovative strategy.
How did the idea for this activity come about?
Post from Data User:
I have been reading the posts on the forum regarding the use of weights with multilevel analyses and wanted to check to see if there were any updates on recommendations on how to go about this. . . Since we cannot separate out the household weights from the cluster weights to incorporate them in the statistical coding, does the DHS have any recommendations on how to go about running multilevel models with DHS data? . . . I would like to run multilevel models looking at childhood vaccinations and want to make sure I am going about it in the most proper way. Any help or guidance on this from those at DHS or out in the forum would be greatly appreciated!
Mahmoud: There has been huge user demand for DHS survey level-weights. We have seen many posts on The DHS Program User Forum over the years in which analysts are trying to apply weights in multilevel analysis. Researchers commonly use multilevel modeling to understand the effects of cluster-level characteristics, such as region, on individual-level outcomes, such as contraceptive use or children’s nutritional status.
For those of us who aren’t statistically inclined, why do researchers need to include sampling weights in their analysis?
Mahmoud: Sampling weights compensate for different probabilities of selection within the samples, and for different levels of non-response. Providing weights at multiple levels allows for the best level of representativeness for that unit. That is, the data from each interviewed woman becomes as representative as possible of similar women in the population. That is ultimately the goal of a survey: to obtain data that are nationally and subnationally representative without interviewing the entire population.
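To make the idea concrete, here is a minimal Python sketch of how a sampling weight can be built as the inverse of a selection probability, adjusted for non-response, and then used in an estimate. It is an illustration only, not DHS Program code: the `Respondent` class, the field names, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    selection_prob: float      # hypothetical overall probability of selection
    response_rate: float       # hypothetical response rate in the respondent's group
    uses_contraception: bool   # hypothetical outcome of interest


def sampling_weight(r: Respondent) -> float:
    """Inverse of the selection probability, inflated for non-response."""
    return 1.0 / (r.selection_prob * r.response_rate)


def weighted_proportion(respondents: list[Respondent]) -> float:
    """Share of the weighted total that reports the outcome."""
    total = sum(sampling_weight(r) for r in respondents)
    hits = sum(sampling_weight(r) for r in respondents if r.uses_contraception)
    return hits / total


def unweighted_proportion(respondents: list[Respondent]) -> float:
    """Plain sample share, ignoring how respondents were selected."""
    return sum(r.uses_contraception for r in respondents) / len(respondents)


# Made-up sample: urban women are oversampled, rural women are undersampled.
urban = [Respondent(0.02, 0.5, True)] * 3 + [Respondent(0.02, 0.5, False)]
rural = [Respondent(0.0025, 1.0, False)] * 2
sample = urban + rural

print(unweighted_proportion(sample))
print(weighted_proportion(sample))
```

In this made-up sample, half of the respondents report the outcome, but the weighted proportion is about one quarter, because each rural respondent stands in for four times as many women as each urban respondent.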
Why aren’t level-weights standardly provided with DHS datasets?
Mahmoud: After a survey is completed, The DHS Program destroys the information required for exact calculation of the cluster weights. Providing the true cluster-level weight for each cluster would pose a risk to respondent confidentiality—anyone with access to the sampling frame could use the cluster-level weights to identify the specific clusters that were drawn in the sample—and then, potentially, identify households or individuals. For that reason, The DHS Program only releases the final survey weights in the datasets.
How does the level-weights framework respond to the challenge of protecting confidentiality?
Tom: We propose a framework that uses publicly available data from DHS datasets and Final Reports, along with a process to estimate other inputs. The framework starts with the household final weight from the household recode file or the woman final weight from the woman recode file. Most of the numbers required to separate the final weight into a cluster-level weight and a household-level (or woman-level) weight are included in the data files or in Appendix A – Sample Design of DHS Final Reports. Some of the required information is not available there (see Table 1), but we provide guidance on how to estimate these inputs with other publicly available data. In this way, we can estimate or approximate the level-weights for the clusters and households (or women).
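The general idea of separating a final weight into two levels can be sketched in a few lines of Python. This is not the procedure from the Methodological Report. It is a simplified illustration under two stated assumptions: that clusters are selected with probability proportional to size within a stratum, and that the final weight is a raw inverse-probability weight equal to the cluster-level weight times the household-level weight. Released weights may need rescaling before a split like this applies. All function names, parameter names, and numbers are hypothetical.

```python
def pps_cluster_probability(
    clusters_sampled: int,
    cluster_households: float,
    stratum_households: float,
) -> float:
    """Approximate probability that a cluster is selected.

    Assumes probability-proportional-to-size selection within a stratum:
    (clusters drawn in the stratum) * (cluster size) / (stratum size).
    """
    return clusters_sampled * cluster_households / stratum_households


def split_final_weight(
    final_weight: float, cluster_probability: float
) -> tuple[float, float]:
    """Split a final weight into (cluster_weight, household_weight).

    Assumes final_weight = cluster_weight * household_weight, so the
    household-level weight is whatever remains after the cluster-level
    weight is divided out.
    """
    cluster_weight = 1.0 / cluster_probability
    household_weight = final_weight / cluster_weight
    return cluster_weight, household_weight


# Made-up inputs: 20 clusters drawn from a stratum of 10,000 households,
# a cluster of 100 households, and a final household weight of 20.
prob = pps_cluster_probability(20, 100, 10_000)
cluster_w, household_w = split_final_weight(20.0, prob)
print(cluster_w, household_w)
```

With these made-up inputs the cluster-level weight is about 5 and the household-level weight is about 4, and their product returns the original final weight of 20. The hard part in practice is that some inputs, such as the cluster size, may not be published and have to be estimated.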
Have these level-weights been used in any DHS analysis?
Tom: This report shows how to use data from the 2015 Zimbabwe DHS to estimate level-weights and then include them in a multilevel regression model. We fitted several regression models with data for married women in 400 clusters to examine modern contraceptive use, with age, education, residence, and number of children as covariates. We provide the Stata code for this example.
The recently released Analytical Study Contraceptive Use, Method Mix, and Method Availability is the first DHS research to use the proposed methodology. This analysis used the method described here to estimate cluster-level and woman-level weights and then to assess the effect of cluster-level and woman-level factors on contraceptive use in Haiti and Malawi.
Dr. Mahmoud Elkasabi is a Sampling Statistician at The DHS Program. He joined The DHS Program in 2013 after earning his Ph.D. in Survey Methodology from the University of Michigan at Ann Arbor, with a specialty in Survey Statistics and Sampling. Dr. Elkasabi is responsible for the sampling design for the DHS surveys as well as building sampling capacity in many countries, such as Ghana, Egypt, Nigeria, India, Malawi, Zambia, Bangladesh, and Afghanistan. Dr. Elkasabi likes to work closely with the sampling statisticians in different countries. In these win-win relationships, he shares his knowledge of sampling and gains new knowledge and experience.
Dr. Tom Pullum directs the research program, including the analysis of DHS data beyond the country reports, such as the analytical studies, comparative reports, further analysis studies, and methodological reports. He also has overall responsibility for The DHS Fellows Program and workshops. Current interests include maternal mortality and the measurement of child vulnerability. A continuing effort is the adaptation of demographic methods to statistical frameworks and software. His work with DHS has included methodological reports on data quality. He joined the DHS staff in 2011, following a lengthy career in academia, primarily at the University of Texas at Austin. Dr. Pullum has a Ph.D. in sociology from the University of Chicago.
Anthropometry (the measurement of height and weight) is a core component of DHS surveys and is used to generate indicators of nutritional status. To improve interpretation of the results, the Biomarker Questionnaire now includes questions on whether clothing or hairstyle interfered with the measurements of women and children.