Sampling and Weighting with DHS Data
At long last, The DHS Program has released two videos which demonstrate how to weight DHS data, concluding the Sampling and Weighting video series.
The first video in the series, Introduction to DHS Sampling Procedures, and the second video, Introduction of Principles of DHS Sampling Weights, explained the basic concepts of sampling and weighting in The DHS Program surveys, using the 2012 Tajikistan DHS as an example. Read our introductory blog post for more details.
In contrast, the third and fourth videos use an Example Practice Dataset, so viewers can practice weighting DHS data and replicate what is being shown in the videos while they are watching. The Example Practice Dataset was specifically created for DHS data users to have hands-on practice using DHS data in different statistical packages (Stata, SPSS and SAS) and does not represent the data of any actual country.
The third video, How to Weight DHS Data in Stata, explains which weight to use based on the unit of analysis, describes the steps of weighting DHS data in Stata and demonstrates both ways to weight DHS data in Stata (simple weighting and weighting that accounts for the complex survey design).
The fourth video, Demonstration on How to Weight DHS Data in SPSS and SAS, is the same as the third video, except it uses the statistical software packages SPSS and SAS instead of Stata.
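For readers who want to see the arithmetic behind simple weighting before opening a statistical package, here is a minimal Python sketch. It assumes the usual DHS convention that the sampling weight (for example `v005` for women or `hv005` for households) is stored as an integer with six implied decimal places, so it is divided by 1,000,000 before use. The records are made-up toy values, not the Example Practice Dataset, and the function name is hypothetical.

```python
# Toy records, NOT real DHS data: (raw weight as stored, outcome coded 0/1).
records = [
    (1500000, 1),
    (500000, 0),
    (1000000, 1),
    (1000000, 0),
]


def weighted_proportion(rows):
    """Return the weighted share of rows whose outcome equals 1."""
    # Assumption: weights carry six implied decimals, as in DHS recode files.
    weights = [raw / 1_000_000 for raw, _ in rows]
    outcomes = [y for _, y in rows]
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


unweighted = sum(y for _, y in records) / len(records)
weighted = weighted_proportion(records)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

The two figures differ because respondents with larger weights count for more. This sketch gives point estimates only; it says nothing about standard errors, which depend on the survey design.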
After watching these videos, you will be able to answer the following questions:
- Which weights should I use for my analysis?
- What are the steps of weighting data in a statistical software package?
- How do I weight DHS data in Stata, SPSS or SAS?
- How do I account for the complex sample design when weighting in Stata, SPSS or SAS?
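To make the last question more concrete, the sketch below shows one common way a complex design enters the standard error of a weighted mean: a Taylor-linearized estimator that treats primary sampling units (PSUs) as sampled with replacement within strata. This is an illustration under those stated assumptions, not the procedure shown in the videos; the function name, the toy data and the stratum labels are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def svy_mean(rows):
    """Weighted mean and linearized standard error.

    rows: iterable of (stratum, psu, weight, y).
    Assumes PSUs are sampled with replacement within strata (no finite
    population correction) and that every stratum has at least two PSUs.
    """
    rows = list(rows)
    total_w = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / total_w

    # Sum the linearized scores w * (y - mean) / total_w within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    variance = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        centre = sum(psus.values()) / n_h
        # Between-PSU variation within the stratum, scaled by n_h / (n_h - 1).
        variance += n_h / (n_h - 1) * sum((z - centre) ** 2 for z in psus.values())
    return mean, sqrt(variance)


# Toy data, NOT real DHS data: (stratum, psu, weight, outcome).
toy = [
    ("urban", 1, 1.2, 1), ("urban", 1, 1.2, 0),
    ("urban", 2, 0.8, 1), ("urban", 2, 0.8, 1),
    ("rural", 3, 1.0, 0), ("rural", 3, 1.0, 0),
    ("rural", 4, 1.0, 1), ("rural", 4, 1.0, 0),
]
mean, se = svy_mean(toy)
print(f"mean: {mean:.3f}, design-based SE: {se:.3f}")
```

The point estimate is the same weighted mean that simple weighting gives; only the standard error changes, because observations in the same cluster are not treated as independent.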
If you have more questions, visit the user forum!
What did you learn from the sampling and weighting videos? What would you like to explore further? Comment below!
Dear
I am still confused about how to weight a DHS sample. I also had poor internet access in Ethiopia, which made it hard to follow the examples in the videos. How can you help me weight the 2011 Ethiopian DHS?
Hello Seman, the DHS user forum (http://userforum.dhsprogram.com/) is a great way to ask questions and receive feedback from the broader community and DHS Program staff. We also have the DHS sampling manual (http://dhsprogram.com/publications/publication-dhsm4-dhs-questionnaires-and-manuals.cfm) as an additional resource. Hope this helps!
I wish to calculate the mean height-for-age z-score (HAZ) and its standard error, taking the sampling technique into account. I want to estimate the results for small areas such as sub-districts and districts. I am using 2011 BDHS data and trying to calculate the mean HAZ and the mean of (HAZ < -2.00 SD) by sub-district.
I am using the Stata commands below. We get reasonable results by district but unrealistic results by sub-district, particularly for small sub-districts. For some sub-districts, the mean HAZ becomes zero with zero standard error, and a similar result is observed for HAZ= 601.
gen HW70n= HW70/100.
g COSUBDIST= CODIST*100+ COTHANA
svyset [pw= V005_rewtd], psu (V001) strata (V023)
univar HW70n, by (COSUBDIST)
tabstat HW70n , by(COSUBDIST) stat(n, mean semean)
* HAZ < -2.0
g HW_2=0.
replace HW_2=1 if HW70n <= -2.
tabstat HW_2 , by(COSUBDIST) stat(n, mean semean)
Results are like these:
COSUBDIST | N mean se(mean) sd
----------+------------------------------------------
108 | 18 .0555556 .0555556 .2357023
114 | 15 .0666667 .0666667 .2581989
134 | 14 .0714286 .0714286 .2672612
156 | 18 .1666667 .0903877 .3834825
160 | 17 .1176471 .0805474 .3321056
177 | 18 .0555556 .0555556 .2357023
373 | 16 .125 .0853913 .341565
409 | 25 .16 .0748331 .3741657
428 | 32 .03125 .03125 .1767767
447 | 8 0 0 0
485 | 11 .0909091 .0909091 .3015113
602 | 8 0 0 0
603 | 11 0 0 0
607 | 21 .1904762 .0878052 .4023739
610 | 23 .0434783 .0434783 .2085144
632 | 12 .0833333 .0833333 .2886751
636 | 15 .1333333 .0908514 .3518658
651 | 107 .0747664 .0255462 .2642517
662 | 19 .1578947 .085947 .3746343
Can you explain why I get these zero results? Can I do this kind of spatial analysis in this way?
Regards,
Sumon
Great question! I recommend posting this in our user forum. The forum is regularly monitored to ensure all questions are answered by our knowledgeable staff.
Hello DHS Program!
How should non-response be handled when an entire cluster is dropped (for instance, due to security, inaccessibility or bad data)? Should that cluster be included in the household response rate? Why, and how?
Why is this not discussed on the internet?
Hello!
This question is not in reference to DHS but I’m hoping to get help from someone through this forum.
How can we use weights on a data set that is originally representative at provincial level and make it representative at district level?
Since the variables I’m using are at a district level, using a provincially representative data set will result in a sampling bias. So I’m trying to somehow weight the data in a way that it becomes representative at the district level.
It would be of immense help if I could get a response to my query.
This is a great question for The DHS Program User Forum. Visit the “Weighting Data” thread to post your question or look for help. https://userforum.dhsprogram.com/index.php?t=thread&frm_id=33&
Is it okay to apply svy in a multi-country DHS data analysis?
This is a great question for The DHS Program User Forum. Visit the “Weighting Data” thread to post your question or look for help. https://userforum.dhsprogram.com/index.php?t=thread&frm_id=33&
Is it possible to get a different total (of in-migrants, for instance) depending on whether or not the weights are used? I am using the HH weights and the PR dataset to find the number of in-migrants who moved from their place of birth to their place of residence. But with the weights, the total number of in-migrants differs from the total without the weights.
Hello !
How can I weight three merged surveys from one country (ZDHS) to calculate U5M using Stata and R commands?
Regards,
Amanuel
This is a great question for The DHS Program User Forum: https://userforum.dhsprogram.com.
We also recommend watching our GitHub tutorial video playlist: https://www.youtube.com/watch?v=Q_FZogyugmI&list=PLagqLv-gqpTMf-DP0QyGOqklG0n5pz9AJ.
Or check out our tutorial video series on sampling & weighting: https://www.youtube.com/watch?v=DD5npelwh80&list=PLagqLv-gqpTN8IZQBy7vAYw10NjynAn2Z.