# Sampling and Weighting with DHS Data

At long last, The DHS Program has released two videos that **demonstrate how to weight DHS data**, concluding the Sampling and Weighting video series.

The first video in the series, *Introduction to DHS Sampling Procedures*, and the second video, *Introduction of Principles of DHS Sampling Weights*, explained the basic concepts of sampling and weighting in The DHS Program surveys, using the 2012 Tajikistan DHS survey as an example. Read our introductory blog post for more details.

In contrast, the third and fourth videos use an *Example Practice Dataset*, so viewers can practice weighting DHS data and replicate what is shown in the videos while they watch. The *Example Practice Dataset* was created specifically to give DHS data users hands-on practice with DHS data in different statistical packages (Stata, SPSS and SAS) and does not represent the data of any actual country.

The third video, *How to Weight DHS Data in Stata*, explains which weight to use based on the unit of analysis, describes the steps of weighting DHS data in Stata, and demonstrates both ways to weight DHS data in Stata (simple weighting and weighting that accounts for the complex survey design).
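To make the idea of simple weighting concrete, here is a minimal, language-neutral sketch in Python rather than Stata. It is not taken from the videos. It assumes the usual DHS convention that sampling weights (for example `hv005` for households, `v005` for women, `mv005` for men) are stored as integers with six implied decimal places, so they are divided by 1,000,000 before use. The records and the outcome values are made up for illustration.

```python
# Hypothetical records: (raw_weight, outcome), where outcome is 1 or 0.
# Raw weights follow the DHS convention of six implied decimal places.
records = [
    (1_250_000, 1),
    (750_000, 0),
    (1_000_000, 0),
    (1_000_000, 1),
]

def weighted_share(rows):
    """Weighted proportion of rows with outcome == 1."""
    weights = [raw / 1_000_000 for raw, _ in rows]   # rescale the raw weight
    values = [outcome for _, outcome in rows]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

unweighted_share = sum(outcome for _, outcome in records) / len(records)
share = weighted_share(records)

print("unweighted:", unweighted_share)
print("weighted:  ", share)
```

Simple weighting like this corrects the point estimate only. Standard errors still need the cluster and stratum information, which is what the second approach in the video addresses.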

The fourth video, *Demonstration on How to Weight DHS Data in SPSS and SAS*, is the same as the third video, except it uses the statistical software packages SPSS and SAS instead of Stata.

After watching these videos, you will be able to answer the following questions:

- Which weights should I use for my analysis?
- What are the steps of weighting data in a statistical software package?
- How do I weight DHS data in Stata, SPSS or SAS?
- How do I account for the complex sample design when weighting in Stata, SPSS or SAS?
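As a rough illustration of what accounting for the complex sample design means, the sketch below computes a weighted mean and a design-based standard error with the standard with-replacement (Taylor linearization) estimator, in which variation is measured between clusters within each stratum. The post does not say which estimator the videos use, so treat this as an assumption. The function name and the toy rows are hypothetical, and real analyses should rely on the survey procedures of Stata, SPSS or SAS.

```python
from collections import defaultdict
from math import sqrt
from statistics import stdev

def design_based_mean_se(rows):
    """rows: list of (stratum, psu, weight, y). Returns (mean, se)."""
    total_w = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / total_w

    # Sum the linearized contribution w * (y - mean) / total_w within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    variance = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        centre = sum(psus.values()) / n_h
        # Between-PSU variability within the stratum.
        variance += n_h / (n_h - 1) * sum((z - centre) ** 2 for z in psus.values())
    return mean, sqrt(variance)

# Toy data: one stratum, two clusters whose members all share the same outcome.
rows = [
    ("s1", "A", 1.0, 1), ("s1", "A", 1.0, 1),
    ("s1", "B", 1.0, 0), ("s1", "B", 1.0, 0),
]
mean, se = design_based_mean_se(rows)
naive_se = stdev([y for *_, y in rows]) / sqrt(len(rows))  # ignores clustering

print("mean:", mean)
print("design-based se:", se)
print("naive se:", naive_se)
```

In this toy example the outcomes are identical within each cluster, so the design-based standard error is larger than the naive one. This is the typical reason for declaring clusters and strata instead of applying weights alone.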

If you have more questions, visit the user forum!

**What did you learn from the sampling and weighting videos? What would you like to explore further? Comment below!**

Dear

I am still confused about how to weight a DHS sample. I also had poor internet access in Ethiopia, so I could not follow the examples in the videos. How could you help me weight the 2011 Ethiopia DHS?

Hello Seman, the DHS user forum (http://userforum.dhsprogram.com/) is a great way to ask questions and receive feedback from the broader community and DHS Program staff. We also have the DHS sampling manual (http://dhsprogram.com/publications/publication-dhsm4-dhs-questionnaires-and-manuals.cfm) as an additional resource. Hope this helps!

I wish to calculate the mean height-for-age z-score (HAZ) and its standard error, taking the sampling design into account. I want to estimate the results for small areas such as sub-districts and districts. I am using 2011 BDHS data and trying to calculate the mean HAZ and the mean of (HAZ < -2.00 SD) by sub-district.

I am using the Stata commands below. We get reasonable results by district but unrealistic results by sub-district, particularly for small sub-districts. For some sub-districts, the mean HAZ becomes zero with a zero standard error, and the same is observed for HAZ= 601.

gen HW70n= HW70/100.

g COSUBDIST= CODIST*100+ COTHANA

svyset [pw= V005_rewtd], psu (V001) strata (V023)

univar HW70n, by (COSUBDIST)

tabstat HW70n , by(COSUBDIST) stat(n, mean semean)

// HAZ < -2.0

g HW_2=0.

replace HW_2=1 if HW70n <= -2.

tabstat HW_2 , by(COSUBDIST) stat(n, mean semean)

Results are like these:

| COSUBDIST | N | mean | se(mean) | sd |
|---|---|---|---|---|
| 108 | 18 | .0555556 | .0555556 | .2357023 |
| 114 | 15 | .0666667 | .0666667 | .2581989 |
| 134 | 14 | .0714286 | .0714286 | .2672612 |
| 156 | 18 | .1666667 | .0903877 | .3834825 |
| 160 | 17 | .1176471 | .0805474 | .3321056 |
| 177 | 18 | .0555556 | .0555556 | .2357023 |
| 373 | 16 | .125 | .0853913 | .341565 |
| 409 | 25 | .16 | .0748331 | .3741657 |
| 428 | 32 | .03125 | .03125 | .1767767 |
| 447 | 8 | 0 | 0 | 0 |
| 485 | 11 | .0909091 | .0909091 | .3015113 |
| 602 | 8 | 0 | 0 | 0 |
| 603 | 11 | 0 | 0 | 0 |
| 607 | 21 | .1904762 | .0878052 | .4023739 |
| 610 | 23 | .0434783 | .0434783 | .2085144 |
| 632 | 12 | .0833333 | .0833333 | .2886751 |
| 636 | 15 | .1333333 | .0908514 | .3518658 |
| 651 | 107 | .0747664 | .0255462 | .2642517 |
| 662 | 19 | .1578947 | .085947 | .3746343 |

Can you explain why I get these zero results? Can I do this kind of spatial analysis in this way?

Regards,

Sumon

Great question! I recommend posting this in our user forum. The forum is regularly monitored to ensure all questions are answered by our knowledgeable staff.
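One possible reading of the zeros in the table above, offered as a sketch and not as an official answer: `tabstat` does not use the `svyset` design and was called without weights, so it reports plain unweighted statistics for each sub-district. A sub-district where none of the sampled children has the indicator set then shows a mean, standard error and standard deviation of exactly zero. Where exactly one child has it set, the standard error equals the mean. The Python snippet below reproduces two rows of the table under that assumption. The lists of 0/1 flags are reconstructed from the reported N and mean, not taken from the BDHS file.

```python
from math import sqrt
from statistics import mean, stdev

def unweighted_summary(flags):
    """Return (n, mean, se(mean), sd) the way an unweighted summary would."""
    n = len(flags)
    sd = stdev(flags)              # sample standard deviation (n - 1 denominator)
    return n, mean(flags), sd / sqrt(n), sd

# Assumed patterns behind two rows of the table.
one_of_18 = [1] + [0] * 17         # sub-district 108: N = 18, one child flagged
none_of_8 = [0] * 8                # sub-district 447: N = 8, no child flagged

for label, flags in [("108", one_of_18), ("447", none_of_8)]:
    n, m, se, sd = unweighted_summary(flags)
    print(label, n, round(m, 7), round(se, 7), round(sd, 7))
```

With so few children per sub-district, a design-based command such as `svy: mean` with an `over()` option would still give unstable estimates, so the zeros mostly reflect small sample sizes, not a coding error.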

Hello DHS Program!

How should non-response be handled when an entire cluster is dropped (for instance, due to security problems, inaccessibility or bad data)? Should that cluster be included in the household response rate? Why, and how?

Why is this not discussed on the internet?

Hello!

This question is not in reference to DHS but I’m hoping to get help from someone through this forum.

How can we use weights on a data set that is originally representative at provincial level and make it representative at district level?

Since the variables I’m using are at a district level, using a provincially representative data set will result in a sampling bias. So I’m trying to somehow weight the data in a way that it becomes representative at the district level.

It would be of immense help if I could get a response to my query.

This is a great question for The DHS Program User Forum. Visit the “Weighting Data” thread to post your question or look for help. https://userforum.dhsprogram.com/index.php?t=thread&frm_id=33&

Is it okay to apply svy in multi-country DHS data analysis?

This is a great question for The DHS Program User Forum. Visit the “Weighting Data” thread to post your question or look for help. https://userforum.dhsprogram.com/index.php?t=thread&frm_id=33&

Is it possible to get a different total (of in-migrants, for instance) depending on whether or not the weights are used? I am using the HH weights and the PR dataset to find the number of in-migrants moving from their place of birth to their place of residence. But with the weights, the total number of in-migrants is different from the total without the weights.
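A short sketch of why this happens, using made-up records rather than a real PR file: an unweighted total counts rows, while a weighted total adds up the weights of those rows, so the two differ whenever the in-migrants' weights do not average exactly one. The raw weights below follow the DHS convention of six implied decimal places, and all values and names are hypothetical.

```python
# Hypothetical household members: (raw_household_weight, is_in_migrant).
members = [
    (500_000, 1),
    (1_500_000, 0),
    (1_250_000, 1),
    (750_000, 0),
]

# Unweighted total: the number of rows flagged as in-migrants.
unweighted_total = sum(flag for _, flag in members)

# Weighted total: the sum of the rescaled weights of the flagged rows.
weighted_total = sum(raw / 1_000_000 for raw, flag in members if flag == 1)

print("unweighted in-migrants:", unweighted_total)
print("weighted in-migrants:  ", weighted_total)
```

Because DHS weights are relative weights and not population expansion factors, the weighted count is usually best reported as a share of the weighted sample, not as a number of people.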