16 Sep

Sampling and Weighting with DHS Data

At long last, The DHS Program has released two videos which demonstrate how to weight DHS data, concluding the Sampling and Weighting video series.

2012 Tajikistan DHS

The first two videos in the series, Introduction to DHS Sampling Procedures and Introduction of Principles of DHS Sampling Weights, explained the basic concepts of sampling and weighting in The DHS Program surveys, using the 2012 Tajikistan DHS as an example. Read our introductory blog post for more details.

In contrast, the third and fourth videos use an Example Practice Dataset, so viewers can practice weighting DHS data and replicate what is being shown in the videos while they are watching. The Example Practice Dataset was specifically created for DHS data users to have hands-on practice using DHS data in different statistical packages (Stata, SPSS and SAS) and does not represent the data of any actual country.

The third video, How to Weight DHS Data in Stata, explains which weight to use based on the unit of analysis, describes the steps of weighting DHS data in Stata and demonstrates both ways to weight DHS data in Stata (simple weighting and weighting that accounts for the complex survey design).
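For readers who want a quick reference, the two approaches can be sketched roughly as follows. This is only an illustration and not necessarily the exact set of commands shown in the video. It assumes a women's recode file with the standard DHS variable names v005 (women's sample weight), v021 (primary sampling unit) and v022 (sample strata); the strata variable differs in some surveys, and v313 simply stands in for whatever categorical outcome you are tabulating.

```stata
* Minimal sketch under the assumptions stated above; adapt the variable
* names to your own file before running.

* Step 1: DHS sample weights are stored with six implied decimal places,
* so divide by 1,000,000 before use.
gen wgt = v005/1000000

* Step 2a: simple weighting. This gives weighted point estimates only.
tab v313 [iweight = wgt]

* Step 2b: weighting that accounts for the complex survey design.
* Declare the PSU, weight and strata once, then prefix estimation
* commands with svy: so that standard errors reflect the design.
svyset v021 [pweight = wgt], strata(v022)
svy: tab v313
```

The weight variable depends on the unit of analysis: for example, hv005 is the household weight and mv005 is the men's weight, used in place of v005 when analyzing household or men's files.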


The fourth video, Demonstration on How to Weight DHS Data in SPSS and SAS, is the same as the third video, except it uses the statistical software packages SPSS and SAS instead of Stata.

After watching these videos, you will be able to answer the following questions:

  • Which weights should I use for my analysis?
  • What are the steps of weighting data in a statistical software package?
  • How do I weight DHS data in Stata, SPSS or SAS?
  • How do I account for the complex sample design when weighting in Stata, SPSS or SAS?

If you have more questions, visit the user forum!

What did you learn from the sampling and weighting videos? What would you like to explore further? Comment below!

Written by Mahmoud Elkasabi

Dr. Elkasabi is a Sampling Statistician at The DHS Program. He joined The DHS Program in 2013 after earning his Ph.D. in Survey Methodology from the University of Michigan at Ann Arbor, with a specialty in Survey Statistics and Sampling. Dr. Elkasabi is responsible for the sampling design for the DHS surveys as well as building sampling capacity in many countries, such as Ghana, Egypt, Nigeria, India, Malawi, Zambia, Bangladesh, and Afghanistan. Dr. Elkasabi likes to work closely with the sampling statisticians in different countries. In these win-win relationships, he shares his knowledge of sampling and gains new knowledge and experience.

4 thoughts on “Sampling and Weighting with DHS Data”

  1. Dear
    I am still confused about how to weight a DHS sample. I also have poor internet access in Ethiopia, so I could not follow the examples in the videos. How can you help me weight the 2011 Ethiopia DHS?

  2. I wish to calculate the mean height-for-age z-score (HAZ) and its standard error, taking the sampling design into account. I want estimates for small areas such as districts and sub-districts. I am using the 2011 BDHS data and trying to calculate the mean HAZ and the mean of the indicator (HAZ < -2.00 SD) by sub-district.
    I am using the Stata commands below. We get reasonable results for districts but unrealistic results for sub-districts, particularly small ones. For some sub-districts, the mean HAZ becomes zero with zero standard error, and the same is observed for HAZ= 601.

    gen HW70n= HW70/100.


    svyset [pw= V005_rewtd], psu (V001) strata (V023)

    univar HW70n, by (COSUBDIST)

    tabstat HW70n , by(COSUBDIST) stat(n, mean semean)

    * HAZ < -2.0

    g HW_2=0.
    replace HW_2=1 if HW70n <= -2.
    tabstat HW_2 , by(COSUBDIST) stat(n, mean semean)

    The results look like this:

    COSUBDIST |    N       mean   se(mean)         sd
          108 |   18   .0555556   .0555556   .2357023
          114 |   15   .0666667   .0666667   .2581989
          134 |   14   .0714286   .0714286   .2672612
          156 |   18   .1666667   .0903877   .3834825
          160 |   17   .1176471   .0805474   .3321056
          177 |   18   .0555556   .0555556   .2357023
          373 |   16       .125   .0853913    .341565
          409 |   25        .16   .0748331   .3741657
          428 |   32     .03125     .03125   .1767767
          447 |    8          0          0          0
          485 |   11   .0909091   .0909091   .3015113
          602 |    8          0          0          0
          603 |   11          0          0          0
          607 |   21   .1904762   .0878052   .4023739
          610 |   23   .0434783   .0434783   .2085144
          632 |   12   .0833333   .0833333   .2886751
          636 |   15   .1333333   .0908514   .3518658
          651 |  107   .0747664   .0255462   .2642517
          662 |   19   .1578947    .085947   .3746343

    Can you explain why I get these zero results? Is it valid to do this kind of spatial analysis in this way?


The information provided on this Web site is not official U.S. Government information and does not represent the views or positions of the U.S. Agency for International Development or the U.S. Government.

The DHS Program, ICF
530 Gaither Road, Suite 500, Rockville, MD 20850
Tel: +1 (301) 407-6500 • Fax: +1 (301) 407-6501