Category Archives: Data

24 Aug

How Things Have Changed! Looking Back at Data Distribution Practices from 20 Years Ago

A lot can change in 20 years. For The DHS Program, it’s the difference between over 250 datasets for 70 separate surveys and more than 10,000 datasets from over 300 surveys. The contents of the model survey questionnaires changed radically, as did the media used for data distribution. And two decades ago, the internet had only recently emerged as a potential means of communication around the world!

It might be hard to imagine life without internet access today; we rely on the internet for many of our activities. In 1995, The DHS Program established a website which had the basics: an informational brochure, survey statuses, fact sheets, press releases, and newsletters.

Though the website has been updated several times since then, it still has these basic features. The crucial difference is that, back then, we provided only an archive of publications and data, along with information on how to place an order for them. Yes, users had to pay for the cost of media – which, at the time, included diskettes (AKA floppy disks), Bernoulli cartridges, and CD-ROMs – and shipping. At one point, we were deciding whether to charge for the data itself, to ensure the fullest use of the data.

That decision was part of a proposal from 20 years ago to disseminate the following over the internet:

  1. DHS data
  2. India NFHS data
  3. Report text
  4. Online newsletter (tentatively named ‘DHS Discoveries’)
  5. User forum

These look familiar, don’t they? Today, both reports and datasets are free and available over the internet for download (though we still require users to apply for access to datasets), we email our newsletter to subscribers (which includes news, new publications and datasets, and articles that have cited DHS data), and the User Forum has been live since February 2013.

The DHS Program has utilized the internet beyond what was proposed 20 years ago. To name only a few ways: the creation of STATcompiler, the development of eLearning courses for data visualization and social media for global health, and the use of social media to engage with our users. And if you want to know what is coming next, be sure to Follow or Like us on social media, or subscribe to our newsletter or even this very blog. You are just a few clicks away!

This blog post is based on the rediscovery of the paper prepared for the Population Association of America (PAA) meeting back in 1996. Go back in time and read the original paper here!

11 Jul

World Population Day 2017

How well do you know your population pyramids? Celebrate World Population Day with The DHS Program’s Guess the Population Pyramid Quiz!

See how you stack up against others and share your results below in the comment section, on Facebook, or Twitter! We are also having a live version of Guess the #PopPyramid on Twitter July 11 at 10AM EST.

Take the full-screen version of the quiz here.

Good luck!

14 Jun

An Age of Change: A More Precise Way to Measure Children’s Age in DHS Surveys

DHS-7 surveys are using a more precise method to calculate children’s age. The change, though far-reaching, has very little impact on interpretation and use of DHS data for program managers and policymakers. It does, however, have major implications for researchers doing secondary analysis of DHS data. If you are working with DHS datasets, a full description of the changes to the age-related variables is documented on The DHS Program website, and a brief summary is presented below.

Background:

For most of DHS history, interviewers have collected age data by asking the month and year of birth of the respondent, her age in years, month and year of marriage or age at marriage, and month and year of birth of each of her children as well as the age of living children. For children under 5 who are weighed and measured to assess nutritional status, day of birth was collected in the household questionnaire but was not connected with the birth history. Beginning with the DHS-7 questionnaires (most surveys with fieldwork in 2015/2016 and beyond), we asked the day of birth for all children listed in the birth history.

Why was day of birth added for children in DHS-7?

Adding day of birth permits calculating the age of children more accurately. Calculating age in months using just month and year of birth and month and year of interview meant that age in months could be off by one month in approximately half of all cases. For example, a child born February 2017 was considered a 3-month-old in May 2017. However, if the birth took place on February 25, 2017, and the interview was May 3, 2017, then the child is actually only two completed months old. Thus, if the day of birth is greater than the day of interview (roughly half of all cases), then the age would be over-estimated by one month.
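The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the calendar logic described above, not the code used to produce DHS datasets, and the function names are hypothetical.

```python
from datetime import date


def age_months_month_year_only(birth: date, interview: date) -> int:
    """Older approach: compare month and year only, ignoring the day."""
    return (interview.year - birth.year) * 12 + (interview.month - birth.month)


def age_months_completed(birth: date, interview: date) -> int:
    """Day-aware approach: count only completed months."""
    months = age_months_month_year_only(birth, interview)
    # If the day of birth falls after the day of interview,
    # the latest month has not yet been completed.
    if birth.day > interview.day:
        months -= 1
    return months


birth = date(2017, 2, 25)
interview = date(2017, 5, 3)
print(age_months_month_year_only(birth, interview))  # 3
print(age_months_completed(birth, interview))        # 2
```

The two functions disagree only when the day of birth is later in the month than the day of interview, which is the "roughly half of all cases" mentioned above.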

Why make the change now?

Historically, DHS surveys have not collected the day of birth of all children because the quality of reporting of dates of birth and ages was simply not reliable enough, especially for older children or those who have died. The quality of date and age reporting for children has improved over time and now appears to be sufficiently reliable for use throughout the survey data.

How is the age calculation different in DHS-7?

Previously, child’s age was calculated by subtracting the month and year of birth from the month and year of interview to give age in months. In DHS-7, we introduced the calculation of age taking into account the day of birth and the day of interview. To do this, we introduced a new concept – the century day code (CDC).  DHS datasets now contain several new variables related to the century day codes.

For more details on the definition of the CDC and a list of the new variables, a complete description of changes made to existing age variables (e.g. age of child in years, age of child in months, and birth intervals), and programming notes for STATA and SPSS users, visit The DHS Program website.
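As a rough illustration of the idea, the sketch below treats a century day code as a running count of days, so that an age is simply the difference between two codes. The starting point of the count (January 1, 1900 as day 1) and the average month length of 30.4375 days are assumptions made for this example, and the function names are hypothetical. The guidance on The DHS Program website remains the authoritative definition.

```python
from datetime import date

# Assumption for this sketch: day 1 of the count is January 1, 1900.
CDC_ORIGIN = date(1899, 12, 31)
# Assumption: average length of a month in days (365.25 / 12).
DAYS_PER_MONTH = 30.4375


def century_day_code(d: date) -> int:
    """Number of days elapsed since the assumed origin."""
    return (d - CDC_ORIGIN).days


def age_in_months(birth: date, interview: date) -> int:
    """Completed months between two dates, via their day codes."""
    days = century_day_code(interview) - century_day_code(birth)
    return int(days / DAYS_PER_MONTH)


print(century_day_code(date(1900, 1, 1)))                    # 1
print(age_in_months(date(2017, 2, 25), date(2017, 5, 3)))    # 2
```

Working in days means one subtraction handles differing month lengths and leap years, instead of separate comparisons of year, month, and day.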

How do these changes affect analysis?

In surveys that introduced the day of birth of the child, changes have been made in the analysis of the data in two main ways:

  1. The restrictions on the denominator for tables now all use the age variables based on the calculation to the day, rather than to the month as was previously done.
  2. All background age group variables used in analysis are now based on the revised ages. Previously, on average, because the calculation method only considered month and year and not day of birth, the age group of 0 months would have roughly half the number of cases of age group 1 month or other older single month age groups. With the new method, age group 0 months will have a roughly similar number of cases as other single month age groups.
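The halving of the 0-month group under the old method can be checked by brute force. The sketch below assumes, purely for illustration, one birth on every calendar day from June 1, 2016 onward and one interview on every day of June 2017, then tallies ages in months under both methods. The data and function names are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def age_in_months(birth: date, interview: date, use_day: bool) -> int:
    """Age in months; with use_day=True only completed months are counted."""
    months = (interview.year - birth.year) * 12 + (interview.month - birth.month)
    if use_day and birth.day > interview.day:
        months -= 1
    return months


old_counts, new_counts = Counter(), Counter()
for day in range(1, 31):  # one interview on each day of June 2017
    interview = date(2017, 6, day)
    birth = date(2016, 6, 1)
    while birth <= interview:  # one birth on every day up to the interview
        old_counts[age_in_months(birth, interview, use_day=False)] += 1
        new_counts[age_in_months(birth, interview, use_day=True)] += 1
        birth += timedelta(days=1)

print(old_counts[0], old_counts[1])  # 465 930
print(new_counts[0], new_counts[1])  # 930 900
```

Under the month-and-year method, the 0-month group only covers births in the interview month up to the interview day, so it holds half as many cases as the 1-month group. Under the day-based method, both groups cover roughly a full month of births.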

These changes affect virtually all tables related to children, particularly to children under 5.

It is important to note that fertility rate and childhood mortality rate tables are not impacted as these tables exclude the month of interview from calculations and effectively use complete months in the calculations.

More precise calculation results in a shift in age

The diagrams below show the age of the child calculated using the old and new methods, given a particular month of interview and month of birth, giving examples here for interviews in January to June 2017, and births in December 2015 to June 2017. For any birth taking place on a day in the month on or before the day of interview there is no change in the calculation, but for any birth taking place on a day in the month after the day of interview the age of the child is now calculated as 1 month less than previously. For example, a child born in late April 2017 and included in an interview in early June 2017 (equivalent to a point in the bottom right corner of box “2” in the first row below, marked with a red star) was calculated as 2 months using the old method, but looking at the equivalent position in the second example, this child is calculated as age 1 month in the new calculation method.

Old age calculation method example:

New age calculation method example:

This shift in age in months affects roughly half of all children, but only has an effect on age in years for roughly 1/24 of children – those previously classified as 12 months old, but now classified as 11 months old, and similarly around ages 24 months, 36 months, etc.
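The 1/24 figure follows from a short back-of-the-envelope calculation. Age in years changes only when the old age in months was an exact multiple of 12 (the birth month matches the interview month) and the day of birth falls after the day of interview. Assuming, for illustration, that births are spread evenly across calendar months and across days of the month:

```latex
P(\text{age in years changes})
  \approx P(\text{birth month} = \text{interview month})
  \times P(\text{day of birth} > \text{day of interview})
  \approx \frac{1}{12} \times \frac{1}{2}
  = \frac{1}{24}
```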

While these changes are unlikely to have a major impact on the interpretation of trends, they do mark a significant shift towards a more precise, accurate measure of children’s age. Dataset users striving to replicate DHS tabulations need to adjust their logic to match DHS results using some of the new or modified variables to capture the more accurate measure of child age.

Download the full PDF here.

Questions?  After reviewing the full guidance document, please visit the DHS User Forum and post additional questions there for discussion.

© 2012 Xinshu She/Boston Children’s Hospital Global Pediatrics Fellow, Courtesy of Photoshare

15 May

Everything You Need to Know about DHS Data and More

So, you’re new to DHS. You’ve registered as a DHS data user and downloaded the freely available datasets, but now what? We have the perfect resources to get you started.

The following videos provide an overview of DHS data answering key questions such as, what is a data file or dataset? What is the difference between De Jure and De Facto? What types of data files are available for download?

We start with the Introduction to DHS Datasets. This video provides a guide to units of analysis, basic terminology, and DHS data files.

As mentioned in the video above, separate data files are created for different units of analysis. DHS Dataset Types in 60 Seconds runs through the most common data files and what they contain.

De Jure and De Facto are terms that you will see often within DHS reports and datasets. The following video breaks down what the terms mean, and how they apply to analyzing DHS data.

And finally, where is the information about interviewed households and individuals located in different data files? The Introduction to DHS Data Structure examines DHS datasets in a hierarchical structure.

We will have more videos released this summer, but for those who are still eager to learn more about DHS data, check out DHS Dataset Names Explained below.

 

11 Apr

New Data Available from DHS-7 Questionnaire: Maternal and Pregnancy-Related Mortality

Baby Kabuche, 30 yrs old, 4 months pregnant, outside her house. Baby has 2 children: Eric, 12, living with grandparents in Musoma and Judith, 6, living with her and her husband. She works in a factory manufacturing aluminium pots and iron rods. But as she became pregnant she took some unpaid leave as the factory uses acid and other toxic materials and she cares for the safety and health of her baby. Baby got malaria only once as she sleeps under a mosquito net all the time. This new one makes her happy as it is treated with mosquito repellent and it is more effective.

© 2016 Riccardo Gangale/VectorWorks, Courtesy of Photoshare

In 2014, The DHS Program began the process of updating the standard DHS questionnaires. With input from stakeholders, feedback from in-country implementing agencies, and a host of lessons learned from the previous 5-year program, we added, modified, and, in some cases, deleted questions. For many indicators, the actual questionnaire did not require an adjustment, but the calculation of indicators or the tabulation of the data needed an update to reflect new international indicators and best practices.

While questionnaire revision started in 2014, it can take a long time to see this exercise bear fruit. The 2015-16 Malawi DHS, for example, went into the field with the DHS-7 updated questionnaires in October 2015. The final report and dataset for the 2015-16 Malawi DHS were released in March 2017, allowing us to explore the new data for the first time.

In this blog series, New Data Available from DHS-7 Questionnaire, we will be detailing, topic by topic, some of the key changes to the questionnaire, with a focus on why the changes were made, how the changes affect the tabulations, and some guidance on how the resulting data should be interpreted.

Part 1:  Maternal and Pregnancy-Related Mortality

DHS surveys now collect data to provide the maternal mortality ratio in line with the definition provided by WHO. For almost 30 years, The DHS Program has collected data on maternal mortality in a subset of countries. In previous DHS cycles, maternal mortality was defined as any death to a woman while pregnant, during childbirth, or within two months of delivery. The WHO definition of maternal mortality is more precise:  any death to a woman during pregnancy, childbirth, or within 42 days of delivery but not from accidental or incidental causes (see full WHO definition here). The new DHS-7 questionnaire allows us to calculate the maternal mortality ratio (MMR) in closer alignment with this more precise WHO definition.

As always, women interviewed in the DHS are asked to list their siblings. The interviewer then collects information about the siblings’ survival status. In the case of female siblings who have died at age 12 or older, the interviewer inquires whether or not the sister died during pregnancy, childbirth, or within the 2 months following delivery. If the sister died within 2 months after childbirth, the interviewer asks how many days after childbirth the sister died. This clarification on the number of days is a new addition to the DHS-7 questionnaire. The interviewer then asks additional questions to determine if the death was accidental or due to violence. In DHS-7 these deaths are excluded from the calculation of the MMR per the WHO definition.

Why?  These changes were made to improve the precision of the MMR, as well as to align the DHS estimation of the MMR with the standard definition provided by the WHO.

Implications:  While the newly added questions allow for a more precise and up-to-date measure of maternal mortality, the change does present challenges for interpretation. DHS has reported on maternal mortality for 30 years, but estimates obtained using the new definition of maternal mortality cannot be directly compared to the old definition of maternal mortality which included deaths up to 2 months after delivery and did not exclude deaths due to accidents and violence.

And yet, one of the main objectives for conducting DHS surveys is to provide trend data. Fortunately, the old definition of maternal mortality can still be applied to calculate the mortality ratio estimate comparable to estimates from previously collected mortality data. This less precise measure of mortality is referred to as the pregnancy-related mortality ratio (PRMR).

DHS reports that include the maternal mortality module will now contain both the maternal mortality ratio and the pregnancy-related mortality ratio. The maternal mortality ratio will be used as the primary point estimate, but the pregnancy-related mortality ratio will be shown in an additional table and in figures to illustrate the trend. Keep in mind that the new measure of maternal mortality will, by definition, result in a lower maternal mortality ratio than the old measure. The new definition excludes two groups of deaths that the old definition counted: accidental and violence-related deaths to women during the maternal period, and deaths occurring between 42 days and 2 months after childbirth.

Summary of Maternal Mortality and Pregnancy-related Mortality:

| Measure | Definition | Precision | Comparability |
| --- | --- | --- | --- |
| Maternal Mortality Ratio | The number of maternal deaths (deaths to any woman during pregnancy, childbirth, or within 42 days of delivery, excluding accidents and acts of violence) per 100,000 live births | More precise | Not comparable to surveys before DHS-7 |
| Pregnancy-related Mortality Ratio | The number of pregnancy-related deaths (deaths to a woman during pregnancy or delivery or within 2 months of the termination of a pregnancy, from any cause, including accidents or violence) per 100,000 live births | Less precise | Comparable to previous surveys; shown to allow for trend interpretation |
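The difference between the two definitions can be sketched as a pair of classification rules. The record layout, field names, and sample deaths below are hypothetical, and treating "within 42 days" as 42 days or fewer is an assumption. Actual DHS estimation from sibling histories is more involved than this simple count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SisterDeath:
    """Hypothetical record of one reported death of a sister."""
    during_pregnancy: bool
    during_childbirth: bool
    within_two_months_of_delivery: bool
    days_after_delivery: Optional[int]  # None if not reported or not applicable
    accident_or_violence: bool


def is_pregnancy_related(d: SisterDeath) -> bool:
    """Old definition: any cause, up to 2 months after delivery."""
    return d.during_pregnancy or d.during_childbirth or d.within_two_months_of_delivery


def is_maternal(d: SisterDeath) -> bool:
    """New definition: excludes accidents/violence and deaths after 42 days."""
    if d.accident_or_violence:
        return False
    if d.during_pregnancy or d.during_childbirth:
        return True
    return (d.within_two_months_of_delivery
            and d.days_after_delivery is not None
            and d.days_after_delivery <= 42)  # assumption: 42 days inclusive


def per_100k_live_births(deaths: int, live_births: int) -> float:
    return deaths * 100_000 / live_births


deaths = [
    SisterDeath(True, False, False, None, False),   # died while pregnant
    SisterDeath(False, False, True, 50, False),     # 50 days after delivery
    SisterDeath(True, False, False, None, True),    # accident while pregnant
    SisterDeath(False, False, True, 10, False),     # 10 days after delivery
]
pregnancy_related = sum(is_pregnancy_related(d) for d in deaths)
maternal = sum(is_maternal(d) for d in deaths)
print(pregnancy_related, maternal)  # 4 2
```

Every death counted as maternal is also counted as pregnancy-related, so for the same live-birth denominator the maternal ratio can never exceed the pregnancy-related one.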

The DHS-7 questionnaire includes additional prompts to fully capture more siblings and siblings’ deaths. In previous DHS questionnaires, women were asked to list their siblings in order and then were asked follow-up questions about their survival status. In the DHS-7 adult mortality module, respondents are asked to list their siblings without worrying about their order but are then asked a list of probing questions to ensure that all siblings have actually been recorded. This change is likely to produce a more complete list of siblings for which information on adult and maternal mortality is collected. Once a complete list is produced, the siblings are put in order, and the questions on their survival status, age or age at death, and years since death, as well as the maternal mortality related questions, are asked as applicable.

Why?  Several studies have suggested that respondents’ lists of siblings are not always complete. This often happens when the sibling is a half-brother or sister, when the sibling did not live with the respondent as a child, or when the sibling has died. A pre-test in Ghana indicated that the addition of these probing questions resulted in capturing additional siblings for about 10% of women.

Implications:  Omissions in the sibling history can affect the adult and maternal mortality ratios in different ways. The inclusion of more siblings tends to increase the adult mortality rate. This is because often the siblings who were previously omitted were not spontaneously mentioned because they have already died. However, studies suggest that these deaths are not disproportionately maternal deaths, so a more complete sibling listing might result in a lower maternal mortality ratio.

Key Take-Aways

The changes described above may sound confusing for non-demographers.  The major points to remember for DHS data users include:

  • The new Maternal Mortality Ratio is not comparable with previous measures of maternal mortality in DHS surveys
  • For trends, look at Pregnancy-related Mortality Ratio
  • Despite the different names, both measures include deaths during pregnancy. The MMR is a more precise measure as it excludes some of the deaths during pregnancy that were not related to pregnancy (i.e. accidents and acts of violence).
  • Maternal mortality is still a relatively rare event, and therefore both MMR and PRMR have wide confidence intervals. Both measures are always presented with their confidence interval so that the user can draw their own conclusions about the relative certainty of the point estimate.

21 Mar

7 Tips to Matching DHS Final Report Tables

Feeling frustrated because you can’t match DHS Final Report tables in your statistical software?

Our new four-part video series covers the top 7 tips and tricks for matching The DHS Program Final Report tables using a statistical software program.

The videos will guide you through the following questions:

  1. Are you using the correct data file?
  2. Are you using the correct denominator of cases?
  3. Are you using the correct variable(s)?
  4. Are you properly recoding?
  5. Are you applying the correct weights?
  6. Are you selecting the correct software-specific code?
  7. Are you properly coding the tabulation commands in your statistical program?
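As a rough illustration of why questions 2 and 5 matter, the sketch below computes a percentage from a handful of made-up records, restricting the denominator first and then applying sample weights. The field names, the data, and the assumption that weights are stored as integers scaled by 1,000,000 are all hypothetical choices for this example; check the documentation for the dataset you are using.

```python
# Hypothetical records: one row per child.
records = [
    {"age_months": 14, "vaccinated": 1, "weight_raw": 1_500_000},
    {"age_months": 20, "vaccinated": 0, "weight_raw": 500_000},
    {"age_months": 30, "vaccinated": 1, "weight_raw": 1_000_000},
    {"age_months": 13, "vaccinated": 0, "weight_raw": 1_000_000},
]


def weighted_percent(rows, outcome, in_denominator, weight_scale=1_000_000):
    """Weighted percentage of `outcome` among rows that pass the denominator test."""
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        if not in_denominator(row):
            continue  # outside the table's denominator
        weight = row["weight_raw"] / weight_scale  # assumption: stored scaled
        denominator += weight
        numerator += weight * row[outcome]
    return 100 * numerator / denominator


# Denominator: children age 12-23 months only.
result = weighted_percent(records, "vaccinated", lambda r: 12 <= r["age_months"] <= 23)
print(result)  # 50.0
```

With these made-up rows, dropping the weights would give 1 of 3 children (about 33 percent), and dropping the age restriction would change the denominator again. Either slip is enough to stop a result from matching a published table.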

Watch the four videos in the series below on Matching DHS Final Report tables to get all the details on the top 7 tips and tricks.


Additional help can be found on our website and the User Forum.

10 Feb

Where Statistics are Beautiful

Hans Rosling created a world where “statistics are beautiful” and data are entertaining. The staff at The DHS Program have always believed these things to be true but found it difficult to convince the masses. And then came Gapminder and the juggernaut of Hans Rosling’s charismatic, informative, and perspective-changing data presentations.

The DHS Program was heartbroken to learn of Hans Rosling’s death earlier this week. DHS has enjoyed a long and enthusiastic relationship with Dr. Rosling. In 2009, The DHS Program and USAID had the honor of welcoming Dr. Rosling as our keynote speaker at the DHS 25th anniversary celebration in Washington, DC. What is particularly striking in watching the video again after 8 years is the laughter. Before Hans Rosling, no one would have believed that a data presentation could be so engaging and witty while being so insightful.

In addition to being entertaining and informative, Dr. Rosling was exceptionally modest and gracious. He came to the DHS 25th anniversary event at his own cost, and credited USAID and DHS data with his own success. He thanked USAID and the US taxpayers saying, “Nothing in my career would have been possible without DHS data.”

But really we, at The DHS Program, owe Hans Rosling a tremendous debt of gratitude. Dr. Rosling was a great advocate not just for DHS data, but for all data. He understood, better than anyone else, that data are worthless unless they are used. And he succeeded in doing what many of us have attempted and failed:  he made data come alive.  He used the data to expose the many incorrect notions about development that even people working in the field have, and he did it with such unique charm and flair. His presentations inspired people to think in different ways and to take action.

To Hans Rosling’s family, we thank you for sharing Hans with the world, and for so willingly joining his mission to “edutain” us. All of us at The DHS Program mourn the loss of this warm, generous visionary. This week, more than ever, we commit to continue the work that Hans has started, and will be inspired by Hans Rosling’s leadership and ingenuity as we look for new ways to provide the world with actionable, understandable data.

08 Feb

Update: Downloadable Citations for DHS Final Survey Reports Now Available

Is this how you look when you’re compiling your references?
A recent DHS comparative report included references to 52 Demographic and Health Surveys.  You could spend hours entering bibliographic information, or you can download the citations directly into your reference software.

In 2015, The DHS Program announced the availability of downloadable citations for all DHS analytical reports.

And now, in 2017, we are pleased to announce that the reference information for ALL (more than 300 of them!) DHS, SPA, MIS, and AIS final survey reports is also available for download. As with the previous release, citation information can be downloaded in two ways:

  • Individually on each publication page, or
  • As part of a full library of DHS final survey reports

We’ve also provided some additional information on our recommended citation style, and how to achieve it in the various reference management software. Read more about downloadable citations and citation styles on our website.

25 Jan

A New DHS Questionnaire: Interviewing Fieldworkers

There’s a new survey in town. But it’s probably not what you expect. For 30 years, The DHS Program has trained thousands of fieldworkers to conduct over 300 surveys – but who are these fieldworkers? It is well documented that interviewers affect the quality of the data being collected, for example, in the areas of response rates and response validity. So what interviewer characteristics lead to the best data quality? Have fieldworkers worked on a DHS survey before? Are the fieldworkers similar to the respondents they are interviewing? Until now, answers to these and other questions have not been quantified.

© Blake Zachary, ICF

In 2014, The DHS Program piloted a fieldworker survey in Cambodia. Data were collected from all 114 fieldworkers. We collected information on their age, sex, marital status, religion, educational level, experience with other surveys, and languages spoken. Taken on their own, the survey results may not be all that interesting. About three-quarters of the fieldworkers had been educated beyond secondary school, almost half had been involved in a previous DHS survey, and about one-third had no children. But when these survey results are compared with DHS response rates and results, they may help to explain certain patterns.

Take, for example, the question of child mortality. Our new DHS fieldworker questionnaire asks if an interviewer has had a child who died. Is this interviewer more likely to collect accurate data on infant and child mortality? Or might she try to avoid the topic?

While all interviewers undergo intense training on the DHS questionnaires, the rapport between interviewer and interviewee is integral to data quality. Will survey respondents be more likely to refuse participation in the survey if the interviewer appears to be better educated or too young? Are unmarried interviewers sufficiently comfortable asking questions about sexual practices, family planning, and childbirth? Are experienced interviewers better interviewers or are they too jaded to do a good job?

The pilot study in Cambodia proved that collecting information from interviewers was both feasible and potentially informative. Starting with the 2015 Zimbabwe DHS, the fieldworker questionnaire has been a standard part of the survey, and the dataset is released along with the traditional DHS survey dataset.

The potential research questions are endless. And now, with the first public release of the fieldworker survey dataset as part of the 2015 Zimbabwe DHS, analysts will be able to explore these data themselves.

11 Jan

Measuring the SDGs: The Role of Household Surveys

The Sustainable Development Goals (SDGs) have replaced the Millennium Development Goals with broad and lofty aspirations ranging from health, education, and gender equality to clean energy and responsible consumption.

Behind each Sustainable Development Goal is a series of targets, and each target can be measured by one or more indicators. Many of the targets in the areas of good health, zero hunger, no poverty, quality education, gender equality, clean water and sanitation, and reduced inequalities can be measured directly from DHS surveys. In fact, in many cases, this information has been collected as part of the DHS for decades, and indicator data already exist.

For example, the second SDG, “Zero Hunger,” is supported by 8 targets. One of these is: “By 2030, end all forms of malnutrition, including achieving, by 2025, the internationally agreed targets on stunting and wasting in children under 5 years of age, and address the nutritional needs of adolescent girls, pregnant and lactating women and older persons” (Target 2.2).

Target 2.2 of SDGs

This is where DHS comes in. DHS surveys have measured the height and weight of children under 5 since the 1980s. These measurements are compared to international reference standards to calculate stunting and wasting.

Trends in Stunting in South Asia

As DHS data in the STATcompiler show, 4 countries in South Asia have made progress in reducing stunting since the 1990s, but stunting in this region is still unacceptably high. Future surveys will assess whether or not they can achieve a 40% reduction (the international target) by 2025.

Similarly, the SDG for Good Health and Well Being includes a target on reducing childhood mortality: “By 2030, end preventable deaths of newborns and children under 5 years of age, with all countries aiming to reduce neonatal mortality to at least as low as 12 per 1,000 live births and under-5 mortality to at least as low as 25 per 1,000 live births” (Target 3.2).

Childhood mortality data have been collected as a standard part of DHS surveys since 1985. While neonatal and under-five mortality have declined in many DHS countries, the target of 25 under-five deaths for every 1,000 live births is still a long way off for many. In Tanzania, for example, under-five mortality has dropped steadily since 1999 but is not yet near the international target.

Under-five mortality in East Africa

Other SDG-supporting indicators currently collected in DHS surveys include access to safe water and improved toilet facilities, early marriage, family planning demand satisfied, antenatal care coverage, and birth registration. Others are not part of the DHS standard questionnaire but are often collected in optional modules, such as the maternal mortality ratio, female genital cutting, and violence against women.

In addition, new questions were added to the DHS questionnaire at the beginning of DHS-7 (2013-2018). The data resulting from these questions are starting to appear in DHS final reports and respond to SDG indicators such as clean cooking fuel, tobacco use, internet access, bank accounts, and mobile telephone ownership. A new DHS module on accidents and injuries will respond to the SDG indicator on road traffic accidents. A full list of the DHS-related SDG indicators can be found on the SDGs page of the DHS website.

But as always, collecting data is not enough. The DHS Program is also working to make the DHS-related SDGs easier to find, interpret, and use. This past year we released a video tutorial on the complicated “Demand for Family Planning Satisfied” indicator, and worked with partner Blue Raster to create an SDGs Story Map.

In the coming year, you will see a standard SDGs table for the final reports, addition of an SDGs tag to facilitate location of SDGs in the STATcompiler, and expansion of the SDGs page on our website.

Stay tuned as we develop these tools. And in the meantime, we’ll be out in the field, collecting the data the world needs to monitor progress towards sustainable development.

The information provided on this Web site is not official U.S. Government information and does not represent the views or positions of the U.S. Agency for International Development or the U.S. Government.

The DHS Program, ICF
530 Gaither Road, Suite 500, Rockville, MD 20850
Tel: +1 (301) 407-6500 • Fax: +1 (301) 407-6501
dhsprogram.com