
Section 4:  Analysis Example

(HLS Bangladesh)


 

 


 

 

The Bangladesh Livelihood Monitoring Survey

 

As part of an initiative by CARE/DFID, a large baseline survey was carried out in April 2001.  This gives an excellent example of such a survey, and is used here to lead through some key aspects of data handling and analysis.  As usual, you should be familiar with the analytical module for the principles applied; this module illustrates some of these for this particular type of survey, which measures many aspects of livelihood and food security.

 

The datasets are available and can be used for practice and exploration in connection with this module.  As yet they are not available for downloading from the web-based version of the PANDA, but we expect to achieve this soon.  They are included on the CD-based version.  If you are using the web-based version, please contact us to get the datasets.

 

The data were collected as part of the CARE Livelihood Monitoring Project.  This project, carried out by CARE Bangladesh, was established to allow future assessment of improvements in the livelihoods of Bangladeshis due to the implementation of a variety of development projects.  Two separate baseline surveys, a livelihood survey and a nutrition survey, were conducted on the same households.  The data from the livelihood survey were consolidated into one large SPSS database.  The data from the nutrition survey, on the other hand, were divided into 10 separate small SPSS files.

 

The relevant datasets here are as follows:

 

Household data from nutrition survey:  main21.sav

Files of anthropometric and related data:

  • for 12-49 yr women, z-scores:  Q261.sav
  • for 12-49 yr women, BMI:  Q26nut.sav
  • for 5-11 yr girls, z-scores:  Q27nut.sav
  • for 0-59 month children, z-scores:  Q28nut.sav

Household and nutritional data merged:  hhbasicmain21.sav

Livelihood survey:  NWBaseline.sav

Merged household, nutritional, and livelihood survey files:  mergeddataset.sav

 

The codebooks and questionnaires available are in the following files: 

 

Livelihood survey questionnaire:                        NW Baseline q’aire.doc

Household data plus nutritional survey:              Final Questionnaire.doc (Note: this contains the questions on anthropometry, entered in various files such as Q26nut.sav.)

 

The codebooks, giving the meaning of numerical codes in different variables, are included in the questionnaires.  Where the variables have been labeled in the SPSS datasets (.sav), the codes can also be seen in the ‘variable view’ display in SPSS.  [A print can be obtained by opening SPSS without a dataset loaded, choosing ‘display data info’ under ‘file’, and then specifying the dataset for which you want the data listed.]

 

 

 

Merging of SPSS files

 

In order to begin the analysis it was necessary to merge the data, since we need to link variables which are in different data files, for example to examine food security or nutritional variables by wealth status.  There were 10 SPSS files which contained the information from the nutrition baseline survey.  Once the data from the nutrition baseline had been consolidated into one file, it was then necessary to merge that file with the large file containing the data from the livelihood baseline survey.  Here we describe the process of merging.  If you want to practice, certain of the smaller files are attached, as xxxxxx.sav.   Otherwise, you can use for analysis the fully merged file (mergeddataset.sav) that resulted from these processes.

 

The most important aspect of merging files is to determine the correct ‘unique key’, or unique identification number (‘ID’), for every case.  This distinguishes each case within a dataset, allowing that case to be matched to the corresponding case in another dataset, which must have an identical unique ID.  It is crucial to be sure of the case definition in each file: this can be ‘child-level’, ‘household-level’, or sometimes higher with aggregated datasets (e.g. ‘district-level’).  This is the same as knowing the ‘level of analysis’ (see the discussion in the Analysis module, Ch 1, page 5, under ‘file structure’).  In the present example, we are going to merge into a file (call it file A) which has characteristics of household members, adding from a second file (file B) the anthropometric data on each individual.  Note that although file A has some household characteristics, and may often be called a household file, each case or row is a different individual.  Thus the identifier would be the household ID plus the household member number.  This means that some characteristics (say, household construction, or total number of members) are repeated for each individual in the household.  Think of it like this: suppose this is a section of a dataset (it could be a spreadsheet):

 

 

Cluster number   Household (hh) number   HH member number   Household roofing material   Total number hh members   Gender   Age (yrs)
12987            16                      3                  Thatch                       4                         M        25
12987            16                      4                  Thatch                       4                         F        16
12987            17                      1                  Tin                          5                         M        45
12987            17                      2                  Tin                          5                         F        32
12987            17                      3                  Tin                          5                         M        10
12987            17                      4                  Tin                          5                         M        3.5
12987            17                      5                  Tin                          5                         F        0.5
12987            18                      1                  Thatch                       3                         F        65

 

Note that the case is defined as an individual, so that household characteristics repeat for individuals within the household.  Note also that each case can be uniquely defined by the cluster number plus the household number plus the household member number.  In fact, a single unique ID number can be constructed by running these together, for example for the second case shown as 12987164.
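
The construction of a single ID from its parts can be sketched in a few lines of Python.  This is only an illustration, not part of the original survey processing: the function names and the fixed field widths used for padding are assumptions.  The second function pads the household and member numbers with zeros, because simply running the numbers together can give two different people the same ID.

```python
def simple_id(cluster, household, member):
    """Run the three numbers together, as described in the text."""
    return f"{cluster}{household}{member}"


def padded_id(cluster, household, member, hh_width=3, mem_width=2):
    """Same idea, but zero-padded to fixed widths (assumed here)."""
    return f"{cluster}{household:0{hh_width}d}{member:0{mem_width}d}"


print(simple_id(12987, 16, 4))   # 12987164
print(padded_id(12987, 16, 4))   # 1298701604

# Without padding, household 1 / member 64 collides with household 16 / member 4.
print(simple_id(12987, 1, 64) == simple_id(12987, 16, 4))   # True
print(padded_id(12987, 1, 64) == padded_id(12987, 16, 4))   # False
```

Whether padding is needed depends on the ranges of the household and member numbers in the actual files.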

 

We can conceive of merging by thinking of additional data we might want, so that we could compare, for example, the wt/age z-score (see analysis module) of the pre-school children, or the heights of adult women.  In this case the dataset (or spreadsheet) would be like this:

 

Cluster number   Household (hh) number   HH member number   Household roofing material   Total number hh members   Gender   Age (yrs)   Child’s wt/age   Woman’s height
12987            16                      3                  Thatch                       4                         M        25          -                -
12987            16                      4                  Thatch                       4                         F        16          -                -
12987            17                      1                  Tin                          5                         M        45          -                -
12987            17                      2                  Tin                          5                         F        32          -                **
12987            17                      3                  Tin                          5                         M        10          -                -
12987            17                      4                  Tin                          5                         M        3.5         *                -
12987            17                      5                  Tin                          5                         F        0.5         *                -
12987            18                      1                  Thatch                       3                         F        65          -                **

 

 

We want to add, or merge, in numbers for wt/age (* in the cells) or women’s height (**); where these don’t apply they will be left blank or missing.

 

To prepare the datasets used later for analysis, two types of merge were done.  First, the smaller files with individual data, separately created for pre-school children, 5-11 year-olds, etc, were merged into a larger dataset that had household characteristics (e.g. merging Q26nut.sav into main21.sav).  Then this merged file was merged again with the livelihood survey file.  While both merges were completed in SPSS using the same commands, the way in which the unique keys were identified differed between the two.  For this reason, the two merges are described separately.

 

 

 

Merging of the Files from the Nutritional Baseline Survey

 

In this case, each of the 10 SPSS files contained a household number and a member number within each household.  Taken together, these two variables distinguish each household and each individual within each household, so they served as the unique keys.  Once the unique keys were identified, it was just a matter of following the merge process in SPSS.  This process is outlined below, using as the example a merge of the hhbasic file with the preschool anthropometry file:

 

First, open the hhbasic file, into which additional variables are going to be merged. Then:

  1. On the toolbar, click DATA
  2. Go to MERGE FILES
  3. There is a choice between ADD VARIABLES and ADD CASES, click ADD VARIABLES
  4. Click on the file that you want to merge
  5. Check box next to MATCH CASES ON KEY VARIABLES IN SORTED FILES
  6. Click on unique key/keys (hh# and mem# in this case) and click on the arrow to move them over to the KEY VARIABLES window
  7. Click on EXTERNAL FILE IS KEYED TABLE
  8. Click on OK

 

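SPSS then merges the files, matching cases on the key variables that were requested.  As the name of the option in step 5 indicates, both files must already be sorted in ascending order of the key variables.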
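
For readers who want to see the logic of this kind of merge outside SPSS, here is a minimal Python sketch.  It is an illustration under stated assumptions, not the procedure actually used: the function name `add_variables`, the variable names `hh`, `mem` and `waz`, and the z-score values are all made up.  It keeps every case in the main file, adds the variables from the second file where the keys match, and leaves them missing (`None`) otherwise.

```python
def add_variables(main_rows, table_rows, keys):
    """Add variables from table_rows to main_rows, matching on the keys.

    Every row of main_rows is kept.  Rows with no match in table_rows
    get None (missing) for the added variables.
    """
    lookup = {}
    for row in table_rows:
        key = tuple(row[name] for name in keys)
        if key in lookup:
            raise ValueError(f"duplicate key in table file: {key}")
        lookup[key] = row

    # Names of the variables to be added (everything except the keys).
    extra = sorted({name for row in table_rows for name in row} - set(keys))

    merged = []
    for row in main_rows:
        match = lookup.get(tuple(row[name] for name in keys), {})
        merged.append({**row, **{name: match.get(name) for name in extra}})
    return merged


# Hypothetical extracts: household members, and pre-school anthropometry.
hhbasic = [
    {"hh": 17, "mem": 2, "gender": "F", "age": 32},
    {"hh": 17, "mem": 4, "gender": "M", "age": 3.5},
    {"hh": 17, "mem": 5, "gender": "F", "age": 0.5},
]
preschool = [
    {"hh": 17, "mem": 4, "waz": -1.9},
    {"hh": 17, "mem": 5, "waz": -2.3},
]

merged = add_variables(hhbasic, preschool, keys=("hh", "mem"))
for row in merged:
    print(row)
```

The adult woman (member 2) has no record in the pre-school file, so her `waz` is left missing, which is the behaviour described for the blank cells in the example table earlier.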

 

 

 

Merging of Consolidated Nutrition Baseline Database with Livelihood Database

 

This merge was much more difficult because there was no unique key that matched cases across the two databases.  Instead, it was necessary to create a unique key in both databases.  Each database had a member number (of the household), district number, upazila number, union number, and village number.  Taken together, these variables uniquely identify individual cases within each database.

 

A formula must be developed that combines these variables into a single number.  A new variable, named UNIQUE KEY, is then created by entering the formula in the COMPUTE function under TRANSFORM on the toolbar.  Once the formula is entered, click OK and the unique key will be created.

 

The formula was determined in the following manner:

 

Unique Key= (DISTRICT * 1,000,000) + (UPAZILA * 10,000) + (UNION * 100) + (VILLAGE)

 

For example, if one particular case within the database was coded in the following manner:

 

DISTRICT= 6

UPAZILA= 27

UNION= 58

VILLAGE= 80

 

Then the Unique Key would be:  6275880
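
The same calculation can be checked with a short Python sketch.  The function name `unique_key` is an assumption, and so is the range check: the multipliers in the formula leave two digits each for upazila, union and village, so the sketch refuses values that would not fit and could otherwise produce clashing keys.

```python
def unique_key(district, upazila, union, village):
    """Combine the four area codes into one number, as in the formula above."""
    # Assumed safeguard: each of these codes must fit in two digits.
    for name, value in (("upazila", upazila), ("union", union), ("village", village)):
        if not 0 <= value <= 99:
            raise ValueError(f"{name}={value} does not fit in two digits")
    return district * 1_000_000 + upazila * 10_000 + union * 100 + village


print(unique_key(6, 27, 58, 80))   # 6275880
```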

 

Once the unique key has been created in this manner in each database, cases can be matched on that variable.  The regular merging process, outlined above, should then be followed.  Once the databases have been merged, it is important to examine the data to ensure that the merge was indeed successful.
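
One simple way to examine a merge is to compare the keys in the two files: which keys appear in both, which appear in only one, and which are duplicated.  The Python sketch below shows the idea; the function name `merge_report` and the key values are hypothetical.

```python
from collections import Counter


def merge_report(keys_a, keys_b):
    """Summarise how well two lists of unique keys line up."""
    count_a, count_b = Counter(keys_a), Counter(keys_b)
    return {
        "matched": sorted(set(count_a) & set(count_b)),
        "only_in_a": sorted(set(count_a) - set(count_b)),
        "only_in_b": sorted(set(count_b) - set(count_a)),
        "duplicated_in_a": sorted(k for k, n in count_a.items() if n > 1),
        "duplicated_in_b": sorted(k for k, n in count_b.items() if n > 1),
    }


# Made-up keys: one match, one unmatched key on each side, one duplicate in A.
report = merge_report([6275880, 6275881, 6275881], [6275880, 6275999])
for name, keys in report.items():
    print(name, keys)
```

Unmatched or duplicated keys do not necessarily mean the merge failed, but each one is worth tracing back to the original files.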

 

 

Data Analysis

 

The variables of interest in this dataset are those indicators attempting to measure food security.  The goal of this analysis is to examine these variables in relation to certain other variables, in this case particularly wealth indicators.  These wealth indicators will also be examined in relation to measures of growth such as anthropometric indicators.     

 

 

Data Cleaning

 

Before this analysis can begin, however, the dataset must be cleaned.  This is discussed in the Data Analysis module.  Those values out of the acceptable range for each variable being examined should then be set to missing.  Once the dataset is clean, the analysis can begin.
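
Setting out-of-range values to missing can be sketched as follows in Python.  The function name and the acceptable range used in the example (1 to 4 meals per day) are assumptions made for illustration; the appropriate range for each variable has to come from the questionnaire and codebook.

```python
def set_out_of_range_to_missing(values, low, high):
    """Return a copy of values with anything outside [low, high] set to None."""
    return [v if v is not None and low <= v <= high else None for v in values]


# Hypothetical meals-per-day responses, including two implausible values.
meals = [2, 3, 0, 9, None, 1]
cleaned = set_out_of_range_to_missing(meals, low=1, high=4)
print(cleaned)   # [2, 3, None, None, None, 1]
```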

 

 

Analysis and Results

 

The first step in analyzing this dataset is to run simple descriptives of each variable that will be examined.  General descriptive statistics provide the minimum, maximum, and mean values for each variable.  They also display sample sizes.
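
The quantities reported by a descriptives run can be reproduced with Python's standard `statistics` module.  This is a minimal sketch on made-up values, with `describe` as a hypothetical helper name; missing values are dropped first, so N counts valid cases only.

```python
import statistics


def describe(values):
    """N, minimum, maximum, mean and sample standard deviation of valid values."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": statistics.mean(valid),
        "sd": statistics.stdev(valid),  # n - 1 denominator; needs at least 2 values
    }


summary = describe([2, 3, None, 2, 1])
print(summary)
```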

 

Once the descriptive statistics have been examined, comparison of means tests should be conducted.  Since the primary variables of interest are food security indicators, the first comparison of means test should be conducted between all of the food security indicators,

 

  • Number of months household has adequate food to feed all of its members
  • Number of months household can feed themselves from their own production
  • Number of meals consumed per day during lean period
  • Daily rice/wheat intake of the household (in kg)

 

and the wealth indicator, in this case a wealth index variable.  This wealth index variable is categorical, containing the following groupings:

 

  • Always Poor
  • Usually Poor
  • Cyclical Poor
  • Occasionally Poor

 

In running the comparison of means test, it is important to note that the food security variables should be the dependent, or outcome, variable.  The wealth indicator should be the independent variable.  An example of such an analysis is shown in the table below:

 

 

Wealth category     N      Food secure months   Months with rice provisioning   Meals per day, lean period   Daily rice/wheat intake (kg)
Always Poor         2488   3.80                 0.91                            2.04                         2.201
Usually Poor        1284   4.77                 1.98                            2.16                         2.518
Cyclical Poor       1544   5.74                 3.82                            2.34                         2.939
Occasionally Poor   1476   7.24                 5.89                            2.63                         3.513

 

As this table illustrates, each of the food security indicators improves as the level of poverty decreases.  Thus, as expected, higher socio-economic status is associated with a better food security situation.
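
The core of a comparison of means, the mean and N of an outcome variable within each category of the independent variable, can be sketched in Python as below.  The function name, variable names and data are made up for illustration, and the sketch does not include the significance tests that a statistics package can add.

```python
from collections import defaultdict


def means_by_group(rows, group_var, outcome_var):
    """Mean and N of outcome_var within each category of group_var."""
    groups = defaultdict(list)
    for row in rows:
        # Cases missing either variable are left out.
        if row.get(group_var) is not None and row.get(outcome_var) is not None:
            groups[row[group_var]].append(row[outcome_var])
    return {g: {"n": len(v), "mean": sum(v) / len(v)} for g, v in groups.items()}


# Hypothetical households: wealth category and number of food secure months.
rows = [
    {"wealth": "Always Poor", "food_secure_months": 3},
    {"wealth": "Always Poor", "food_secure_months": 5},
    {"wealth": "Occasionally Poor", "food_secure_months": 7},
    {"wealth": "Occasionally Poor", "food_secure_months": None},
]

result = means_by_group(rows, "wealth", "food_secure_months")
print(result)
```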

 

To continue the examination of the food security indicators, a new, dichotomous variable recoded from number of meals per day can then be created as follows:

 

0 = more than two meals per day during lean period

1 = two or fewer meals per day during lean period
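
A recode of this kind, and the reason its mean can be read as a prevalence, can be sketched in Python.  The cut-off used here (two or fewer meals coded as 1) is an assumption, taken from the title of the later table ‘Prevalence of 2 or fewer meals by wealth category’; the function name and the data are made up.

```python
def few_meals(meals_per_day):
    """1 if two or fewer meals per day (assumed cut-off), 0 if more, None if missing."""
    if meals_per_day is None:
        return None
    return 1 if meals_per_day <= 2 else 0


meals = [2, 3, 1, None, 2]
flags = [few_meals(m) for m in meals]
print(flags)   # [1, 0, 1, None, 1]

# The mean of a 0/1 variable is the proportion of valid cases coded 1.
valid = [f for f in flags if f is not None]
prevalence = sum(valid) / len(valid)
print(prevalence)   # 0.75
```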

 

A comparison of means test can then be conducted between this new variable and the wealth index variable.  The results of such an analysis are shown in the table below:

 

Wealth category     Mean     N
Always Poor         0.8686   2488
Usually Poor        0.7983   1284
Cyclical Poor       0.6457   1544
Occasionally Poor   0.3740   1476

 

 

These results show that the prevalence of individuals eating two or fewer meals a day during lean periods decreases as the level of poverty decreases.  A much higher proportion of individuals in the higher socio-economic groups are able to continue adequate eating patterns even during lean periods, while those in the lowest socio-economic group are forced with much greater regularity to go without meals.

 

Indicators of growth can then be examined to determine whether the discrepancies in levels of food insecurity among wealth categories are similar to the discrepancies in growth as measured by z-scores.  This can be examined by conducting a comparison of means test of the z-scores for stunting (HAZ), underweight (WAZ), and wasting (WHZ) by wealth category, for girls 5-11 years of age and children 0-59 months of age.  The results of these tests are shown in the tables below:

 

Z-scores for girls 5-11 years of age by wealth category

 

Cell entries: mean (n)

Wealth category     HAZ           WAZ           WHZ
Always Poor         -1.80 (171)   -1.91 (177)   -1.21 (117)
Usually Poor        -1.64 (90)    -1.81 (91)    -1.26 (62)
Cyclical Poor       -1.54 (135)   -1.74 (136)   -1.17 (90)
Occasionally Poor   -1.31 (126)   -1.61 (130)   -1.16 (89)

 

 

Z-scores for children 0-59 months of age by wealth category

Cell entries: mean (n)

Wealth category     HAZ           WAZ           WHZ
Always Poor         -1.87 (291)   -2.12 (300)   -1.24 (309)
Usually Poor        -1.74 (118)   -1.95 (121)   -1.13 (123)
Cyclical Poor       -1.72 (130)   -1.20 (135)   -1.10 (135)
Occasionally Poor   -1.61 (115)   -1.98 (117)   -1.28 (118)

 

 

These results show a quite similar pattern of z-score improvement as wealth category increases for both stunting and underweight in girls 5-11 years of age.  A somewhat similar pattern also exists for children 0-59 months old; however, the association does not appear as strong.  This could be due to caring practices or disease within the first few years of life.  However, in both age groups z-scores for wasting do not show an improvement as socio-economic status increases.

 

When the z-scores are dichotomized and expressed as prevalences and another comparison of means test is conducted, one would expect similar patterns to emerge.  As the tables below illustrate, this is indeed the case for both age groups:

 

Prevalence of stunting, underweight, and wasting by wealth indicator for girls 5-11 years of age

 

Cell entries: prevalence (n)

Wealth category     Stunting       Underweight    Wasting
Always Poor         0.4444 (171)   0.4972 (177)   0.1709 (117)
Usually Poor        0.4333 (90)    0.4725 (91)    0.1129 (62)
Cyclical Poor       0.4148 (135)   0.4191 (136)   0.1222 (90)
Occasionally Poor   0.2937 (126)   0.3308 (130)   0.1461 (89)

 

 

Prevalence of stunting, underweight, and wasting by wealth indicator for children 0-59 months of age

Cell entries: prevalence (n)

Wealth category     Stunting       Underweight    Wasting
Always Poor         0.4811 (291)   0.5500 (300)   0.1942 (309)
Usually Poor        0.4746 (118)   0.5041 (121)   0.1138 (123)
Cyclical Poor       0.4151 (130)   0.4593 (135)   0.1259 (135)
Occasionally Poor   0.3565 (115)   0.4872 (117)   0.2288 (118)
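
Dichotomizing a z-score into a prevalence can be sketched in Python as follows.  The cut-off is not stated in this module; the sketch assumes the conventional one of a z-score below -2, and the function name and z-score values are made up.

```python
def prevalence_below(z_scores, cutoff=-2.0):
    """Proportion of valid z-scores below the cut-off, and the number of valid cases."""
    valid = [z for z in z_scores if z is not None]
    flags = [1 if z < cutoff else 0 for z in valid]
    return sum(flags) / len(flags), len(flags)


# Hypothetical height-for-age z-scores; None marks a missing measurement.
haz = [-2.4, -1.1, None, -3.0, -0.5]
prev, n = prevalence_below(haz)
print(prev, n)   # 0.5 4
```

Applied to HAZ this gives a stunting prevalence, to WAZ an underweight prevalence, and to WHZ a wasting prevalence.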

 

 

 

In any case, the results of both of these analyses indicate that individuals within the lower two socio-economic groups have particularly high levels of food insecurity as defined by the indicators used in the analysis.  The highest socio-economic group, on the other hand, has markedly lower levels of food insecurity.

 

 

Additional results.

 

Bangladesh Survey

 

The wealth index was created by assessing the following household assets:

 

  • Watch
  • Gold
  • Radio cassette player
  • TV
  • Fan
  • Clothes
  • Power tiller
  • Threshing machine
  • Irrigation pumps
  • Transport
  • Fishing gears
  • Agricultural tools
  • Cattle
  • Buffalo
  • Goats
  • Sheep
  • Poultry
  • Trees
  • Bamboo stands
  • Agriculture
  • Homestead
  • Fallow land

 

It is unclear exactly how this indicator was derived from these assets.

 

 

General Descriptives

 

 

Variable               N      Minimum   Maximum   Mean      Standard deviation
Wealth Categories      6792   1         4         2.30      1.17
HAZ (girls 5-11)       522    -3.90     3.35      -1.5846   1.1691
WAZ (girls 5-11)       534    -3.84     1.21      -1.7771   0.8141
WHZ (girls 5-11)       358    -3.23     0.63      -1.1956   0.7343
HAZ1 (children 0-59)   654    -4.01     1.59      -1.7717   1.0529
WAZ1 (children 0-59)   673    -3.86     0.97      -2.0394   0.8602
WHZ1 (children 0-59)   685    -3.20     2.00      -1.1999   0.7798

 

 

 

 

Variable                            N      Minimum   Maximum   Mean     Standard deviation
Stunting prev. (girls 5-11)         522    0         1         0.3985   0.4901
Underweight prev. (girls 5-11)      534    0         1         0.4326   0.4959
Wasting prev. (girls 5-11)          358    0         1         0.1425   0.3500
Stunting prev. (children 0-59)      654    0         1         0.4450   0.4973
Underweight prev. (children 0-59)   673    0         1         0.5126   0.5002
Wasting prev. (children 0-59)       685    0         1         0.1723   0.1723

 

 

 

 

 

Correlations of Z-scores for girls 5-11 years of age

 

 

            HAZ      WAZ      WHZ
HAZ   r     1.0      0.842    -0.033
      N     522      522      352
WAZ   r     0.842    1.0      0.517
      N     522      534      357
WHZ   r     -0.033   0.517    1.0
      N     352      357      358

 

 

Correlations of Z-scores for children 0-59 months of age

 

 

            HAZ      WAZ      WHZ
HAZ   r     1.0      0.743    0.120
      N     654      653      654
WAZ   r     0.743    1.0      0.700
      N     653      673      672
WHZ   r     0.120    0.700    1.0
      N     654      672      685

 

 

 

Food Security Indicators

 


 

 

Annual expenditures by wealth category

 

Cell entries: mean (n)

Wealth category     Health expenditure   Education expenditure   Clothes expenditure   Shelter expenditure   Livestock expenditure
Always Poor         1199.44 (2403)       401.04 (1167)           1041.84 (2430)        1014.28 (1557)        995.48 (286)
Usually Poor        1599.16 (1277)       714.36 (706)            1377.81 (1282)        1791.87 (901)         1620.48 (249)
Cyclical Poor       1663.62 (1520)       963.25 (1109)           1745.18 (1544)        1865.54 (1157)        1095.81 (431)
Occasionally Poor   2384.72 (1458)       1661.77 (1198)          2324.18 (1476)        2312.92 (1016)        1801.74 (605)

 

The prevalence of individuals having fewer than 2 meals a day during lean periods is 0.0447, or 4.47%.

 

Prevalence of 2 or fewer meals by wealth category

 

Wealth category         Mean     N      Standard deviation
1 (Always Poor)         0.8686   2488   0.3379
2 (Usually Poor)        0.7983   1284   0.4014
3 (Cyclical Poor)       0.6457   1544   0.4784
4 (Occasionally Poor)   0.3740   1476   0.4840
Total                   0.6971   6792   0.4595

 

 


 

 

 

 

Anthropometric Indicators by District

 

District by wealth category and Z-scores for girls 5-11 years of age

 

Cell entries: mean (n)

District     Wealth category   HAZ           WAZ           WHZ
District 1   2.42 (2059)       -1.50 (158)   -1.68 (160)   -1.15 (115)
District 2   2.37 (1075)       -1.57 (95)    -1.80 (100)   -1.22 (67)
District 3   2.27 (1222)       -1.46 (109)   -1.68 (110)   -1.13 (71)
District 4   2.23 (904)        -2.05 (72)    -1.97 (75)    -1.06 (45)
District 5   2.36 (465)        -1.18 (38)    -1.63 (38)    -1.37 (28)
District 6   2.03 (1067)       -1.79 (50)    -2.05 (51)    -1.50 (32)

 

 

 

District by prevalence of stunting, underweight, and wasting for girls 5-11 years old

 

Cell entries: prevalence (n)

District     Stunting prevalence   Underweight prevalence   Wasting prevalence
District 1   0.3354 (158)          0.3750 (160)             0.1304 (115)
District 2   0.4105 (95)           0.4600 (100)             0.1493 (67)
District 3   0.3394 (109)          0.3727 (110)             0.1408 (71)
District 4   0.5972 (72)           0.5333 (75)              0.0222 (45)
District 5   0.3421 (38)           0.4211 (38)              0.3214 (28)
District 6   0.4600 (50)           0.5490 (51)              0.1875 (32)

 

 

District by Z-scores of children 0-59 months old

 

Cell entries: mean (n)

District     HAZ           WAZ           WHZ
District 1   -1.75 (181)   -2.04 (190)   -1.16 (191)
District 2   -1.67 (118)   -1.90 (123)   -1.02 (126)
District 3   -1.74 (155)   -2.00 (155)   -1.24 (158)
District 4   -1.92 (79)    -2.10 (82)    -1.21 (85)
District 5   -1.64 (34)    -2.00 (35)    -1.26 (36)
District 6   -1.92 (87)    -2.27 (88)    -1.43 (89)

 

 

District by prevalence of stunting, underweight, and wasting for children 0-59 months of age

 

Cell entries: prevalence (n)

District     Stunting prevalence   Underweight prevalence   Wasting prevalence
District 1   0.3978 (181)          0.5053 (190)             0.1518 (191)
District 2   0.3898 (118)          0.4553 (123)             0.1508 (126)
District 3   0.4710 (155)          0.4903 (155)             0.1392 (158)
District 4   0.5570 (79)           0.5732 (82)              0.2118 (85)
District 5   0.3824 (34)           0.4857 (35)              0.1667 (36)
District 6   0.4943 (87)           0.6023 (88)              0.2697 (89)