Section 4: Analysis Example (HLS Bangladesh)
The datasets are available and can be used for practice and exploration in connection with this module. As yet they are not available for downloading from the web-based version of the PANDA, but we expect to achieve this soon. They are included on the CD-based version. If you are using the web-based version, please contact us to get the datasets.
The data was collected as a part of the CARE Livelihood Monitoring Project. This project, carried out by CARE Bangladesh, was established to allow future assessment of improvements in the livelihoods of Bangladeshis due to the implementation of a variety of development projects. Two separate baseline surveys, a livelihood survey and a nutrition survey, were conducted on the same households. The data from the livelihood survey was consolidated into one large SPSS database. The data from the nutrition survey, on the other hand, was divided into 10 separate small SPSS files.
The relevant datasets here are as follows:
Household data from nutrition survey: main21.sav
Files of anthropometric and related data:
  for 12-49 yr women, z-scores: Q261.sav
  for 12-49 yr women, BMI: Q26nut.sav
  for 5-11 yr girls, z-scores: Q27nut.sav
  for 0-59 month children, z-scores: Q28nut.sav
Household and nutritional data merged: hhbasicmain21.sav
Livelihood survey: NWBaseline.sav
Merged household, nutritional, and livelihood survey files: mergeddataset.sav
The codebooks and questionnaires available are in the following files:
Livelihood survey questionnaire: NW Baseline q’aire.doc
Household data plus nutritional survey: Final Questionnaire.doc (Note: this contains the questions on anthropometry, which were entered in various files such as Q26nut.sav.)
The codebooks, which give the meaning of numerical codes in different variables, are included in the questionnaires. Where the variables have been labeled in the SPSS datasets (.sav), the labels can be seen in the ‘variable view’ display in SPSS. [A print can be obtained by opening SPSS without a dataset in it, choosing ‘display data info’ under ‘file’, and then specifying the dataset for which you want the data listed.]
In order to begin the analysis it was necessary to merge the data, since we need to link variables that are in different data files (for example, food security or nutritional variables by wealth status). There were 10 SPSS files containing the information from the nutrition baseline survey. Once the data from the nutrition baseline was consolidated into one file, it was then necessary to merge that file with the large file containing the data from the livelihood baseline survey. Here we describe the process of merging. If you want to practice, certain of the smaller files are attached, as xxxxxx.sav. Otherwise, you can use for analysis the fully merged file (mergeddataset.sav) that resulted from these processes.
The most important aspect of merging files is to determine the correct ‘unique key’, or unique identification number (‘ID’), for every case. This distinguishes each case within a dataset, so that the case can be matched to the corresponding case in another dataset, which must have an identical unique ID. It is crucial to be sure of the case definition in each file: this can be ‘child-level’, ‘household-level’, or sometimes higher with aggregated datasets (e.g. ‘district-level’). This is the same as knowing the ‘level of analysis’ (see the discussion in the Analysis module, Ch 1, page 5, under ‘file structure’). In the present example, we are going to merge into a file (call it file A) which has characteristics of household members, adding from a second file (file B) the anthropometric data on each individual. Note that although file A has some household characteristics, and may often be called a household file, each case or row is a different individual. Thus the identifier would be the household ID plus the household member number. This means that some characteristics (say, household construction, or total number of members) are repeated for each individual in the household. Think of it like this: suppose this is a section of a dataset (it could be a spreadsheet):
| Cluster Number | Household (hh) Number | HH member number | Household roofing material | Total number hh members | Gender | Age (yrs) |
| --- | --- | --- | --- | --- | --- | --- |
| 12987 | 16 | 3 | Thatch | 4 | M | 25 |
| 12987 | 16 | 4 | Thatch | 4 | F | 16 |
| 12987 | 17 | 1 | Tin | 5 | M | 45 |
| 12987 | 17 | 2 | Tin | 5 | F | 32 |
| 12987 | 17 | 3 | Tin | 5 | M | 10 |
| 12987 | 17 | 4 | Tin | 5 | M | 3.5 |
| 12987 | 17 | 5 | Tin | 5 | F | 0.5 |
| 12987 | 18 | 1 | Thatch | 3 | F | 65 |
Note that the case is defined as an individual, so that household characteristics repeat for individuals within the household. Note also that each case can be uniquely defined by the cluster number plus the household number plus the household member number. In fact, a single unique ID number can be constructed by running these together, for example for the second case shown as 12987164.
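The construction of a single ID by running the identifiers together can be sketched as follows. This is only an illustration in Python rather than SPSS; the function names and the padding widths are assumptions, not part of the original datasets. It also shows one caveat: plain concatenation is only unique if the household and member numbers always have the same number of digits, so a padded version is shown alongside.

```python
def make_member_id(cluster, household, member):
    """Run the three identifiers together into one integer, as in the text."""
    return int(f"{cluster}{household}{member}")


def make_padded_id(cluster, household, member, hh_width=3, member_width=2):
    """Zero-pad household and member numbers (widths are assumed) so that
    two different cases can never produce the same ID string."""
    return f"{cluster}-{household:0{hh_width}d}-{member:0{member_width}d}"


# Second case in the table: cluster 12987, household 16, member 4.
second_case = make_member_id(12987, 16, 4)
print(second_case)  # 12987164

# Caveat: household 1, member 64 would give the same digits.
collision = make_member_id(12987, 1, 64) == second_case
print(collision)  # True

# With padding the two cases stay distinct.
print(make_padded_id(12987, 16, 4))   # 12987-016-04
print(make_padded_id(12987, 1, 64))   # 12987-001-64
```

In practice it is often simpler to keep the identifiers as separate key variables and match on all of them at once.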
We can conceive of merging by thinking of additional data we might want, so that we could compare, for example, the wt/age z-score (see Analysis module) of the pre-school children, or the heights of adult women. In this case the dataset (or spreadsheet) would look like this:
| Cluster Number | Household (hh) Number | HH member number | Household roofing material | Total number hh members | Gender | Age (yrs) | Child’s Wt/age | Woman’s height |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 12987 | 16 | 3 | Thatch | 4 | M | 25 | - | - |
| 12987 | 16 | 4 | Thatch | 4 | F | 16 | - | - |
| 12987 | 17 | 1 | Tin | 5 | M | 45 | - | - |
| 12987 | 17 | 2 | Tin | 5 | F | 32 | - | ** |
| 12987 | 17 | 3 | Tin | 5 | M | 10 | - | - |
| 12987 | 17 | 4 | Tin | 5 | M | 3.5 | * | - |
| 12987 | 17 | 5 | Tin | 5 | F | 0.5 | * | - |
| 12987 | 18 | 1 | Thatch | 3 | F | 65 | - | ** |
We want to add, or merge, in numbers for wt/age (* in the cells) or women’s height (**); where these don’t apply they will be left blank or missing.
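The idea of adding variables from file B to the matching cases in file A can be sketched outside SPSS as a keyed lookup. The following is a minimal Python illustration, not the procedure actually used; the variable names and the z-score values are invented for the example. Cases in file A with no match in file B keep a missing value.

```python
# File A: one row per household member (identifiers taken from the table above).
members = [
    {"cluster": 12987, "hh": 17, "member": 2, "gender": "F", "age": 32},
    {"cluster": 12987, "hh": 17, "member": 4, "gender": "M", "age": 3.5},
    {"cluster": 12987, "hh": 17, "member": 5, "gender": "F", "age": 0.5},
]

# File B: anthropometry for pre-school children only (z-scores are invented).
child_anthro = [
    {"cluster": 12987, "hh": 17, "member": 4, "waz": -1.8},
    {"cluster": 12987, "hh": 17, "member": 5, "waz": -0.9},
]


def key(row):
    """The unique key: cluster, household and member number together."""
    return (row["cluster"], row["hh"], row["member"])


# Index file B by its key, then look up each case of file A.
lookup = {key(row): row for row in child_anthro}

merged = []
for row in members:
    match = lookup.get(key(row))
    # Cases without anthropometric data are left missing (None).
    merged.append({**row, "waz": match["waz"] if match else None})

for row in merged:
    print(row["member"], row["age"], row["waz"])
```

Every case in file A is kept, which mirrors the layout of the table above: the adult woman has no wt/age value, while the two pre-school children do.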
To prepare the datasets used later for analysis, two types of merge were done. First, the smaller files with individual data, separately created for pre-school children, 5-11 year-olds, etc., were merged into a larger dataset that had household characteristics (e.g. merging Q26nut.sav into main21.sav). Then this merged file was merged again with the livelihood survey file. While both merges were completed in SPSS using the same commands, the way in which the unique keys were identified differed between the two. For this reason, the two merges are described separately.
For the first merge, each of the 10 SPSS files contained a household number and a member number within each household. Taken together, these two variables distinguish each household and each individual within it, so together they form the unique key. Once the unique key was identified, it was just a matter of following the merge process in SPSS. This process is outlined below, using as the example a merge of the hhbasic file with the preschool anthropometry file:
First, open the hhbasic file, into which additional variables are going to be merged. Then:
The files should then be merged according to the unique keys that were requested.
The second merge, with the livelihood survey file, was much more difficult because no existing unique key matched cases across the two databases. Instead, it was necessary to create a unique key in both databases. Each database had a member number (of the household), district number, upazila number, union number, and village number. Taken together, these variables uniquely identify individual cases within each database.
An algorithm must be developed that takes into account each of these variables, and a new variable must be created to hold the result. This variable should be named UNIQUE KEY. The algorithm should then be entered into the Compute function under TRANSFORM on the toolbar. Once the algorithm is entered, click OK and the unique key will be created.
The algorithm was determined in the following manner:
Unique Key = (DISTRICT * 1,000,000) + (UPAZILA * 10,000) + (UNION * 100) + VILLAGE
For example, if one particular case within the database was coded in the following manner:
DISTRICT= 6
UPAZILA= 27
UNION= 58
VILLAGE= 80
Then the Unique Key would be: 6275880
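The worked example can be checked with a short Python sketch. The multipliers for DISTRICT and UPAZILA are those given in the text; the weights used here for UNION (100) and VILLAGE (1) are inferred from the example result and should be treated as an assumption. The function name is hypothetical. The scheme only yields distinct keys if the upazila, union and village codes each fit in two digits, so the sketch checks that.

```python
def unique_key(district, upazila, union, village):
    """Combine the four location codes into one number.

    Assumes upazila, union and village codes are each between 0 and 99;
    otherwise digits from one code would spill into the next.
    """
    for name, code in (("upazila", upazila), ("union", union), ("village", village)):
        if not 0 <= code <= 99:
            raise ValueError(f"{name} code {code} does not fit in two digits")
    return district * 1_000_000 + upazila * 10_000 + union * 100 + village


key = unique_key(district=6, upazila=27, union=58, village=80)
print(key)  # 6275880
```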
Once the unique key is created in this manner in each database, each case should match based on that variable. Then the regular merging process, outlined above, should be followed. Once the databases have been merged, it is important to examine the data to ensure that the merge was indeed successful.
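Two simple checks can help confirm that a merge behaved as intended: that the key really is unique within each file, and that keys present in one file are also present in the other. The sketch below shows both in Python under the assumption that the keys have been exported as plain lists; the key values and function names are hypothetical.

```python
from collections import Counter


def duplicate_keys(keys):
    """Return keys that occur more than once (they would not be unique IDs)."""
    return sorted(k for k, n in Counter(keys).items() if n > 1)


def unmatched_keys(keys_a, keys_b):
    """Return (keys only in A, keys only in B)."""
    a, b = set(keys_a), set(keys_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical key lists from the two databases.
nutrition_keys = [6275880, 6275881, 6275881]
livelihood_keys = [6275880, 6275882]

dups = duplicate_keys(nutrition_keys)
only_nutrition, only_livelihood = unmatched_keys(nutrition_keys, livelihood_keys)

print(dups)             # [6275881]
print(only_nutrition)   # [6275881]
print(only_livelihood)  # [6275882]
```

Any duplicated or unmatched keys are worth inspecting by hand before the merged file is used for analysis.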
The variables of interest in this dataset are those indicators attempting to measure food security. The goal of this analysis is to examine these variables in relation to certain other variables, in this case particularly wealth indicators. These wealth indicators will also be examined in relation to measures of growth such as anthropometric indicators.
Before this analysis can begin, however, the dataset must be cleaned. This is discussed in the Data Analysis module. Those values out of the acceptable range for each variable being examined should then be set to missing. Once the dataset is clean, the analysis can begin.
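Setting out-of-range values to missing can be sketched as below. This is a Python illustration only; the acceptable range of -6 to +6 for a z-score is a hypothetical choice for the example, and the real limits should come from the Data Analysis module.

```python
def set_out_of_range_to_missing(values, low, high):
    """Replace values outside [low, high] with None (missing)."""
    return [v if v is not None and low <= v <= high else None for v in values]


# Hypothetical height-for-age z-scores, including two implausible entries.
haz = [-1.8, 9.99, -7.2, None, 0.4]
cleaned = set_out_of_range_to_missing(haz, low=-6.0, high=6.0)
print(cleaned)  # [-1.8, None, None, None, 0.4]
```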
The first step in analyzing this dataset is to run simple descriptives of each variable that will be examined. General descriptive statistics provide the minimum and maximum values as well as the mean value for each variable. They also display sample sizes.
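For readers working outside SPSS, the same summary can be sketched in a few lines of Python. The function name and the sample values are made up for illustration; missing values are excluded before computing anything, which is why the sample size can differ between variables.

```python
from statistics import mean, stdev


def describe(values):
    """Sample size, minimum, maximum, mean and standard deviation,
    ignoring missing (None) values."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": mean(valid),
        "sd": stdev(valid),
    }


# Hypothetical number of meals per day, with one missing response.
meals = [2, 3, None, 2, 1]
summary = describe(meals)
print(summary["n"], summary["min"], summary["max"], summary["mean"])
```

Checking the minimum and maximum at this stage is also a quick way to spot values that should have been set to missing during cleaning.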
Once the descriptive statistics have been examined, comparison of means tests should be conducted. Since the primary variables of interest are food security indicators, the first comparison of means test should be conducted between all of the food security indicators and the wealth indicator, in this case a wealth index variable. This wealth index variable is categorical, containing the following groupings: Always Poor, Usually Poor, Cyclical Poor, and Occasionally Poor.
In running the comparison of means test, it is important to note that the food security variables should be the dependent, or outcome, variable. The wealth indicator should be the independent variable. An example of such an analysis is shown in the table below:
| Wealth category | N | Number of Food Secure Months | Number of Months with Rice Provisioning | Number of Meals Consumed during Lean Period | Daily Rice/Wheat Intake |
| --- | --- | --- | --- | --- | --- |
| Always Poor | 2488 | 3.80 | 0.91 | 2.04 | 2.201 |
| Usually Poor | 1284 | 4.77 | 1.98 | 2.16 | 2.518 |
| Cyclical Poor | 1544 | 5.74 | 3.82 | 2.34 | 2.939 |
| Occasionally Poor | 1476 | 7.24 | 5.89 | 2.63 | 3.513 |
As this table illustrates, each of the food security indicators improves as the level of poverty improves. Thus, as expected, higher socio-economic status is associated with a better food security situation.
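The core of a comparison of means is simply the mean of the outcome variable within each category of the independent variable. A minimal Python sketch is given below; the variable names and household records are hypothetical, and it computes only the group sizes and means, not the accompanying significance test that SPSS can also report.

```python
from collections import defaultdict
from statistics import mean


def means_by_group(rows, group_var, outcome_var):
    """Return {group: (n, mean)} for the outcome, skipping missing values."""
    groups = defaultdict(list)
    for row in rows:
        if row[outcome_var] is not None:
            groups[row[group_var]].append(row[outcome_var])
    return {g: (len(v), mean(v)) for g, v in sorted(groups.items())}


# Hypothetical households: wealth category (1-4) and food secure months.
households = [
    {"wealth": 1, "secure_months": 3},
    {"wealth": 1, "secure_months": 5},
    {"wealth": 4, "secure_months": 7},
    {"wealth": 4, "secure_months": None},
]

result = means_by_group(households, "wealth", "secure_months")
print(result)  # {1: (2, 4), 4: (1, 7)}
```

Here the food security variable is the outcome and the wealth category defines the groups, matching the roles of dependent and independent variable described above.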
To continue the examination of the food security indicators, a new, dichotomous variable recoded from number of meals per day can then be created as follows:
0 = more than two meals per day during lean period
1 = two or fewer meals per day during lean period
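The recode can be sketched as follows. This Python illustration takes the cutoff as two or fewer meals, which is how the corresponding table under Additional results is labelled ("Prevalence of 2 or fewer meals by wealth category"); that reading, the function name and the sample values are assumptions. Because the new variable is coded 0/1, its mean is the prevalence.

```python
def few_meals(meals_per_day):
    """1 if two or fewer meals per day, 0 if more than two, None if missing."""
    if meals_per_day is None:
        return None
    return 1 if meals_per_day <= 2 else 0


# Hypothetical responses for meals per day during the lean period.
meals = [3, 2, 1, None, 2.5]
flags = [few_meals(m) for m in meals]
print(flags)  # [0, 1, 1, None, 0]

# The mean of a 0/1 variable is the proportion coded 1.
valid = [f for f in flags if f is not None]
prevalence = sum(valid) / len(valid)
print(prevalence)  # 0.5
```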
A comparison of means test can then be conducted between this new variable and the wealth index variable. The results of such an analysis are shown in the table below:
| Wealth Categories | Mean | N |
| --- | --- | --- |
| Always Poor | .8686 | 2488 |
| Usually Poor | .7983 | 1284 |
| Cyclical Poor | .6457 | 1544 |
| Occasionally Poor | .3740 | 1476 |
These results show that the prevalence of individuals eating two or fewer meals a day during lean periods decreases as the level of poverty decreases. This analysis reveals that a high percentage of individuals within higher socio-economic groups are able to continue adequate eating patterns even during lean periods. Those in the lowest socio-economic group are forced to go without meals with much greater regularity.
Indicators of growth can then be examined to determine whether the discrepancies in levels of food insecurity among wealth categories are similar to the discrepancies in growth as measured by z-scores. This can be examined by conducting a comparison of means test between the z-scores for stunting, underweight, and wasting and wealth category, for girls 5-11 years of age and children 0-59 months of age. The results of these tests are shown in the tables below:
Z-scores for girls 5-11 years of age by wealth category
| Wealth category | HAZ | WAZ | WHZ |
| --- | --- | --- | --- |
| Always Poor | -1.80 (171) | -1.91 (177) | -1.21 (117) |
| Usually Poor | -1.64 (90) | -1.81 (91) | -1.26 (62) |
| Cyclical Poor | -1.54 (135) | -1.74 (136) | -1.17 (90) |
| Occasionally Poor | -1.31 (126) | -1.61 (130) | -1.16 (89) |
Z-score for children 0-59 months of age by wealth category
| Wealth category | HAZ | WAZ | WHZ |
| --- | --- | --- | --- |
| Always Poor | -1.87 (291) | -2.12 (300) | -1.24 (309) |
| Usually Poor | -1.74 (118) | -1.95 (121) | -1.13 (123) |
| Cyclical Poor | -1.72 (130) | -1.20 (135) | -1.10 (135) |
| Occasionally Poor | -1.61 (115) | -1.98 (117) | -1.28 (118) |
These results show a quite similar pattern of z-score improvement as wealth category increases for both stunting and underweight in girls 5-11 years of age. A somewhat similar pattern also exists for children 0-59 months old; however, the association does not appear as strong. This could be due to caring practices or disease within the first few years of life. In both age groups, however, z-scores for wasting do not show an improvement as socio-economic status increases.
When the z-scores are dichotomized and expressed as prevalences and another comparison of means test is conducted, one would expect similar patterns to emerge. As the tables below illustrate, this is indeed the case for both age groups:
Prevalence of stunting, underweight, and wasting by wealth indicator for girls 5-11 years of age
| Wealth category | Stunting | Underweight | Wasting |
| --- | --- | --- | --- |
| Always Poor | 0.4444 (171) | 0.4972 (177) | 0.1709 (117) |
| Usually Poor | 0.4333 (90) | 0.4725 (91) | 0.1129 (62) |
| Cyclical Poor | 0.4148 (135) | 0.4191 (136) | 0.1222 (90) |
| Occasionally Poor | 0.2937 (126) | 0.3308 (130) | 0.1461 (89) |
Prevalence of stunting, underweight, and wasting by wealth indicator for children 0-59 months of age
| Wealth category | Stunting | Underweight | Wasting |
| --- | --- | --- | --- |
| Always Poor | 0.4811 (291) | 0.5500 (300) | 0.1942 (309) |
| Usually Poor | 0.4746 (118) | 0.5041 (121) | 0.1138 (123) |
| Cyclical Poor | 0.4151 (130) | 0.4593 (135) | 0.1259 (135) |
| Occasionally Poor | 0.3565 (115) | 0.4872 (117) | 0.2288 (118) |
In any case, the results of both of these analyses indicate that individuals within the lower two socio-economic groups have particularly high levels of food insecurity as defined by the indicators used in the analysis. The highest socio-economic group, on the other hand, has markedly lower levels of food insecurity.
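Dichotomizing z-scores into prevalences, as done for the stunting, underweight and wasting tables above, can be sketched in Python as follows. The cutoff of -2 is the conventional threshold for these indicators but is not stated in this section, so it is an assumption here, as are the function names and sample values.

```python
def below_cutoff(z, cutoff=-2.0):
    """1 if the z-score is below the cutoff, 0 otherwise, None if missing."""
    if z is None:
        return None
    return 1 if z < cutoff else 0


def prevalence(z_scores, cutoff=-2.0):
    """Proportion of non-missing z-scores falling below the cutoff."""
    flags = [below_cutoff(z, cutoff) for z in z_scores]
    valid = [f for f in flags if f is not None]
    return sum(valid) / len(valid)


# Hypothetical height-for-age z-scores; stunting is HAZ below the cutoff.
haz = [-2.4, -1.1, None, -3.0, -1.9]
stunting_prevalence = prevalence(haz)
print(stunting_prevalence)  # 0.5
```

Running the group means on the resulting 0/1 variable gives the prevalence in each wealth category.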
Additional results.
The wealth index was created by assessing household assets. It is unclear from the report exactly how this indicator was determined.
General Descriptives
| Variable | N | Minimum | Maximum | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| Wealth Categories | 6792 | 1 | 4 | 2.30 | 1.17 |
| HAZ (girls 5-11) | 522 | -3.90 | 3.35 | -1.5846 | 1.1691 |
| WAZ (girls 5-11) | 534 | -3.84 | 1.21 | -1.7771 | 0.8141 |
| WHZ (girls 5-11) | 358 | -3.23 | 0.63 | -1.1956 | 0.7343 |
| HAZ1 (children 0-59) | 654 | -4.01 | 1.59 | -1.7717 | 1.0529 |
| WAZ1 (children 0-59) | 673 | -3.86 | 0.97 | -2.0394 | 0.8602 |
| WHZ1 (children 0-59) | 685 | -3.20 | 2.00 | -1.1999 | 0.7798 |

| Variable | N | Minimum | Maximum | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| Stunting prev. (girls 5-11) | 522 | 0 | 1 | 0.3985 | 0.4901 |
| Underweight prev. (girls 5-11) | 534 | 0 | 1 | 0.4326 | 0.4959 |
| Wasting prev. (girls 5-11) | 358 | 0 | 1 | 0.1425 | 0.3500 |
| Stunting prev. (children 0-59) | 654 | 0 | 1 | 0.4450 | 0.4973 |
| Underweight prev. (children 0-59) | 673 | 0 | 1 | 0.5126 | 0.5002 |
| Wasting prev. (children 0-59) | 685 | 0 | 1 | 0.1723 | 0.1723 |
Correlations of Z-scores for girls 5-11 years of age
| | HAZ | WAZ | WHZ |
| --- | --- | --- | --- |
| HAZ | 1.0 | 0.842 | -0.033 |
| N | 522 | 522 | 352 |
| WAZ | 0.842 | 1.0 | 0.517 |
| N | 522 | 534 | 357 |
| WHZ | -0.033 | 0.517 | 1.0 |
| N | 352 | 357 | 358 |
Correlations of Z-scores for children 0-59 months of age
| | HAZ | WAZ | WHZ |
| --- | --- | --- | --- |
| HAZ | 1.0 | 0.743 | 0.120 |
| N | 654 | 653 | 654 |
| WAZ | 0.743 | 1.0 | 0.700 |
| N | 653 | 673 | 672 |
| WHZ | 0.120 | 0.700 | 1.0 |
| N | 654 | 672 | 685 |
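The sample sizes in the correlation tables above differ from cell to cell, which is what happens when each pair of variables is correlated using only the cases where both values are present. A minimal Python sketch of that pairwise approach is given below; the function name and data are hypothetical, and it is offered as one plausible reading of how the tables were produced rather than a record of the actual procedure.

```python
from math import sqrt


def pairwise_correlation(x, y):
    """Pearson correlation using only cases where both values are present.

    Returns (r, n), where n is the number of complete pairs.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy), n


# Hypothetical z-scores; the fourth child has no HAZ value.
haz = [-1.0, -2.0, -3.0, None]
waz = [-1.5, -2.5, -3.5, -1.0]

r, n = pairwise_correlation(haz, waz)
print(round(r, 3), n)  # 1.0 3
```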
Food Security Indicators
Food Security Indicators by Wealth
| Wealth category | N | Number of Food Secure Months | Rice Provisioning | Number of Meals Consumed during Lean Period | Daily Rice/Wheat Intake |
| --- | --- | --- | --- | --- | --- |
| Always Poor | 2488 | 3.80 | 0.91 | 2.04 | 2.201 |
| Usually Poor | 1284 | 4.77 | 1.98 | 2.16 | 2.518 |
| Cyclical Poor | 1544 | 5.74 | 3.82 | 2.34 | 2.939 |
| Occasionally Poor | 1476 | 7.24 | 5.89 | 2.63 | 3.513 |
Annual expenditures by wealth category
| Wealth category | Health Expenditure | Education Expenditure | Clothes Expenditure | Shelter Expenditure | Livestock Expenditure |
| --- | --- | --- | --- | --- | --- |
| Always Poor | 1199.44 (2403) | 401.04 (1167) | 1041.84 (2430) | 1014.28 (1557) | 995.48 (286) |
| Usually Poor | 1599.16 (1277) | 714.36 (706) | 1377.81 (1282) | 1791.87 (901) | 1620.48 (249) |
| Cyclical Poor | 1663.62 (1520) | 963.25 (1109) | 1745.18 (1544) | 1865.54 (1157) | 1095.81 (431) |
| Occasionally Poor | 2384.72 (1458) | 1661.77 (1198) | 2324.18 (1476) | 2312.92 (1016) | 1801.74 (605) |
The prevalence of individuals having fewer than 2 meals a day during lean periods is 0.0447, or 4.47%.
Prevalence of 2 or fewer meals by wealth category
| Wealth Categories | Mean | N | Standard Deviation |
| --- | --- | --- | --- |
| 1 | .8686 | 2488 | 0.3379 |
| 2 | .7983 | 1284 | 0.4014 |
| 3 | .6457 | 1544 | 0.4784 |
| 4 | .3740 | 1476 | 0.4840 |
| Total | .6971 | 6792 | 0.4595 |
Anthropometric Indicators
Z-scores for girls 5-11 years of age by wealth category
| Wealth category | HAZ | WAZ | WHZ |
| --- | --- | --- | --- |
| Always Poor | -1.80 (171) | -1.91 (177) | -1.21 (117) |
| Usually Poor | -1.64 (90) | -1.81 (91) | -1.26 (62) |
| Cyclical Poor | -1.54 (135) | -1.74 (136) | -1.17 (90) |
| Occasionally Poor | -1.31 (126) | -1.61 (130) | -1.16 (89) |
Prevalence of stunting, underweight, and wasting by wealth indicator for girls 5-11 years of age
| Wealth category | Stunting | Underweight | Wasting |
| --- | --- | --- | --- |
| Always Poor | 0.4444 (171) | 0.4972 (177) | 0.1709 (117) |
| Usually Poor | 0.4333 (90) | 0.4725 (91) | 0.1129 (62) |
| Cyclical Poor | 0.4148 (135) | 0.4191 (136) | 0.1222 (90) |
| Occasionally Poor | 0.2937 (126) | 0.3308 (130) | 0.1461 (89) |
Z-score for children 0-59 months of age by wealth category
| Wealth category | HAZ | WAZ | WHZ |
| --- | --- | --- | --- |
| Always Poor | -1.87 (291) | -2.12 (300) | -1.24 (309) |
| Usually Poor | -1.74 (118) | -1.95 (121) | -1.13 (123) |
| Cyclical Poor | -1.72 (130) | -1.20 (135) | -1.10 (135) |
| Occasionally Poor | -1.61 (115) | -1.98 (117) | -1.28 (118) |
Prevalence of stunting, underweight, and wasting by wealth indicator for children 0-59 months of age
| Wealth category | Stunting | Underweight | Wasting |
| --- | --- | --- | --- |
| Always Poor | 0.4811 (291) | 0.5500 (300) | 0.1942 (309) |
| Usually Poor | 0.4746 (118) | 0.5041 (121) | 0.1138 (123) |
| Cyclical Poor | 0.4151 (130) | 0.4593 (135) | 0.1259 (135) |
| Occasionally Poor | 0.3565 (115) | 0.4872 (117) | 0.2288 (118) |
Anthropometric Indicators by District
District by wealth category and Z-scores for girls 5-11 years of age
| District | Wealth Category | HAZ | WAZ | WHZ |
| --- | --- | --- | --- | --- |
| District 1 | 2.42 (2059) | -1.50 (158) | -1.68 (160) | -1.15 (115) |
| District 2 | 2.37 (1075) | -1.57 (95) | -1.80 (100) | -1.22 (67) |
| District 3 | 2.27 (1222) | -1.46 (109) | -1.68 (110) | -1.13 (71) |
| District 4 | 2.23 (904) | -2.05 (72) | -1.97 (75) | -1.06 (45) |
| District 5 | 2.36 (465) | -1.18 (38) | -1.63 (38) | -1.37 (28) |
| District 6 | 2.03 (1067) | -1.79 (50) | -2.05 (51) | -1.50 (32) |
District by prevalence of stunting, underweight, and wasting for girls 5-11 years old
| District | Stunting Prevalence | Underweight Prevalence | Wasting Prevalence |
| --- | --- | --- | --- |
| District 1 | 0.3354 (158) | 0.3750 (160) | 0.1304 (115) |
| District 2 | 0.4105 (95) | 0.4600 (100) | 0.1493 (67) |
| District 3 | 0.3394 (109) | 0.3727 (110) | 0.1408 (71) |
| District 4 | 0.5972 (72) | 0.5333 (75) | 0.0222 (45) |
| District 5 | 0.3421 (38) | 0.4211 (38) | 0.3214 (28) |
| District 6 | 0.4600 (50) | 0.5490 (51) | 0.1875 (32) |
District by Z-scores of children 0-59 months old
| District | HAZ | WAZ | WHZ |
| --- | --- | --- | --- |
| District 1 | -1.75 (181) | -2.04 (190) | -1.16 (191) |
| District 2 | -1.67 (118) | -1.90 (123) | -1.02 (126) |
| District 3 | -1.74 (155) | -2.00 (155) | -1.24 (158) |
| District 4 | -1.92 (79) | -2.10 (82) | -1.21 (85) |
| District 5 | -1.64 (34) | -2.00 (35) | -1.26 (36) |
| District 6 | -1.92 (87) | -2.27 (88) | -1.43 (89) |
District by prevalence of stunting, underweight, and wasting for children 0-59 months of age
| District | Stunting Prevalence | Underweight Prevalence | Wasting Prevalence |
| --- | --- | --- | --- |
| District 1 | 0.3978 (181) | 0.5053 (190) | 0.1518 (191) |
| District 2 | 0.3898 (118) | 0.4553 (123) | 0.1508 (126) |
| District 3 | 0.4710 (155) | 0.4903 (155) | 0.1392 (158) |
| District 4 | 0.5570 (79) | 0.5732 (82) | 0.2118 (85) |
| District 5 | 0.3824 (34) | 0.4857 (35) | 0.1667 (36) |
| District 6 | 0.4943 (87) | 0.6023 (88) | 0.2697 (89) |