Regression Analysis Stratified by Age Categories


Confounding occurs when the effects of two processes are not separated. The apparent effect of an exposure on the outcome is distorted because the exposure is associated with other factors that also influence the outcome. Age is a likely candidate to explore for confounding, both because it is very common practice to include age in a model with nutritional status as the outcome variable and because age is plausibly associated with both nutritional status and measles immunization. Other variables, such as SES and mother's education, could also be involved, but age is a good first step.

The best way to decide which measures might be confounding is to draw spider diagrams to map out the inter-relationships.


[Figure wpe1A.jpg: spider diagram of the inter-relationships among the variables]

It is important to remember that independent variables should be at the same level of causality, NOT on each other’s causal pathway (e.g. water supply and diarrhea); otherwise the model becomes difficult to interpret. For example, since poor water supply can cause diarrhea, which leads to malnutrition, diarrhea is an intermediate on the causal path. Redesign the model to avoid using intermediates, and identify possible confounders.

You might first take a look at how measles and age behave in a regression analysis.  Try this exercise:

1.  Open keast4j.sav

2.  Click on Statistics, Regression, Linear and enter waz in the Dependent Variable box.

3.  Enter age in the Independent variable box and click on OK.

4.  Try this again, changing only step 3 so that hmeasyn (measles immunization) is entered alone.

5.  Then repeat step 3 with both hmeasyn and age.

6.  Finally, repeat step 3 with hmeasyn, age, and an interaction variable (MEAS_AGE).   (Remember that you will need to create this new interaction variable by multiplying the measles and age variables together.)
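For readers who want to see the same sequence of models outside the SPSS menus, here is a minimal Python sketch of steps 3 to 6. It does not read keast4j.sav: the data are simulated, and the coverage pattern, the coefficients used to generate waz, and the helper name ols are all assumptions made for illustration. Only the variable names waz, age, hmeasyn and meas_age follow the exercise.

```python
import random

def ols(rows, y):
    """Return [intercept, b1, b2, ...] from ordinary least squares."""
    X = [[1.0] + list(r) for r in rows]          # prepend the constant term
    k = len(X[0])
    # Normal equations (X'X) b = X'y as an augmented k x (k+1) matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated (hypothetical) data: older children are more often immunized,
# WAZ falls with age, and immunization truly raises WAZ by 0.3.
random.seed(0)
age = [random.uniform(0, 60) for _ in range(1000)]
hmeasyn = [1 if random.random() < min(a / 24, 0.9) else 0 for a in age]
waz = [-0.5 - 0.04 * a + 0.3 * m + random.gauss(0, 0.5)
       for a, m in zip(age, hmeasyn)]

# The interaction variable is the product of the two predictors (step 6).
meas_age = [m * a for m, a in zip(hmeasyn, age)]

models = {
    "age only": list(zip(age)),
    "measles only": list(zip(hmeasyn)),
    "measles + age": list(zip(hmeasyn, age)),
    "measles + age + meas_age": list(zip(hmeasyn, age, meas_age)),
}
for name, rows in models.items():
    print(name, [round(b, 3) for b in ols(rows, waz)])
```

Because immunized children in this simulated sample are older on average, the measles-only coefficient tends to come out negative even though the simulated effect is positive, which is the kind of distortion the exercise is meant to expose. The sketch returns coefficients only; it does not compute the t-statistics or p-values that SPSS reports.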


The summary of results will look something like this.

Dependent Variable: WAZ Score

Regression Models: coefficient (t*, p)

Variables in the Model   Model 1            Model 2            Model 3            Model 4
Measles                  -0.513             _                  -0.333             -0.875
                         (-4.923, 0.000)                       (-2.879, 0.004)    (-4.800, 0.000)
Age                      _                  -0.014             [missing]          -0.027
                                            (-5.338, 0.000)    (-3.426, 0.001)    (-5.111, 0.000)
MEAS_AGE                 _                  _                  _                  0.024
                                                                                  (3.822, 0.000)
Sample N                 679                698                679                679
Adj. R squared           0.033              0.038              0.048              0.067

*t-statistic: the coefficient divided by its standard error. It tests whether the coefficient differs from zero, much as a t-test compares the difference in two means, and its reference distribution adjusts for small sample size.

The table shows a significant association of both measles immunization and age with the outcome of nutritional status, and in addition a significant interaction between the age of the child and measles immunization. One might find that reasonable: interaction means that the effect of one independent variable on the outcome differs depending on the level of another independent variable, and immunization itself depends strongly on age, since a 1-month-old child is less likely to have measles immunization than a 14-month-old child. The interaction is left in the model to account for its significant contribution, but the final model with measles, age, and the interaction still seems suspicious. It is known that the nutritional status of a child typically declines over the first 24 months of age, when the child begins to take in complementary foods, becomes exposed to many new pathogens in the environment, and begins to develop their own immune response. Because of this, it is believable that the coefficient for age is slightly negative (B = -0.027). But the measles immunization coefficient is also still negative, even after controlling for age and the interaction term. It just isn't believable that children who are immunized are more likely to be malnourished. Therefore, you suspect there is strong confounding with age and that another tactic, stratification by age, should be used to control for it.
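To see what the interaction term implies, it can help to write the final model out. The sketch below assumes that hmeasyn is coded 0 = not immunized and 1 = immunized, and that MEAS_AGE is the product of hmeasyn and age; the symbols b_0 to b_3 are notation introduced here, not labels from the SPSS output.

```latex
\begin{align*}
\mathrm{WAZ} &= b_0 + b_1\,\mathrm{hmeasyn} + b_2\,\mathrm{age}
               + b_3\,(\mathrm{hmeasyn}\times\mathrm{age}) \\
\text{not immunized } (\mathrm{hmeasyn}=0):\quad
\mathrm{WAZ} &= b_0 + b_2\,\mathrm{age} \\
\text{immunized } (\mathrm{hmeasyn}=1):\quad
\mathrm{WAZ} &= (b_0 + b_1) + (b_2 + b_3)\,\mathrm{age}
\end{align*}
```

Under that coding, and with the values reported for the final model (b_2 = -0.027 for age, b_3 = 0.024 for MEAS_AGE), the age slope would be -0.027 per month for children who are not immunized and -0.027 + 0.024 = -0.003 per month for immunized children. The difference between the two groups at a given age is b_1 + 0.024 × age, so it changes with age instead of being a single number.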



First it is nice to see how immunization and WAZ behave in smaller age categories by viewing a graph. This also helps decide how to stratify.

TAKE A LOOK at how to create a graph of Measles Immunization and WAZ Score.

[Figure wpe7.jpg: graph of Measles Immunization and WAZ Score]

Try looking at several age groups separately using the Select Cases (filter) method, and then run the regression analysis within each group to see what happens to the effect on nutritional status. First select the children that are 6-11 months, then those that are 12-24 months, then 12-36 months, and compare the results with the entire group at 0-60 months. A summary table of the regression results for the separate age groups follows the steps.


1.  Open keast4j.sav

2.  Click on Data, Select Cases, If condition is satisfied and click on If…

3.  Enter age into the expression box (for Select Cases: If) and type in >= and a lower limit (e.g. 6), then and age <= and an upper limit (e.g. 11), so that it reads: age >= 6 and age <= 11

4.  Click on Continue and OK.

5.  Click on Statistics, Regression, Linear

6.  Place waz in the Dependent Variable box and hmeasyn, age, and meas_age (interaction variable) in the Independent Variable box

7.  Choose Enter as the method and click OK.

8.  Now reselect age groups 12-24 and complete steps 5-7.

9.  Reselect for age 12-36 and run the regression.

10.   Then select all cases (ages 0-60 months) and complete steps 5-7.

11.  If any of the models has an interaction term that is not significant (p > 0.10), then that model needs to be run without the interaction term in the equation.  Rerun the regression with just hmeasyn and age.
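The select-then-regress loop in the steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of the SPSS run: the data are simulated and keast4j.sav is not read, the helper names slope and select_cases are hypothetical, and each stratum is fitted with measles immunization alone, which is simpler than the hmeasyn, age and meas_age model used in steps 5-7. For a 0/1 predictor, the simple regression slope equals the difference in mean waz between immunized and non-immunized children.

```python
import random

def slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def select_cases(records, lo, hi):
    """Keep records that satisfy: age >= lo and age <= hi."""
    return [r for r in records if lo <= r["age"] <= hi]

# Simulated (hypothetical) children: coverage rises with age, WAZ falls
# with age, and immunization truly raises WAZ by 0.3.
random.seed(1)
records = []
for _ in range(2000):
    age = random.uniform(0, 60)
    hmeasyn = 1 if random.random() < min(age / 24, 0.9) else 0
    waz = -0.5 - 0.04 * age + 0.3 * hmeasyn + random.gauss(0, 0.5)
    records.append({"age": age, "hmeasyn": hmeasyn, "waz": waz})

# Same age bands as the exercise, ending with the whole sample.
for lo, hi in [(6, 11), (12, 24), (12, 36), (0, 60)]:
    stratum = select_cases(records, lo, hi)
    b = slope([r["hmeasyn"] for r in stratum], [r["waz"] for r in stratum])
    print(f"age {lo}-{hi} months: n={len(stratum)}, measles slope={b:.3f}")
```

Narrow age bands leave little room for age to differ between immunized and non-immunized children, so the slope within a band is less distorted by age than the slope for the whole 0-60 month sample. The sketch does not compute p-values, so the rule for dropping a non-significant interaction term still has to be applied from the SPSS output.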


Results of a series of age-restricted regression analyses with WAZ as the outcome and measles immunization and age (including an interaction variable) as determinants.


Dependent Variable: WAZ Score

Regression Models in Age Strata: coefficient (t*, p)

Variables in the Model   AGE = 6-11 mo.     AGE = 12-24 mo.    AGE = 12-36 mo.    AGE = 0-60 mo.
Measles                  0.244              [missing]          [missing]          -0.875
                         (0.684, 0.496)     (2.901, 0.004)     (2.312, 0.022)     (-4.800, 0.000)
Age                      -0.366             [missing]          [missing]          -0.027
                         (-3.915, 0.000)    (0.738, 0.462)     (1.798, 0.073)     (-5.111, 0.000)
MEAS_AGE                 --                 --                 0.049              0.024
                                                               (-1.789, 0.075)    (3.822, 0.000)
Sample N                 76                 132                273                679
Adj. R squared           0.173              0.050              0.017              0.067

*t-statistic: the coefficient divided by its standard error. It tests whether the coefficient differs from zero, much as a t-test compares the difference in two means, and its reference distribution adjusts for small sample size.

It appears from the results that the effect of measles immunization is masked in the large 0-60 month age group, where B is negative (B coefficient = -0.875). Taken at face value, this would mean that immunized children have lower WAZ scores than children who are not immunized. Since this is so highly unlikely, the most reasonable explanation is the influence of age on both immunization and nutritional status, that is, CONFOUNDING. Stratifying has shown that there is a protective effect of measles immunization that is strongest in the children between 12 and 36 months of age. The younger group, age 6-11 months, does show a positive association between measles immunization and nutritional status, but it is not significant. Breaking the sample into age groups shows that you can see more of the story when you look within different groups. For further analysis of measles immunization and nutritional status, it might be useful to look within these age groups to see the effects without confounding by age.


Here is what you will see in the individual output that was used to create the table.  Only the output for age group 6-11 is shown below, as an example of how the SPSS output will look on your screen.


First, children 6-11 months

[Figures wpe1.jpg, wpe3.jpg, wpe4.jpg, wpe5.jpg: SPSS regression output for children 6-11 months]

Now that the age group is confined to 6-11 months of age, the sign for measles immunization has changed from negative to positive (as compared to the entire 0-60 month group). This indicates that immunized children tend to have better nutritional status. Although only approximately 26% of those in this age group are immunized, an improvement in the nutritional status of immunized children is expected, due either to protection from illness (measles in this case) or possibly just to the child seeing a health worker more often and the mother being more aware of her child's health status. According to this regression analysis, immunized children have a WAZ score 0.244 units higher than children who are not immunized, holding age constant. However, the p-value is not significant (p = 0.496, well above 0.05), and with this small sample size there is a large possibility that the result is due to chance. A significant result in this age category is of lesser interest anyhow, since a child is usually not encouraged to have measles immunization until about one year of age, when they become higher risk. This group was included for comparison's sake, to see how the results differ between age categories.



When you are finished, return to the full data set: click on Data, Select Cases, All cases, and OK.
