Outliers



Exercise: Scatterplot of the prevalence of low arm circumference, males versus females:

1.  Open the bdeshd1.sav file.

2.  Go to the Graphs menu and click on Scatter.

3.  Click on Simple and then Define.

4.  Select ACPRVM for the Y-axis and ACPRVF for the X-axis.

5.  Click OK.

[Figure ex5_1.jpg: scatterplot of ACPRVM (Y-axis) against ACPRVF (X-axis)]

INTERPRETATION:  Notice the association between the two variables: in general, as the prevalence of low arm circumference for females increases (X-axis), the prevalence of low arm circumference for males also increases. However, you can see some outliers, one in particular, that do not fall into the general pattern of association between the variables.  It would be much easier to visualize this relationship if these outliers were either set to missing or replaced with the mean value of the group.
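The two options mentioned above (set an outlier to missing, or replace it with the group mean) can be sketched outside SPSS as well. The Python below is only an illustration under stated assumptions: the 1.5 x IQR rule for deciding what counts as an outlier is a common convention and is not taken from this exercise, the function names are hypothetical, the "group mean" is taken to be the mean of the non-outlying values, and the sample values are made up rather than read from bdeshd1.sav.

```python
from statistics import mean, quantiles


def flag_outliers(values, k=1.5):
    """Mark values lying outside the fences Q1 - k*IQR and Q3 + k*IQR.

    Assumption: the 1.5 x IQR rule is used here for illustration only;
    the exercise itself identifies outliers by eye on the scatterplot.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v is not None and (v < low or v > high) for v in values]


def set_outliers(values, to="missing"):
    """Return a copy with outliers set to None ("missing") or to the
    mean of the remaining, non-outlying values ("mean")."""
    flags = flag_outliers(values)
    kept = [v for v, f in zip(values, flags) if v is not None and not f]
    fill = None if to == "missing" else mean(kept)
    return [fill if f else v for v, f in zip(values, flags)]


# Made-up prevalence values (percent); the last one is far from the rest.
acprvm = [12.0, 15.5, 18.0, 21.0, 24.5, 27.0, 95.0]

as_missing = set_outliers(acprvm, to="missing")
as_mean = set_outliers(acprvm, to="mean")
```

Setting the value to missing drops that case from the plot, while replacing it with the mean keeps the case but pulls it into the middle of the cloud of points. Either choice changes the data, so it is worth recording which cases were altered.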


Here is what a scatterplot with a perfect association between the variables would look like:

[Figure wpe7.jpg: scatterplot showing a perfect linear association between the two variables]

INTERPRETATION:  Notice the perfect linear association between the two variables. This will almost never happen with real data - if it does, it may be a sign that the data are faulty!

Note that this graph was produced by selecting only those cases of ACPRVF and ACPRVM in the bdeshd1.sav file that had matching values and then plotting them. You do not need to know this procedure.
