Page 3

Using the Variables


Know the data SOURCE - how were the measures taken, and are they reliable? Was the data collected in another language and then translated (with possible loss of information in the process)?
Know how to HANDLE the data - identify which variables are categorical (discrete) and which are continuous (see definitions), and identify the file structure, the VARIABLE LABELING system, and any recoding (see variable labeling link).
Know how to CLEAN the data - identify and correct data errors, and cope with errors that are beyond correction.
Know how to TRANSFORM the data - data quality is often higher when the question posed during the interview is the one most likely to elicit a reliable response. If that response is not in the form most convenient for analysis, it can be transformed later to provide the information needed. Types of transformation, with exercises in transforming, are explained further on Page 3 of Nutrition Data.


Most Useful       Most Reliable
Age in months     Date of birth (DOB)
                  Date of interview (DOI)

Solution: transformation after data collection
DOI - DOB = Age in months

For example, a child's age can be calculated from the date of birth and the date of interview through a transformation. This is usually more accurate (especially if the date of birth is documented) than asking the mother how old the child is in months.
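The DOI - DOB transformation can be sketched in a few lines of Python. This is only an illustration, not the method used in the course materials: the function name age_in_months is hypothetical, and it assumes that age is counted in completed calendar months (some surveys instead divide the number of days by an average month length).

```python
from datetime import date


def age_in_months(dob: date, doi: date) -> int:
    """Completed months between date of birth (DOB) and date of interview (DOI).

    Assumption: age is counted in whole calendar months, rounding down.
    """
    if doi < dob:
        # A DOI earlier than the DOB is a data error to be cleaned, not an age.
        raise ValueError("date of interview precedes date of birth")
    months = (doi.year - dob.year) * 12 + (doi.month - dob.month)
    # The current month is not yet complete if the day of birth
    # has not been reached in the interview month.
    if doi.day < dob.day:
        months -= 1
    return months


# Hypothetical child born 15 March 2020 and interviewed 10 June 2021.
example_age = age_in_months(date(2020, 3, 15), date(2021, 6, 10))
```

Raising an error when DOI precedes DOB is a design choice that ties the transformation back to data cleaning: impossible dates are flagged instead of silently producing a negative age.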



Know the LANGUAGE of the data - data analysis offers many shortcuts for communication, in the form of conventions, drawings, and symbols. Use them and become familiar with them, but query any phrases that seem unclear.


1.  Outcome = dependent variable; it goes on the y-axis, in the cells of tables, and on the left-hand side of equations.

2.  Classifying, determining = independent variable; it goes on the x-axis, defines the columns of tables (or the rows, in 2-way tables), and is on the right-hand side of equations.

3.  Scatterplots show the data, with lines drawn for the regressions; remember, though, that the statistical significance of a correlation is highly dependent on the sample size (N).

4.  Regression analysis brings further terminology, including residuals, interactions, and controlling for other variables. Really, it is a lot easier than it sounds.
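The conventions in the list above can be made concrete with a minimal sketch of a simple (one-variable) regression in plain Python. The function names fit_line and residuals and the data values are hypothetical, chosen only for illustration. The outcome y sits on the left-hand side of the fitted equation y = intercept + slope * x, the independent variable x on the right, and a residual is the observed outcome minus the value the line predicts.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and sum of cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed outcome minus the outcome predicted by the fitted line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: x is the independent variable (x-axis),
# y is the outcome (y-axis).
x = [6, 12, 18, 24, 30]
y = [7.0, 9.0, 10.5, 12.0, 13.0]

intercept, slope = fit_line(x, y)
resid = residuals(x, y, intercept, slope)
```

With an intercept in the model, the least-squares residuals sum to zero up to rounding error, which is a quick check that a fit has been computed correctly.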
