Chapter 2

Page 4



Non-Experimental Study Designs

Quasi-experimental designs  

Quasi-experimental designs are similar to experimental designs, except that study participants are not randomly assigned to treatment and control groups. They are usually used in the evaluation of educational programs when random assignment is not possible or not practical. There are many types of quasi-experimental designs, including the control group time series design and the non-equivalent control group design.

With the control group time series design, the evaluator has a comparison group that is similar to the intervention group except that it does not receive the intervention. Several measurements of the outcome variable of interest (for example, initial baby weight) are taken before the intervention is introduced in order to establish a baseline. The intervention is then introduced to the intervention group and, once it is complete, more measurements are taken from both groups. Remember that only the intervention group receives the intervention; the evaluator also obtains measurements from the control group in order to have a basis for comparison.

The non-equivalent control group design has two main types: the posttest only design and the pretest-posttest design. With the posttest only design (refer to illustration), no measurements are taken before the start of the intervention, so it is impossible to determine whether the treatment group and the control group were similar before the intervention was administered to the treatment group. Because the control group may differ in ways that influence the outcome of the evaluation, this is a very weak design and is not usually recommended.

Non-equivalent posttest only design

insert figure 2.4.1 here

 

X = Program intervention or introduction

O1 = Outcome measure for treatment group

O2 = Outcome measure for control group

Impact is measured as:

 Impact = (O1-O2) +/- Error

If this looks familiar, it is because it is similar to the posttest-only randomized controlled experiment. However, this study design involves no randomization.
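The posttest only calculation can be illustrated with a minimal sketch. It assumes that O1 and O2 are summarized as group means; the function name and the scores are hypothetical and chosen only for illustration.

```python
from statistics import mean


def posttest_only_impact(treatment_scores, control_scores):
    """Return O1 - O2, the difference in mean posttest outcomes."""
    o1 = mean(treatment_scores)  # O1: outcome measure for treatment group
    o2 = mean(control_scores)    # O2: outcome measure for control group
    return o1 - o2


# Hypothetical posttest scores, not taken from any real study.
treatment = [72, 75, 78, 71]
control = [68, 70, 69, 73]

print(posttest_only_impact(treatment, control))
```

The error term in the formula is not estimated here. Because there are no pretest measurements, the sketch cannot tell whether the difference reflects the program or a pre-existing difference between the groups.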

The pretest-posttest design differs from the posttest only design in that measurements are also taken before the introduction of the intervention. These “before measurements” establish a baseline and help the evaluator assess how the treatment group and the control group differed at the outset. They also make any differences in the final measurements, taken after the treatment group has undergone the intervention, easier to interpret. This design therefore supports more credible conclusions than the posttest only design.

 Non-equivalent pretest-posttest design

 

X = Program intervention or introduction

O1 = Pretest outcome measure for treatment group

O2 = Posttest outcome measure for treatment group

O3 = Pretest outcome measure for comparison group

O4 = Posttest outcome measure for comparison group

 

The program impact is measured as:

Impact = (O2-O1) – (O4-O3) +/- Error (design and measurement error)
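As a rough illustration of the formula above, the sketch below computes the change in each group and subtracts one from the other. It assumes each O is summarized as a group mean; the function name and all scores are hypothetical.

```python
from statistics import mean


def pretest_posttest_impact(o1, o2, o3, o4):
    """Return (O2 - O1) - (O4 - O3) using group means.

    o1, o2: pretest and posttest scores for the treatment group
    o3, o4: pretest and posttest scores for the comparison group
    """
    treatment_change = mean(o2) - mean(o1)
    comparison_change = mean(o4) - mean(o3)
    return treatment_change - comparison_change


# Hypothetical scores, for illustration only.
o1 = [50, 52, 48, 50]  # treatment group, pretest
o2 = [60, 62, 58, 60]  # treatment group, posttest
o3 = [47, 49, 45, 47]  # comparison group, pretest
o4 = [50, 52, 48, 50]  # comparison group, posttest

print(pretest_posttest_impact(o1, o2, o3, o4))
```

With these made-up numbers, a posttest only comparison (O2 minus O4) would credit the program with the whole gap between the groups, including the part that already existed at baseline. Subtracting the comparison group's change removes that part. The error term is again left unestimated.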

Staggered implementation design

This is a type of quasi-experimental design in which the intervention is introduced to different groups at different times. The illustration below will help explain this design. For each of the three groups/clusters outlined in the illustration, the intervention is introduced at a different time: for the first group it is implemented at year 0, for the second group it is introduced around the 8th month after its introduction in the first group, and for the third group it is introduced around the 18th month after its introduction in the first group. Note that for all three groups, regardless of when the intervention is introduced, a baseline survey is carried out first. After the introduction of the intervention, mid-implementation surveys are carried out for groups 1 and 2 several months into implementation, before a final follow-up survey is carried out at year 2. The timing of the final follow-up survey, of the introduction of the intervention in each group, and of the mid-implementation surveys must be pre-determined by the evaluator based on the type of intervention and the outcome measure. In this type of study design there are no real “control groups”, because every group eventually receives the intervention. However, because the intervention is introduced at a different time in each group, groups that have not yet received it can serve as a comparison for those that have, which allows a relatively accurate comparison between the three groups.
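The rollout logic described above can be sketched as a simple schedule lookup. The start months of 0, 8 and 18 and the final survey at month 24 follow the approximate timings in the text; the group names and the function are hypothetical.

```python
# Approximate month in which each group starts the intervention.
ROLLOUT_MONTH = {"group_1": 0, "group_2": 8, "group_3": 18}
FINAL_SURVEY_MONTH = 24  # final follow-up survey at year 2


def split_groups(month, rollout=ROLLOUT_MONTH):
    """Return (groups already receiving the intervention, groups still waiting)."""
    treated = sorted(g for g, start in rollout.items() if start <= month)
    waiting = sorted(g for g, start in rollout.items() if start > month)
    return treated, waiting


# At month 12, groups 1 and 2 can be compared with group 3,
# which has not yet received the intervention.
print(split_groups(12))

# By the final survey, no group is left waiting.
print(split_groups(FINAL_SURVEY_MONTH))
```

A survey taken in any given month can then compare outcomes between the two lists, which is what substitutes for a conventional control group in this design.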

insert figure 2.4.3 here

 

Example of a staggered implementation design – Farrell AD, Meyer AL. 1997. The effectiveness of a school-based curriculum for reducing violence among urban sixth-grade students. American Journal of Public Health 87(6):979-84


The primary purpose of this study was to evaluate the effectiveness of a school-based curriculum in reducing violence among urban sixth-grade students. The goal of the curriculum was to reduce violence among youth by providing them with information on risk factors for interpersonal violence and teaching them the skills necessary for choosing alternatives to fighting. The evaluation method used was a quasi-experimental design with comparison groups. Note, however, that the curriculum was introduced to the study population using a staggered implementation design: students were introduced to the curriculum during different semesters, so that those currently receiving the curriculum could be compared with those not receiving it. The staggered implementation design ensured that every sixth grader in the participating school received the intervention, while still allowing an effective comparison between treatment and non-treatment groups. A total of 1274 students participated in the curriculum intervention, and baseline data were collected from 1150 of them. After the introduction of the intervention, data were collected by questionnaire at specific time intervals (beginning, middle and end of the school year). The evaluation study was carried out on a sample of 978 of these students, and the data were analyzed by comparing results from students who had received the intervention with those who had not yet received it. The primary outcome measure was the frequency of violent behavior, which was assessed using the Violent Behavior Scale from the Behavioral Frequency Scales.

 Stepped wedge design

Stepped wedge designs can be used either for randomized controlled experiments or in quasi-experimental designs. The main distinguishing feature of this study design is the way in which participants receive the intervention: sequentially, at pre-determined time intervals. In a randomized controlled experiment, the order in which the different groups/clusters of participants receive the intervention is randomly determined. In a quasi-experimental design, there is no random allocation. As with the staggered implementation design, the intervention is not withheld from any group; only the time at which each group/cluster receives it differs. All participants will have received the intervention by the end of the study. Stepped wedge designs are usually used where it would be unethical to withhold the intervention from a control group, especially if the intervention is expected to be beneficial to the study population (for example, a feeding program for malnourished children is expected to do more good than harm, so it would be unethical to withhold it from any of the children), or where it is not practical or not financially possible to deliver the intervention to all participants at the same time. An advantage of this type of evaluation method is that individuals or groups can act as their own controls, since the time at which each group receives the intervention varies. The overall impact of the program or intervention is measured by comparing data from those currently receiving the intervention with data from those not currently receiving it.
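The sequential rollout can be pictured as a table of clusters by time periods. The sketch below builds such a table under two simplifying assumptions that are not stated in the text: there is one baseline period in which no cluster is treated, and exactly one cluster crosses over at each later step. The function and cluster names are hypothetical.

```python
import random


def stepped_wedge_schedule(clusters, randomize=True, seed=None):
    """Map each cluster to a list of 0/1 flags, one per time period.

    0 = not yet receiving the intervention, 1 = receiving it.
    randomize=True shuffles the crossover order (randomized design);
    randomize=False keeps the given order (quasi-experimental design).
    """
    order = list(clusters)
    if randomize:
        random.Random(seed).shuffle(order)

    n_periods = len(order) + 1  # one baseline period plus one step per cluster
    schedule = {}
    for step, cluster in enumerate(order, start=1):
        schedule[cluster] = [1 if period >= step else 0 for period in range(n_periods)]
    return schedule


# Fixed order, as in a quasi-experimental stepped wedge.
for cluster, row in stepped_wedge_schedule(["A", "B", "C"], randomize=False).items():
    print(cluster, row)
```

Reading down a column gives the comparison used to measure impact: clusters marked 1 in a period are compared with clusters still marked 0. Reading along a row shows how a cluster acts as its own control before and after it crosses over.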

insert figure 2.4.4 here

The Stepped Wedge trial design: A review of the literature, an unpublished article by Brown, Patil and Lilford of the Department of Public Health and Epidemiology, University of Birmingham, gives a comprehensive review of how stepped wedge designs have been used in the past for evaluating public health interventions. Reading this article is strongly recommended for those interested in broadening their knowledge of this type of evaluation design. It can be found at the following link: www.pcpoh.bham.ac.uk/publichealth/psrp/Pdf/Stepped_Wedge_Literature_Review.pdf

Example of a stepped wedge trial design -

Design of the HIV Prevention Trials Network (HPTN) Protocol 054: A cluster randomized crossover trial to evaluate combined access to Nevirapine in developing countries ARTICLE CITATION NEEDED


The main purpose of this trial was to compare two strategies for providing a single dose of the drug nevirapine (NVP) to HIV-seropositive mothers and their infants, as a means of preventing mother-to-child transmission in the developing world. The first strategy, also known as targeted therapy, was designed to offer NVP only to women identified as HIV-positive through voluntary counseling and testing (VCT). The second, combined strategy was designed to offer NVP both to women identified as HIV-positive through VCT and to women who refuse VCT. The study design was a clinical trial that used a cluster randomized stepped wedge technique to identify which strategy was more effective. The unit of randomization was the prenatal care clinic. The main outcome measure was the proportion of HIV-positive women and their infants in the population who accept and adhere to the use of NVP. Anonymous, unlinked cord blood specimens were collected from all participants in order to determine maternal HIV status and the presence or absence of NVP. Directly observed therapy was used to assess infant receipt of NVP. As you read, make note of the problems encountered and the authors’ rationale for their choice of study design. Also pay close attention to the authors’ explanation of the statistical design and how it applies to the NVP trial.