General Comments: The semester is divided roughly into thirds, covering in turn:
Nonparametric Statistics; Multiple Regression Analysis applied to experimental
data (the General Linear Model); and Special Topics. Homework problems are
assigned throughout the semester, and the final is a take-home exam.
No textbook is required, but readings will be suggested throughout the course.
I. NONPARAMETRIC (DISTRIBUTION-FREE) STATISTICS
1. An example of a nonparametric test - Tukey's Pocket Test
2. Permutations and Combinations (Learning how to count)
A. No. of permutations = N!
B. No. of combinations = N!/(n1!n2!...ng!)
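
The two counting rules above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the second function assumes the "combinations" formula counts the distinct ways of dividing N = n1 + n2 + ... + ng observations into g groups of fixed sizes.

```python
from math import factorial

def n_permutations(n):
    """Number of orderings of n distinct objects: N!"""
    return factorial(n)

def n_combinations(group_sizes):
    """Number of ways to split N objects into groups of the given sizes:
    N! / (n1! n2! ... ng!), where N is the sum of the group sizes."""
    count = factorial(sum(group_sizes))
    for size in group_sizes:
        # Each partial quotient is an integer, so floor division is exact.
        count //= factorial(size)
    return count

print(n_permutations(5))          # orderings of 5 objects
print(n_combinations([3, 3]))     # two groups of 3 from 6 observations
print(n_combinations([2, 2, 2]))  # three groups of 2 from 6 observations
```

With two groups the second formula reduces to the familiar binomial coefficient, which is the count needed for the two-group randomization tests later in the outline.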
3. Randomization Tests - The General Concept
A. Count the number of possible outcomes given the data, the experimental design, and that Ho is true.
B. Count the number of outcomes as extreme or more extreme than the one that actually occurred.
C. The probability of an outcome at least as extreme as the actual one arising by chance is equal to B./A.
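
Steps A through C can be sketched for the simplest case, two independent groups. This is a minimal illustration, not the course's PSLC program: the function name is hypothetical, and the choice of a two-sided test on the absolute difference between group means is an assumption.

```python
from itertools import combinations

def randomization_test(group1, group2):
    """Exact two-group randomization test on the difference in means.
    Returns (count_B, count_A, p) where p = B / A."""
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    n2 = len(pooled) - n1
    total = sum(pooled)

    def mean_diff(sum1):
        # Difference in group means given the sum of the first group.
        return sum1 / n1 - (total - sum1) / n2

    observed = abs(mean_diff(sum(group1)))
    n_outcomes = 0   # step A: every possible assignment under Ho
    n_extreme = 0    # step B: assignments as or more extreme than observed
    for idx in combinations(range(len(pooled)), n1):
        n_outcomes += 1
        sum1 = sum(pooled[i] for i in idx)
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(sum1)) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme, n_outcomes, n_extreme / n_outcomes  # step C

b, a, p = randomization_test([1, 2, 3], [4, 5, 6])
print(b, a, p)
```

The loop enumerates every assignment, so the run time grows with the number of combinations; for larger samples an approximate test based on a random sample of assignments is the usual substitute.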
4. Tests for Two Independent Groups
A. Fisher's Two Sample Test
1) The requirement of random assignment
2) Symmetry for equal n; asymmetry for unequal n
3) PSLC:RTGRP
4) Approximate vs. exact randomization tests
5) Power relative to corresponding parametric tests
6) The null hypothesis tested
B. Substituting ranks for raw data - Wilcoxon's Rank Sum Test
1) The sampling distribution & the reason for ranks
2) Properties of ranks
3) Stevens' NOIR nonsense
4) Other equivalent tests: Mann-Whitney U; Festinger's test; Van der Reyden's test; Haldane & Smith's test
5) Dealing with ties
6) Loss of specificity of Ho:
7) The asymptotic distribution - parametric or nonparametric?
8) PSLC:WILCO
5. Tests for Repeated Measures (or Matched Groups) - Two Measures
A. The Randomization Test for Raw Data
1) Counting equals 2^n
2) Difference scores and algebraic signs
3) PSLC:RTRM
B. Use of ranks - Wilcoxon's Signed Rank Test
1) The sampling distribution
2) Dealing with zero difference scores
3) Power & Ho:
C. Considering only direction of difference - the Sign Test
1) The Binomial Distribution - Pi=0.5
2) Power
3) The asymptotic distribution - z or Chi-square
4) An equivalent test - McNemar's Test
5) PSLC:MP
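
The Sign Test's use of the binomial distribution with Pi=0.5 can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, zero differences are assumed to have been dropped beforehand, and the two-sided p-value is taken as twice the smaller tail (capped at 1).

```python
from math import comb

def sign_test(n_positive, n_negative):
    """Two-sided exact Sign Test p-value under Ho: Pi = 0.5.
    Zero differences are assumed to have been discarded already."""
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # P(X <= k) for X ~ Binomial(n, 0.5): count outcomes, divide by 2^n.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

print(sign_test(8, 2))  # 8 positive and 2 negative difference scores
```

Because the null distribution depends only on n, the same function covers any pair of repeated measures once the signs of the differences have been tallied.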
6. Important Points So Far
A. Asymptotic Relative Efficiency (ARE) (Bradley, pp. 60-61)
B. Use of asymptotic distributions for rank tests
C. Loss of specificity of Ho when going to ranks
D. Two stages of randomization in experimental design (random sampling and random assignment)
1) Parametric tests tacitly assume both
2) Nonparametric tests assume only random assignment
E. Ranking as a transformation
1) Ranks constant for monotonic transformations of raw data
2) Power of rank tests with skewed data
3) Power relative to F after normalizing transformation
4) Nonreversibility of the rank transformation
5) Power with "outliers"
7. Generalization to More than Two Groups or Measures
A. Randomization test for 3 or more groups
1) The classic problem of representing differences among 3 or more means
2) Counting becomes horrendous
B. The rank test - Kruskal-Wallis H
1) The test term
2) The limited tables
C. Randomization test for 3 or more repeated measures
D. The rank test - Friedman's Test
1) Limitation of tables
2) Sacrifice of power
3) With 2 variables simplifies to the Sign Test
4) Cochran's Q
8. Correlation
A. The Randomization Test - Pitman's Test
1) Counting equals N!
2) Sum of cross products simplifies counting
3) PSLC:RTR
B. Use of ranks
1) Spearman's rho
2) The sampling distribution - Hotelling-Pabst's Test
3) The asymptotic distribution
C. Kendall's Tau - A coefficient based on the concept of order
9. Contingency table data
A. Expressing the data as columns of 0's and 1's
1) The randomization test
2) The limiting distribution - chi-square or F?
B. Expressing the data in a 2 x 2 table - Fisher's "Exact" Test
1) The meaning of fixed marginal frequencies
2) The hypergeometric distribution
a) The derivation from counting
b) The derivation from the conditional binomial
3) Three experimental models
4) The median test
5) Class action suits
6) PSLC:FET
7) More than 2 groups PSLC:HYPER
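
The counting derivation of the hypergeometric distribution for a 2 x 2 table with fixed marginal frequencies can be sketched as below. This is a hedged illustration rather than the PSLC:FET program: the function names and the cell labels a, b, c, d (rows a b and c d) are assumptions, and only the one-tailed sum toward larger values of cell a is shown.

```python
from math import comb

def hypergeom_prob(a, b, c, d):
    """Probability of the table [[a, b], [c, d]] given its fixed margins:
    C(a+b, a) * C(c+d, c) / C(N, a+c)."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

def fisher_upper_tail(a, b, c, d):
    """One-tailed p-value: sum over tables with the same margins whose
    upper-left cell is at least as large as the observed a."""
    row1, row2, col1 = a + b, c + d, a + c
    largest_a = min(row1, col1)
    ways = sum(comb(row1, k) * comb(row2, col1 - k)
               for k in range(a, largest_a + 1))
    return ways / comb(row1 + row2, col1)

print(hypergeom_prob(3, 0, 0, 3))
print(fisher_upper_tail(2, 1, 1, 2))
```

The numerator counts the ways of producing a given table and the denominator counts all assignments consistent with the margins, which is the same B./A. logic used for the randomization tests.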
10. Normal Scores Tests (Why use ranks?)
A. Using ordered expected normal scores rather than ranks
B. Power relative to F - especially with skew
C. Inverse normal scores vs. expected normal scores
D. Random normal scores - a mind boggler
11. Further Topics
A. Difficulty of dealing with interaction
B. Directional vs. nondirectional hypotheses
C. Tests for runs
II. REGRESSION ANALYSIS
1. Example: 1 Way - 2 Group ANOVA done with Correlation
A. The Dummy Predictor
1) Only discriminates between groups
2) Doesn't discriminate scores within groups
B. Because there is only 1 df between groups, only 1 dummy predictor is needed
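
The two-group example can be sketched by correlating the scores with a single dummy predictor and converting r to F. This is an illustration under stated assumptions: the function name and the 1/0 coding are hypothetical choices, and the conversion used is the standard F = r^2 (N - 2) / (1 - r^2) with 1 and N - 2 df.

```python
def dummy_predictor_anova(group1, group2):
    """Two-group one-way ANOVA done with correlation.
    Returns (r, F) where F has 1 and N - 2 degrees of freedom."""
    scores = list(group1) + list(group2)
    # The dummy predictor only tells the groups apart (1 vs. 0);
    # it carries no information about scores within a group.
    dummy = [1.0] * len(group1) + [0.0] * len(group2)
    n = len(scores)
    mean_x = sum(dummy) / n
    mean_y = sum(scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dummy, scores))
    sxx = sum((x - mean_x) ** 2 for x in dummy)
    syy = sum((y - mean_y) ** 2 for y in scores)
    r = sxy / (sxx * syy) ** 0.5
    f = r * r * (n - 2) / (1 - r * r)
    return r, f

r, f = dummy_predictor_anova([1, 2, 3], [4, 5, 6])
print(r, f)
```

For these data a conventional ANOVA gives SS between = 13.5 and MS within = 1, so the F from the correlation matches the usual F ratio; any two distinct code values would give the same F, since r^2 is unchanged by linear recoding of the predictor.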
2. Multigroup 1 Way ANOVA
A. Need as many predictors as df between groups
B. Dummy predictors can be modeled after orthogonal coefficients
C. Multiple regression is required
1) Formula for the F-test
2) PSLC:REGRAN
D. Nonsense predictors also will work -- Why?
3. Factorial ANOVA
A. Dummy Predictor for Each DF
1) Dummy predictors for interaction
2) PSLC:COEFF
B. Complete vs Reduced Model Tests
1) Formula for F-test
2) Necessary for correct error df
3) Even more important for Unequal n
C. Unequal n - Nonorthogonal ANOVA
1) Basic problem can be seen in R-matrix of dummy predictors
2) Overall & Spiegel's 3 Models
3) Why only Model 1 is correct
4) Least squares vs Unweighted Means
4. Analysis of Covariance - ANCOVA
A. Covariate is included in both complete & reduced model
B. Covariate controlled in same manner as overlap is controlled in unequal n
5. Confounded Designs and Regression Analysis
A. Determining what is confounded from the R-matrix
B. Example - Latin Square
6. Repeated Measures Designs
A. Removing subject effects with subject totals
B. Degrees of Freedom can get very tricky
7. Including Continuous (instead of discrete) Independent Variables
A. If discrete, how many levels?
B. If continuous, in what form?
C. Representation of interactions
D. Moderated Multiple Regression (Moderator Effects)
8. Polynomial Regression - and Surface Fitting
(tie this in with testing for interaction with continuous variables)
III. SPECIAL TOPICS
1. Bayesian Statistics
2. Curve Fitting
3. Missing Data in Repeated Measures
4. Life Table & Survival Analysis
5. Sequential Analysis
6. Time Series Analysis - ARIMA Models
7. Log Linear Models
8. Correlation Coefficients
A. Those based on Pearson's r
B. Those based on Kendall's Tau
C. Others e.g. based on Information Theory
9. Distributions of Various Statistics
References
Bradley, J. V. (1968). Distribution-free statistical tests. Englewood Cliffs, NJ: Prentice-Hall.
Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences. New York: Wiley.
Dunlap, W. P. (1972). Three subroutines for dealing efficiently with permutations. Behavior Research Methods & Instrumentation, 4, 159-160.
Dunlap, W. P., & Brown, S. G. (1983). FORTRAN IV functions to compute expected normal scores. Behavior Research Methods & Instrumentation, 15, 395-397.
Dunlap, W. P., Myers, L., & Silver, N. C. (1984). Exact multinomial probabilities for one-way contingency tables. Behavior Research Methods, Instruments, & Computers, 16, 54-56.
Dunlap, W. P. (1985). Hypergeometric tests for 2 x k contingency tables. Behavior Research Methods, Instruments, & Computers, 17, 432-434.
Dunlap, W. P., & May, J. G. (1989). Judging statistical significance by inspection of standard error bars. Bulletin of the Psychonomic Society, 27, 67-68.
Edgington, E. S. (1969). Statistical inference: The distribution-free approach. New York: McGraw-Hill.
Edgington, E. S. (1980). Randomization tests. New York: Marcel Dekker.
Overall, J. E., & Spiegel, D. K. (1969). Concerning least squares analysis of experimental data. Psychological Bulletin, 72, 311-322.
Overall, J. E., Spiegel, D. K., & Cohen, J. (1975). Equivalence of orthogonal and nonorthogonal analysis of variance. Psychological Bulletin, 82, 182-186.
Pedhazur, E. J. (1982). Multiple regression in behavioral research. New York: CBS College Publishing.
Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill.