The notes below are organized parallel to Littell's text:
Hypotheses about one or two means are usually examined with t-tests (PROC MEANS, PROC TTEST). These are a special, restricted case of one-way ANOVA, which is more generally performed with PROC ANOVA or PROC GLM.
3.2.1 One-sample statistics. PROC MEANS can be used to get the sample mean xbar as an estimate of the population mean mu, s as an estimate of the population std. dev., the estimated standard error of the mean (se), the t statistic (xbar/se), its p-value, and a confidence interval (CI), as in the example using the data set PEPPERS.
The default null hypothesis is that the mean is zero. To test that the mean is different from some other value, subtract that value from each score, centering the data on the test value (e.g. if you want to test that the mean differs from 3, subtract 3 from each observation).
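The centering trick can be checked by hand. The sketch below uses plain Python rather than SAS or SPSS, and the scores are made up for illustration (they are not the PEPPERS data). It only reproduces the arithmetic behind the mean, s, se, and t; it does not compute a p-value.

```python
import math
import statistics

def one_sample_t(scores, test_value=0.0):
    """t statistic for H0: population mean == test_value."""
    # Centering on the test value turns the problem into a test against zero.
    centered = [y - test_value for y in scores]
    n = len(centered)
    xbar = statistics.mean(centered)
    s = statistics.stdev(centered)   # sample std. dev. (n - 1 in the denominator)
    se = s / math.sqrt(n)            # estimated standard error of the mean
    return {"mean": xbar + test_value, "s": s, "se": se, "t": xbar / se, "df": n - 1}

# Hypothetical scores; test whether the mean differs from 3.
result = one_sample_t([2.0, 4.0, 4.0, 5.0, 5.0], test_value=3.0)
print(result)
```

The t value would then be compared with a t distribution on n - 1 degrees of freedom, which is the step the statistical packages do for you.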
In SPSS, the frequencies or descriptives procedures handle single means. The best choice, however, is the examine procedure, which has good graphical displays as well as basic statistics and the t-test.
3.2.2. Two related samples. The two means to be compared may be "before and after" for the same subjects, or they may be matched observations in two treatments (randomized blocks - where the blocks are the pair of matched subjects, and one of each pair is randomly assigned to each of two treatments).
Paired two-sample tests examine whether the mean of the differences between the two scores in each block differs from zero. For example, the pretest is subtracted from the posttest to create D, and then one-sample tests are applied to D. An example is done in SAS using the data set PULSE. PROC MEANS can be applied to a computed D, or PROC TTEST can be used with a PAIRED statement of the form "paired pre*post".
In SPSS, one can compute the difference variable and apply descriptives, frequencies, or examine. Alternatively, one can use Analyze, compare means, paired-sample t-test.
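As a rough illustration of the "compute D, then do a one-sample test" logic, here is a small Python sketch. The pre and post values are hypothetical (they are not the PULSE data), and the function name is invented for this example.

```python
import math
import statistics

def paired_t(pre, post):
    """One-sample t test applied to the differences D = post - pre."""
    d = [b - a for a, b in zip(pre, post)]
    n = len(d)
    mean_d = statistics.mean(d)
    se_d = statistics.stdev(d) / math.sqrt(n)
    return mean_d, se_d, mean_d / se_d, n - 1   # mean, std. error, t, df

# Hypothetical before/after scores for four subjects.
pre = [60, 62, 70, 65]
post = [64, 65, 71, 70]
mean_d, se_d, t, df = paired_t(pre, post)
print(mean_d, se_d, t, df)
```

Note that the degrees of freedom are the number of pairs less one, not the number of observations less two.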
3.2.3. Two independent samples. Test the difference between the means of two independent samples. The pooled variance estimate should be used when it is reasonable to assume that the population variances of the two groups are similar; alternatively, separate sample estimates can be used (more conservative). The Folded-F statistic (larger sample variance/smaller sample variance) can be used to test the hypothesis that the variances are equal. Using PROC ANOVA or PROC GLM assumes equal variances. An example is given using the data set BULLET, which examines the velocity of rounds fired with two different types of powder.
In SPSS, use Analyze, compare means, independent samples t-test. This provides the F test for equal variances, along with the two t-tests (pooled and separate variances) and CIs.
The null hypothesis is that the difference is zero. To test that the difference is greater than some quantity, either do the calculation by hand or subtract the quantity from the scores of one group before running the test.
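The pooled test, the separate-variances test, and the folded F differ only in how the two sample variances are combined. A minimal Python sketch, using invented data (not the BULLET data) and an invented `delta` argument for testing against a nonzero difference:

```python
import math
import statistics

def two_sample_tests(x, y, delta=0.0):
    """Pooled and separate-variance t statistics, plus the folded F."""
    n1, n2 = len(x), len(y)
    diff = statistics.mean(x) - statistics.mean(y) - delta
    v1, v2 = statistics.variance(x), statistics.variance(y)
    # Pooled variance: weight each sample variance by its degrees of freedom.
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    # Separate variances, with Satterthwaite's approximate degrees of freedom.
    t_sep = diff / math.sqrt(v1 / n1 + v2 / n2)
    df_sep = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    folded_f = max(v1, v2) / min(v1, v2)   # larger variance over smaller
    return {"sp2": sp2, "t_pooled": t_pooled, "df_pooled": n1 + n2 - 2,
            "t_sep": t_sep, "df_sep": df_sep, "folded_f": folded_f}

# Hypothetical velocities for two powders.
out = two_sample_tests([1, 2, 3], [2, 4, 6, 8])
print(out)
```

In this made-up example the variances are quite different, so the two t statistics and their degrees of freedom diverge noticeably.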
In SAS, PROC ANOVA and PROC GLM are the procedures specific to ANOVA; TTEST, NESTED, and VARCOMP are used for particular kinds of analyses. PROC ANOVA assumes balanced or orthogonal data sets; GLM is more general, so ANOVA is rarely used. MIXED can fit all of the general linear models that GLM fits, but doesn't offer all of the specific output that GLM does. MIXED is replacing GLM. Here we focus on GLM; chapter 4 introduces the mixed model and PROC MIXED.
3.3.1 Terminology and notation. The total sum of squares is partitioned into sums of squares for factors and a residual sum of squares. DF are partitioned in the same way, for factors and residual: the DF for a factor are the number of levels in the classification less one (number of groups on X, less one). Mean squares are sums of squares divided by df. F ratios are MS factor/MS residual with (df1, df2) degrees of freedom, and test the null hypothesis that the population effect underlying the numerator SS is zero.
All analyses in this chapter assume fixed effects.
Sums of squares for a single classification or treatment or X are simply the within and between groups SS. Where there is more than one classification (factor, independent variable), the SS depend on whether the classifications are crossed or nested.
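For a single classification, the whole ANOVA table follows from the between and within sums of squares. The Python sketch below works through the arithmetic on three small, invented groups; the function and key names are illustrative only.

```python
def one_way_anova(groups):
    """Between/within SS, df, mean squares, and F for a one-way layout."""
    scores = [y for g in groups for y in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    # Between groups: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups (residual): scores around their own group mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df": (df_between, df_within), "F": f_ratio}

# Hypothetical data: three groups of three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

The two sums of squares add up to the total SS (32 in this example), and the two df add up to n - 1.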
3.3.1.1 Crossed classifications and interaction SS. Assume that there are two independent variables A and B which are crossed (each level of B occurs for each level of A). If the difference between any two group means on factor A is the same across all levels of B (and, as a necessary consequence, the difference between any two B group means is the same across levels of A), then there is no interaction. But if the differences in A depend on the level of B (or, equivalently, if the differences in means on B depend on the level of A), then there is interaction.
The interaction sum of squares sums across all cells of A*B. For each cell, the contribution is the squared value (weighted by the cell size) of the cell mean minus the relevant mean of A, minus the relevant mean of B, plus the grand mean. Equivalently, it is the cell mean's deviation from the grand mean, less the main effect of A and the main effect of B. The interaction SS has (a-1)(b-1) df.
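A short Python sketch of this calculation follows. It assumes a balanced design (the same number of observations in every cell), and both small data sets are invented: one is purely additive, the other is a crossover pattern.

```python
def interaction_ss(cells):
    """Interaction SS and df for a balanced two-way layout.

    cells[i][j] is the list of scores for level i of A and level j of B.
    """
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])                      # common cell size (balanced design)
    cell = [[sum(c) / len(c) for c in row] for row in cells]
    a_mean = [sum(row) / b for row in cell]
    b_mean = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(a_mean) / a
    ss = sum(n * (cell[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
             for i in range(a) for j in range(b))
    return ss, (a - 1) * (b - 1)

# Additive pattern: the A difference is the same at both levels of B.
additive = [[[0, 2], [1, 3]], [[2, 4], [3, 5]]]
# Crossover pattern: the A difference reverses sign across levels of B.
crossover = [[[0, 2], [2, 4]], [[2, 4], [0, 2]]]
ss_add, df_add = interaction_ss(additive)
ss_cross, df_cross = interaction_ss(crossover)
print(ss_add, ss_cross, df_cross)
```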
3.3.1.2 Nested effects and nested SS. Nesting occurs where different levels of the variable B occur within levels of A. For example, suppose that B were 5 curricula and A were 10 schools, that we were interested in students' reading scores, and that different sets of curricula were introduced at different schools. The nested sum of squares is annotated SS[B(A)], to indicate variation across levels of B that occurs within each level of A (summed across all levels of A). It is calculated by taking each cell mean as a deviation from the mean of its level of A, squaring, weighting by the cell size, and summing across all cells. This measures variation across the treatment levels within each context, then pools across the contexts. One common example of nesting is where the same subject is used across several levels of treatment -- a "repeated measures" approach. Here, the scores on the outcome depend on the treatment, but the treatments are nested within individual subjects.
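The nested sum of squares can be sketched in a few lines of Python. The school and curriculum labels and the scores below are invented; the point is only that each curriculum mean is compared with the mean of its own school, never with the grand mean.

```python
def nested_ss(data):
    """SS[B(A)] and its df; data maps each A level to {B level: [scores]}."""
    ss, df = 0.0, 0
    for b_groups in data.values():
        scores_a = [y for g in b_groups.values() for y in g]
        a_mean = sum(scores_a) / len(scores_a)        # mean of this level of A
        for g in b_groups.values():
            cell_mean = sum(g) / len(g)
            ss += len(g) * (cell_mean - a_mean) ** 2   # weighted squared deviation
        df += len(b_groups) - 1                        # (levels of B within this A) - 1
    return ss, df

# Hypothetical data: two schools, each using its own two curricula.
data = {
    "school1": {"cur1": [1, 3], "cur2": [5, 7]},
    "school2": {"cur3": [2, 4], "cur4": [2, 4]},
}
ss_b_in_a, df_b_in_a = nested_ss(data)
print(ss_b_in_a, df_b_in_a)
```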
3.3.2 Using ANOVA and GLM. The CLASS and MODEL statements are required. MEANS produces a table of means corresponding to the specified effects, and does protected t-tests. TEST is used to substitute some other SS for SS residual in tests. MANOVA and REPEATED produce specialized output for multiple dependent variables (chapter 9) and repeated measures (chapter 8). CONTRAST and ESTIMATE test combinations of means, and estimate the effects of these contrasts. LSMEANS is used to calculate least-squares (adjusted) means for unbalanced data. Try running BULLET (one-way, two levels) in PROC ANOVA and PROC GLM. In SPSS, look at analyze, compare means, one-way ANOVA; also analyze, general linear model.
3.3.3 Multiple comparisons and preplanned comparisons. F tests whether the means (or adjusted means) on A differ. We often want to know which means differ from which others (where there are more than two levels of A). Various post-hoc multiple comparisons can be done that provide different protections against experiment-wise error. These comparisons map which means differ from which others. If one has a specific prior hypothesis about which mean or means ought to differ from which other mean or means, a test of this specific hypothesis about the multiple means can be implemented with a contrast. The tests of contrasts assume that the comparison in question was planned -- not a post-hoc summary or simplification of all multiple means.
Important: think of each level of X (each group) as a sample from a population stratified by the grouping variable.
Assume equal numbers of cases in each group, a normal distribution in each group, and variances that are the same across groups. Express this as a linear model: Yij = Mi + eij; the score of an individual is equal to the mean of the group, plus an error. This is the means model. Alternatively, an effects model expresses Yij = M + Ti + eij; the score of an individual is equal to the grand mean, plus a deviation of the group mean from the grand mean, plus individual error.
Estimates are quite robust against violation of normality and equal variances, but these should be diagnosed with univariate statistics and plots. Failure to display equal variance and normality can sometimes be fixed by transformations. Or, one may want to use a generalized linear model that assumes a different distribution of Y.
3.4.1 Computing the ANOVA table. An example is given using the data set VENEER, which describes the degree of wear of samples of synthetic wood veneers from five manufacturers. It is analyzed with GLM, including a test for homogeneity of variance. In SPSS, the same analysis can be performed using analyze, general linear model -- treating the classification variable (brand) as a fixed factor.
3.4.2 Computing means, multiple comparisons, and confidence intervals. Example of printing means, tests for differences with LSD method, and confidence intervals. This can be done with options in SPSS GLM.
3.4.3. Planned comparisons (contrasts) for one-way classification. One-way custom comparisons can be done in SPSS analyze, compare means, one-way ANOVA, but the factor must be coded as a numeric variable with sequentially valued levels. In SAS, GLM is used: each contrast is given a label, then the variable involved, and then the vector of weights.
3.4.4 Linear combinations. The sum of the weights in a contrast should be zero. Contrasts for one-way classifications are straight-forward because they are defined in terms of means. For more complicated models, though, GLM defines the contrasts in terms of effects, which can be more complicated. SPSS GLM has a number of common contrasts (sequential, adjacent, linear and polynomial trends) built in; but does not seem to provide a simple way to specify custom contrasts.
3.4.5 Testing several contrasts simultaneously. A planned comparison can be more complicated and involve more than one equation or degree of freedom. Suppose we wanted to test the idea that three (of five) means were the same. This translates into testing 1 versus 2 and 1 versus 3 simultaneously, with weights 1 -1 0 0 0 and 1 0 -1 0 0. The two sets of weights are separated with a comma on the CONTRAST statement, and the test is done as a single test with two degrees of freedom.
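To see what a two-degree-of-freedom test computes, the sketch below evaluates the hypothesis sum of squares for two contrasts at once in plain Python. The group means, group sizes, and residual mean square are all invented, and the 2x2 matrix inverse is written out by hand so that no extra library is needed. This shows the general quadratic-form calculation for such tests, not a transcript of what GLM does internally.

```python
def simultaneous_contrast_ss(means, ns, c1, c2):
    """Hypothesis SS (2 df) for testing two contrasts of group means together."""
    # Estimated value of each contrast.
    e1 = sum(c * m for c, m in zip(c1, means))
    e2 = sum(c * m for c, m in zip(c2, means))
    # Variances and covariance of the two estimates, up to the factor sigma^2.
    v11 = sum(c * c / n for c, n in zip(c1, ns))
    v22 = sum(c * c / n for c, n in zip(c2, ns))
    v12 = sum(a * b / n for a, b, n in zip(c1, c2, ns))
    det = v11 * v22 - v12 ** 2
    # Quadratic form e' V^{-1} e, using the explicit inverse of a 2x2 matrix.
    return (v22 * e1 ** 2 - 2 * v12 * e1 * e2 + v11 * e2 ** 2) / det

means = [2.0, 3.0, 6.0, 4.0, 5.0]   # hypothetical group means
ns = [3, 3, 3, 3, 3]                # hypothetical group sizes
mse = 1.0                           # hypothetical residual mean square
ss_h = simultaneous_contrast_ss(means, ns, [1, -1, 0, 0, 0], [1, 0, -1, 0, 0])
f_stat = (ss_h / 2) / mse           # two numerator degrees of freedom
print(ss_h, f_stat)
```

With equal group sizes, this SS equals the between-groups SS that would be obtained from the first three groups alone, which is one way to check the result.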
3.4.6. Orthogonal contrasts. When testing multiple hypotheses about a set of means, it is useful if the tests are statistically independent, so that each test uses its own separate degree of freedom. Two contrasts (each with weights that sum to zero) are orthogonal when the sum of the products of their corresponding coefficients is zero; this simple rule assumes equal group sizes.
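The orthogonality rule is easy to check mechanically. A small Python sketch, assuming equal group sizes and using illustrative function names:

```python
def is_contrast(weights, tol=1e-9):
    """A contrast's weights sum to zero."""
    return abs(sum(weights)) < tol

def are_orthogonal(w1, w2, tol=1e-9):
    """With equal group sizes, orthogonal contrasts have a zero sum of products."""
    return abs(sum(a * b for a, b in zip(w1, w2))) < tol

c1 = [1, -1, 0, 0, 0]    # group 1 versus group 2
c2 = [1, 0, -1, 0, 0]    # group 1 versus group 3
c3 = [1, 1, -2, 0, 0]    # average of groups 1 and 2 versus group 3

print(are_orthogonal(c1, c2))   # sum of products is 1, so not orthogonal
print(are_orthogonal(c1, c3))   # sum of products is 1 - 1 + 0 = 0
```

The pair used for the simultaneous test in 3.4.5 is therefore not orthogonal, which is fine for a joint test but means the two single-df tests are not independent.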
3.4.7 Estimating linear combinations of parameters with the ESTIMATE statement.
If one wants, for example, the value of the difference between two means, as
well as a test of whether this difference is zero, one could use:
estimate 'ACME versus AJAX' brand 1 -1 0 0 0;
This yields the difference between the first two means (effects), a standard
error, t and p level. The weights selected do affect the value of the
contrast; they do not affect the significance of the test that the value of the
contrast is zero.
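The claim about weights can be verified numerically. In the Python sketch below, the group means, group sizes, and residual mean square are invented for illustration; doubling the weights doubles both the estimate and its standard error, so t is unchanged.

```python
import math

def estimate(weights, means, ns, mse):
    """Value, standard error, and t for a linear combination of group means."""
    value = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(mse * sum(w * w / n for w, n in zip(weights, ns)))
    return value, se, value / se

means = [2.0, 3.0, 6.0, 4.0, 5.0]   # hypothetical group means
ns = [3, 3, 3, 3, 3]                # hypothetical group sizes
mse = 1.0                           # hypothetical residual mean square

v1, se1, t1 = estimate([1, -1, 0, 0, 0], means, ns, mse)
v2, se2, t2 = estimate([2, -2, 0, 0, 0], means, ns, mse)
print(v1, se1, t1)
print(v2, se2, t2)
```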
It is useful to think about contrasts not as differences in means, but as
differences in effects. Suppose that one wanted to estimate the average of
the means of the first three (of five) groups, and see if that value differed
from zero. Think about each mean as an effect, or a difference from a
grand mean (or intercept). Then:
estimate 'US Mean' intercept 1 brand .333 .333 .333 0 0;
This says to include the grand mean and then equally weight the effects of the
first three groups (and ignore the last two). The resulting estimated value of
the contrast is the grand mean plus 1/3 of the first effect, plus 1/3 of the
second effect, plus 1/3 of the third effect. GLM computes a proper standard
error and t test.
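One way to convince yourself that "intercept plus one third of each of three effects" is the average of the three means is to try it under two different ways of defining the effects. The Python sketch below uses invented group means and exact thirds rather than .333; both parameterizations are shown only for illustration.

```python
def estimate_value(intercept, effects, w_intercept, w_effects):
    """Linear combination of the intercept and the group effects."""
    return w_intercept * intercept + sum(w * e for w, e in zip(w_effects, effects))

means = [2.0, 3.0, 6.0, 4.0, 5.0]          # hypothetical group means
weights = [1 / 3, 1 / 3, 1 / 3, 0, 0]      # exact thirds on the first three groups

# Parameterization 1: the last group is the reference (its effect is zero).
int_ref = means[-1]
eff_ref = [m - int_ref for m in means]

# Parameterization 2: effects are deviations from the grand mean.
grand = sum(means) / len(means)
eff_dev = [m - grand for m in means]

value_ref = estimate_value(int_ref, eff_ref, 1, weights)
value_dev = estimate_value(grand, eff_dev, 1, weights)
print(value_ref, value_dev, sum(means[:3]) / 3)
```

The two answers agree because the intercept weight (1) equals the sum of the effect weights, so whatever is added to the intercept is subtracted back out of the effects.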
The complete randomized blocks design is one of the most common and most important basic designs for "quasi" experiments. Blocks are very often multiple locations at which research is being conducted simultaneously on the effects of X on Y. In a complete design, each level of X is implemented in each location. For example, suppose that we had five reading curricula (X) and we thought these differed in their effects on reading comprehension (Y). We implement all five curricula in each of 8 schools. We are interested in comparing the five means; the 8 schools are needed to find subjects, but may differ in ways that we don't care about -- but which also may be related to reading comprehension (e.g. social class differs by school).
The randomized blocks design lets us partition out the block-wise variation (e.g. school to school variation), and then test for differences in the treatment means against the reduced error sum of squares. It is the most efficient such design because there is one trial of each treatment in each block, which makes the blocking variable and the treatment variable orthogonal while allowing sufficient observations of each to estimate means.
3.5.1 In the example data set METHODS, five methods of irrigation (basin, flood, spray, sprinkler, trickle) are implemented in each of eight locations (e.g. portions of a field that may have similar light and soil conditions), and the weight of fruit from each of the forty observations is recorded.
The analysis of this is extremely simple. Use GLM in either SAS or SPSS. Specify both bloc and treatment as independent variables. Use type III SS (simultaneous). Specify no interaction effect.
In the ANOVA summary table, note that the effects of the control variable (bloc) are removed from the error SS. So, the tests for differences due to treatment (method), exclude the block variation. Very simply, we are controlling for mean differences across blocks in testing for differences due to method.
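The effect of removing block variation from the error term can be seen in a small worked example. The Python sketch below uses an invented 3-block by 3-treatment layout (not the METHODS data) with one observation per cell, and compares the treatment F with and without the block term.

```python
def rcb_anova(y):
    """Randomized complete blocks ANOVA; y[block][treatment] is one observation."""
    b, t = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (b * t)
    block_mean = [sum(row) / t for row in y]
    trt_mean = [sum(y[i][j] for i in range(b)) / b for j in range(t)]
    ss_block = t * sum((m - grand) ** 2 for m in block_mean)
    ss_trt = b * sum((m - grand) ** 2 for m in trt_mean)
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_error = ss_total - ss_block - ss_trt          # block variation removed
    f_blocked = (ss_trt / (t - 1)) / (ss_error / ((b - 1) * (t - 1)))
    # For comparison: ignore blocks, so block variation stays in the error term.
    f_unblocked = (ss_trt / (t - 1)) / ((ss_total - ss_trt) / (t * (b - 1)))
    return {"ss_block": ss_block, "ss_trt": ss_trt, "ss_error": ss_error,
            "F_blocked": f_blocked, "F_unblocked": f_unblocked}

# Hypothetical yields: rows are blocks, columns are treatments.
res = rcb_anova([[1, 3, 5], [2, 4, 9], [3, 8, 10]])
print(res)
```

In this made-up case the treatment F is much larger once the block differences are taken out of the error sum of squares.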
3.5.2 Additional multiple comparison methods. Contrasts could be used. But if we have no prior theory, we may want to do protected comparisons. Different methods (e.g. LSD, Duncan, Tukey, Waller) are more or less sensitive to type I errors (incorrectly concluding that there is a difference of means) and type II errors (incorrectly concluding that there is no difference).
3.5.3 Dunnett's test. This multiple comparison is best when one group is considered a control group, and we are interested in testing other groups against it.
The Latin square design is closely related to the full randomized blocks design, but has two blocking factors rather than one. The two blocking factors are orthogonal to one another by design -- each level of each blocking factor occurs exactly once with each level of the other blocking factor (e.g. if each blocking factor has 3 levels, there are then 9 unique combinations of blocking factors). The treatments are applied in such a way as to be orthogonal to the combined blocking factors (i.e. each treatment occurs once at each level of each blocking factor). In this way, the two control variables and the treatment variable are all orthogonal, and a single observation per cell may be used.
The text example (see data set GARMENTS.sav) involves two outcomes -- weight loss and shrinkage of four materials (treatment). Four heat settings are applied (position) on each of four runs of the experiment (run). Each material is placed in a position on each of the runs, orthogonalizing the design.
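The defining property of a Latin square (each treatment once in every row and once in every column) can be checked with a few lines of Python. The cyclic construction below is just one convenient way to build a square for illustration; it is not necessarily the layout used in the GARMENTS example.

```python
def cyclic_latin_square(t):
    """A t x t square in which cell (row, col) gets treatment (row + col) mod t."""
    return [[(r + c) % t for c in range(t)] for r in range(t)]

def is_latin_square(square):
    """True if every treatment occurs exactly once in each row and each column."""
    t = len(square)
    full = set(range(t))
    rows_ok = all(len(row) == t and set(row) == full for row in square)
    cols_ok = all({square[r][c] for r in range(t)} == full for c in range(t))
    return rows_ok and cols_ok

# Rows could stand for runs, columns for positions, entries for materials.
square = cyclic_latin_square(4)
for row in square:
    print(row)

t = 4
# Degrees of freedom: rows, columns, treatments, and what is left for error.
df_error = (t * t - 1) - 3 * (t - 1)
print(is_latin_square(square), df_error)
```

With four levels of each factor there are 16 observations, and the 15 total df split into 3 for each of the three factors and 6 for error.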
The problem is run in SAS GLM using the same simple model statement, but with the two dependent variables listed on the left hand side and no interaction on the right hand side. In SPSS, analyze, GLM, multivariate needs to be used, and the model should be specified to exclude interaction (the default is full factorial). As a result, the SPSS output also gives multivariate tests, as well as the univariate tests for each of the two outcomes.
Choosing a design (e.g. randomized blocks) is primarily a matter of controlling error, and is called the experimental design. Choosing treatments has to do with the hypothesis to be tested, and is often called the treatment design. Factorial treatment designs are among the most widely used, and can be combined with almost any error design.
A factorial experiment has several factors, and its treatments comprise all combinations of the levels of those factors (e.g. two factors with two levels each give rise to four treatments).
Factorial designs allow the assessment of: simple effects: effects of A holding B constant; interactions: differences in the effects of A across levels of B; and main effects: the overall effect of A averaged across levels of B.
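These three kinds of effects can be read directly off a table of cell means. The Python sketch below uses an invented 2 x 3 table (two levels of A, three levels of B); with more than two levels of A the same idea applies to each pair of A levels.

```python
def factorial_effects(cell_means):
    """Simple effects, main effect, and interaction for a 2 x b table of means.

    cell_means[0] and cell_means[1] hold the means for the two levels of A
    at each level of B.
    """
    low, high = cell_means
    # Simple effects: the A difference holding B constant, one per level of B.
    simple = [h - l for l, h in zip(low, high)]
    # Main effect: the A difference averaged across the levels of B.
    main = sum(simple) / len(simple)
    # Interaction: how far each simple effect departs from the main effect.
    interaction = [s - main for s in simple]
    return simple, main, interaction

cell_means = [[10.0, 12.0, 14.0],   # hypothetical means for A level 1
              [11.0, 15.0, 19.0]]   # hypothetical means for A level 2
simple, main, interaction = factorial_effects(cell_means)
print(simple, main, interaction)
```

If every entry of the interaction list were zero, the simple effects would all equal the main effect and the main effect alone would summarize A.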
An example is provided in which there are two treatment variables -- five varieties of seed (variety) and three growth methods (method), so there are 15 conditions. In this example, six replications (pots) are done in each condition, and the dependent variable yield recorded for each of the 90 (15x6) observations. The data set is called GRASSES.sav.
3.7.1 ANOVA for the two-way factorial. One issue is the presence of interaction. If interaction is present, then simple effects are used to evaluate the treatments; if interaction is absent, the average or main effects are appropriate. It is useful to plot the means for each of the 15 treatments -- say as five means for varieties for each of three methods. SAS PROC GLM or SPSS analyze, GLM, univariate can be used to get sums of squares, F tests, etc.
Note that the example is a balanced design, so type III sums of squares are the same as type I. This is not so if the data are unbalanced.
3.7.2 Multiple comparisons. In SAS, the MEANS statement can be used to compare and test effects for each factor when there is no interaction; the LSMEANS statement must be used where there is interaction. This approach amounts to comparing simple effects, or looking at the means for, say, method, within each variety. Simple and powerful options on the LSMEANS statement in SAS allow F tests for the effect of A at each level of B. In SPSS, there is less control, but most comparisons (though not tests) look to be possible. In SPSS, select either the main effects model or the interaction model, and estimate it. Then ask for post hoc comparisons. Where the interaction model is selected, post hoc comparisons are automatically done on the levels of A for each level of B (but not for the main effects).
3.7.3 Multiple comparisons of Method means by variety. This section shows how to get tests for all possible mean differences of simple effects of A (for each level of B) and for B (for each level of A) using the lsmeans command in GLM. PROC MIXED is introduced to do the same job, and has the same syntax.
3.7.4 Planned comparisons (contrast and estimate). The purpose of this and subsequent sections is to show how to set up contrasts to get effect estimates and F tests in the case with multiple treatments. Think of the score of an individual as their cell mean plus individual error or residual (SS within). Think of the cell mean as the grand mean plus an effect of A, plus an effect of B, plus an interaction effect of the specific combination A*B in that cell.
Contrasts are specified in terms of model effects. To create the weights you want, first state the linear combination in terms of means, then convert the means to model parameters.
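The conversion from weights on cell means to weights on model parameters is mechanical once the cell mean is written as intercept + A effect + B effect + A*B effect. The Python sketch below does the bookkeeping for a small, hypothetical 2 x 3 layout; the ordering of the A*B coefficients (B varying fastest within A) is an assumption made for this example and should be checked against the order in which the software lists the cells.

```python
def means_to_parameters(w):
    """Convert weights on cell means into weights on model parameters.

    w[i][j] is the weight on the cell mean for level i of A and level j of B.
    """
    a, b = len(w), len(w[0])
    return {
        "intercept": sum(sum(row) for row in w),
        "A": [sum(w[i]) for i in range(a)],                      # row totals
        "B": [sum(w[i][j] for i in range(a)) for j in range(b)], # column totals
        "A*B": [w[i][j] for i in range(a) for j in range(b)],    # the cell weights
    }

# Simple effect: A level 1 versus A level 2, within the first level of B only.
simple = means_to_parameters([[1, 0, 0], [-1, 0, 0]])
# Main effect: A level 1 versus A level 2, averaged over the three levels of B.
main = means_to_parameters([[1 / 3] * 3, [-1 / 3] * 3])
print(simple)
print(main)
```

Both contrasts put 1 and -1 on the A effects; they differ only in the interaction coefficients, which is why the interaction terms cannot be left out when a model includes A*B.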
3.7.5 Simple effects comparisons (that is, writing a contrast to look at differences in the means of A within each level of B). For example, compare the mean for the first variety versus the other two. The statements are too tedious to reproduce here.
3.7.6 Main effect comparisons (that is, contrasts on the average effects of A across all levels of B).
3.7.7 Simultaneous contrasts for two-way classifications
3.7.8 Comparing levels of one factor within sub-groups of another.
3.7.9 An Easier Way to set up Contrast and Estimate Statements. Shows the design matrix, which can be used to map out contrasts. SPSS has built in functions for simple contrasts on either main effects (if the model specified does not include interaction) or simple effects (if the model specified has interaction). Custom contrasts do not appear to be available. SPSS will, if requested, show the design matrix.