Multivariate linear models are, in one sense, like repeated measures models -- in both cases, it may be helpful to think of the analysis as involving multiple dependent measures. In repeated measures analysis, the multiple dependent variables are the same measure, repeated (usually over time). In multivariate models, the multiple dependent variables are measures of multiple outcomes, usually measured with different metrics, and usually measured at the same point in time. For example, a repeated measures analysis might be used to analyze weight measured at three successive months; a multivariate analysis might be used to analyze weight, height, and girth measured at a single point in time.
It is also possible to conceive of an analysis that is both multivariate and involves repeated measures (e.g. examining weight, height, and girth at each of three successive months).
One might approach multivariate outcomes by examining each outcome separately. There are dangers with this, however. One problem is that we are doing, in effect, multiple comparisons on the same data, so the significance level of any one test is not all that it appears to be: the chance of at least one false positive across the set of tests is higher than the nominal level of each. More importantly, we usually expect the multiple outcomes to be correlated, and we may wish to determine whether each outcome is affected by the treatment factors independently of the other outcomes.
The general strategy is to first perform an overall test of the effects of X on all Y simultaneously. If there is a global effect on all Y, then one proceeds to examine each Y.
This chapter gives a sampling of some fairly typical variations of multivariate analyses.
Students in the classes of three teachers are each given two exams.
One-way analysis of each exam (score1, score2) by teacher (teach) finds no significant difference for either.
Repeat the analysis, but add the statement:
manova h=teach / printh printe;
This prints the hypothesis and error variance-covariance matrices. Also produced is a partial correlation matrix, which gives the correlation of the errors of the two tests, controlling for the independent variable teacher. In the example, we see that the two dependent variables are highly correlated, controlling for teacher (.94).
The roots and vectors results show the first factor of the dependent variables has 92% of the variance. Alternative tests for a teacher effect on the first root are all significant -- even though the separate tests for each dependent variable are not.
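Put together, the analysis described above might look like the following sketch. The variable names teach, score1, and score2 come from the text; the data set name teachers is an assumption made here for illustration.

```sas
/* Sketch only: the data set name "teachers" is assumed, not taken from the notes */
proc glm data=teachers;
   class teach;                       /* teacher is a classification variable */
   model score1 score2 = teach;       /* one-way ANOVA for each exam score */
   manova h=teach / printh printe;    /* multivariate test of the teacher effect,
                                         printing the hypothesis (H) and error (E)
                                         matrices and the partial correlations */
run;
quit;
```

Leaving nouni off the model statement keeps the separate univariate tests in the output, so they can be compared with the multivariate result.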
This is a different application than the previous section.
Suppose we have multiple supposedly parallel measures of some variable (e.g. repeated tests, sub-tests, etc.) and we want to know if the mean scores across the tests differ. We can't simply do multiple pairwise t-tests, because these ignore the fact that the same subjects are measured on every test.
model test1 test2 test3 test4 = / nouni; identifies four dependent variables, but suppresses the univariate analysis.
manova h = intercept identifies the intercept as the effect of interest for the manova test. The same statement continues with:
m = test2-test1, test3-test1, test4-test1 mnames=diff2 diff3 diff4 / summary; this defines a transformation m of the dependent variables into three differences, named diff2, diff3, and diff4. If the hypothesis is supported, each of these should be zero -- that is, the means for the four tests are the same.
Manova produces one root, tests the overall equality of the means, and then
tests the difference of each mean against the reference mean.
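Assembled into one step, the statements for this test of equal means might be written as in the sketch below. The data set name tests is an assumption; the variable names and options are those given in the text. Note that m= and mnames= are options of the manova statement, so they belong before its closing semicolon.

```sas
/* Sketch only: the data set name "tests" is assumed for illustration */
proc glm data=tests;
   model test1 test2 test3 test4 = / nouni;   /* intercept-only model, no univariate output */
   manova h=intercept                         /* test the intercept (the means) */
          m=test2-test1,                      /* three differences from the reference test1 */
            test3-test1,
            test4-test1
          mnames=diff2 diff3 diff4            /* names for the transformed variables */
          / summary;                          /* ANOVA table for each difference */
run;
quit;
```

Any set of three linearly independent differences gives the same overall test; choosing test1 as the reference only affects which individual differences appear in the summary output.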
This example shows three related outcomes (weight of seed, bract, and lint of cotton bolls). There are two treatment variables, two varieties and two spacings, which are crossed, with five plants in each of the four conditions. Two bolls are picked from each plant. The error term for testing the effects of variety and spacing is the variation across plants (not the total variation across bolls). Just as the test statement is used to specify the correct error term in the univariate tests, the same error term is specified in the manova statement, using the e= option.
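A sketch of how the plant-level error term might be specified follows. The data set name cotton and the variable names variety, spacing, plant, seed, bract, and lint are assumptions chosen to match the description; the actual names in the course data set may differ.

```sas
/* Sketch only: data set and variable names are assumed for illustration */
proc glm data=cotton;
   class variety spacing plant;
   /* plants are nested within each variety-by-spacing condition */
   model seed bract lint = variety spacing variety*spacing
                           plant(variety spacing);
   /* univariate tests using plant-to-plant variation as the error term */
   test h=variety spacing variety*spacing
        e=plant(variety spacing);
   /* multivariate tests using the same error term */
   manova h=variety spacing variety*spacing
          e=plant(variety spacing);
run;
quit;
```

Without the e= option, both statements would use the residual (boll-to-boll) variation as the error term, which is not the appropriate error for the treatment effects in this design.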
Returning to the earlier Oranges data, the quantities sold of two competing varieties of oranges are analyzed as a consequence of the price of each. Observations are made across stores on different days of the week. Here, manova is used to test hypotheses about all four independent variables (both the class variables, store and day, and the continuous covariates, the prices of each variety).
The error SSCP and partial correlation matrices are useful: they show that the errors of the two outcomes have a partial correlation of .139. The output then provides tests of the multivariate effect of each variable.
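The oranges analysis could be set up along the lines of the sketch below. The names oranges, store, day, p1, p2, q1, and q2 are assumptions made here (p for price, q for quantity sold of each variety), not names confirmed by the notes.

```sas
/* Sketch only: data set and variable names are assumed for illustration */
proc glm data=oranges;
   class store day;                        /* class variables */
   model q1 q2 = store day p1 p2 / nouni;  /* prices enter as continuous covariates */
   manova h=store day p1 p2 / printe;      /* multivariate test of each effect, with
                                              the error SSCP and partial correlations */
run;
quit;
```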
Contrasts to test hypotheses about the multivariate effects can also be constructed. That is, specific hypothesis tests about the effects of X on the mean of the vector of the dependent variables can be tested.
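As an illustration, a contrast for the three-teacher example might be set up as in this sketch. The data set name teachers, the contrast label, and the choice of comparison are assumptions; the coefficients follow the sorted order of the teach levels, so they should be checked against the class level information in the output. The contrast statement is placed before the manova statement so that the contrast is tested as part of the multivariate analysis.

```sas
/* Sketch only: data set name, label, and contrast coefficients are assumed */
proc glm data=teachers;
   class teach;
   model score1 score2 = teach / nouni;
   /* compare the first and second teacher on the vector (score1, score2) */
   contrast 'teacher 1 vs teacher 2' teach 1 -1 0;
   manova h=teach;
run;
quit;
```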
The multivariate model is an extension of the univariate model. The dependent variable Y is now an n by k matrix, with a vector of k outcomes for each of the n subjects. The X matrix is the standard n by m matrix of subjects by variables scores on X. The B matrix is now m by k -- that is, a vector of effects for each dependent variable. The error term U is now n by k -- that is, residuals for each individual on each dependent variable.
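In matrix form, using the dimensions given above (n subjects, m columns of X, k dependent variables), the model can be written as:

```latex
\[
\underset{n \times k}{Y} \;=\; \underset{n \times m}{X}\;\underset{m \times k}{B} \;+\; \underset{n \times k}{U}
\]
```

Each column of B holds the effects for one dependent variable, so with k = 1 this reduces to the usual univariate linear model.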
What is analyzed in multivariate tests is the first eigenvector, or the factor weighted linear combination of the dependent variables.
A function of the corresponding eigenvalue, gamma / (1 + gamma), is the canonical correlation (squared), or the proportion of the variance in this combination of the dependent variables that is accounted for by the independent variables.
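Written out, with H and E denoting the hypothesis and error SSCP matrices (the matrices printed by printh and printe) and gamma the largest characteristic root of E^{-1}H, the relationship is:

```latex
\[
\rho^2 \;=\; \frac{\gamma}{1 + \gamma},
\qquad \text{where } \gamma \text{ satisfies } \det\!\left(E^{-1}H - \gamma I\right) = 0 .
\]
```

For instance, a root of gamma = 3 would correspond to a squared canonical correlation of 3 / 4 = .75.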
Multiple criteria for assessing goodness of fit are provided, and are more or less useful depending on the error term appropriate to the particular model being studied.
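For reference, the four criteria usually reported by manova can all be expressed in terms of the characteristic roots gamma_1 >= gamma_2 >= ... >= gamma_s of E^{-1}H, where H and E are the hypothesis and error SSCP matrices. These are the standard textbook definitions rather than something specific to these notes:

```latex
\begin{align*}
\text{Wilks' lambda:} \quad & \Lambda = \prod_{i=1}^{s} \frac{1}{1 + \gamma_i} \\
\text{Pillai's trace:} \quad & V = \sum_{i=1}^{s} \frac{\gamma_i}{1 + \gamma_i} \\
\text{Hotelling-Lawley trace:} \quad & U = \sum_{i=1}^{s} \gamma_i \\
\text{Roy's greatest root:} \quad & \Theta = \gamma_1
\end{align*}
```

When there is only one nonzero root, as in the test of equal means across parallel measures, all four criteria lead to the same F test.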
Outputs 9.1 through 9.3, data set teachers2.sas
Outputs 9.4 through 9.5, data set rats.sas
Outputs 9.6 through 9.9, data set cotton2.sas
Outputs 9.10 through 9.11, data set oranges2.sas