SAS CODES

Regression Projects


The analysis guidelines suggest certain research questions and analyses to help you achieve the study objective. If you wish, you may propose your own questions and carry out somewhat different analyses. For the project you choose, you may work individually or choose to work with one other person and submit a joint project. If you choose to work jointly, both people will receive the same grade for the project and it is up to both partners to make sure that the work is shared equitably. Groups larger than two are not permitted. Your project should be submitted in the form of a paper (typed) which describes the problem, the analyses you carried out, and your findings. Your SAS results should appear as tables or figures, either in an Appendix, or in the text (for the most important ones). All output should be clearly labelled, and edited to remove non-essential portions. Do not simply tack on a pile of output through which the reader will have to wade to find the relevant analyses. As a general rule, do not include any output which you do not mention, describe or discuss in the text of your paper.

Regression Project Report

The quality of the written report on your analysis will figure prominently in the grading. You will be marked to some extent on what would be expected in a professional consultation or research report. You may choose to deviate somewhat from the following outline, but this is roughly what I expect. (See Ehrenberg, 1982, "Writing Technical Papers or Reports", The American Statistician, 36, 326-329, for further suggestions.)

  1. Abstract. This should consist of a brief statement (a paragraph) summarizing the purpose or goal of your study, and the results and what they mean, in substantive rather than statistical terms. As Shakespeare said, brevity is the soul of wit: be brief and to the point, stimulating your reader's interest.
  2. Introduction. This should include a clear statement of your view of the scientific question(s) of interest in connection with your data. The goals of the statistical analysis should be clear to all who read this section. Because there will be notable variability in the amount of background information available with the data sets, there will necessarily be variability in the levels of detail appropriate for the analyses. (Background reading in the applied subject area will be useful in some cases.)

    Begin with a statement of a problem or question to be investigated, then introduce the data.

    Following the introduction of the problem or question, try to provide motivation for the analysis to follow:

    • What kind of data is relevant to studying the question?
    • Which variables might be expected to be relevant predictors? Why?
    A formal research study might be expected to state explicit hypotheses. You may not have any, but at least try to suggest which variables might be expected to be useful, relevant or important predictors.
  3. Methodology. The methodology, results, and discussion should focus on the question you studied, rather than on a catalog of steps of analysis or of tables of results.

    Describe here the models and methods used to analyse your data. If necessary, explain why the methods are appropriate and how you looked at the data to examine assumptions, justify transformations, etc. Be specific. ("MPG was transformed to -1/MPG, and log(PRICE) was used to achieve a linear relation" rather than "some variables were transformed".) What methods were used to select your final model? Omit description of general aspects of statistics; assume your reader knows what a multiple regression model is.

  4. Results. Results of analyses should be presented simply and clearly. Use graphical displays, tables, and discussion. Undigested computer output is not appropriate here. Reference to figures, tables or labelled points in computer output in an appendix may be useful. It is not necessary to write out formal tests of hypotheses as you would on an assignment; this is supposed to be a research report.
  5. Interpretation and/or Discussion. The questions to be addressed are: Does the prediction equation make sense? Do the predictors seem reasonable? What do the signs and magnitudes of the regression coefficients mean?

    This section should also describe the scientific and statistical issues raised by the results described in the preceding sections. Suggestions for further analysis or other data are appropriate. Summarize (again) your conclusions about the scientific questions and back up your assertions with references to your Results: graphs, tables, etc.

  6. Appendices. There can be one or more appendices. You will probably want to include at least one appendix logging the details of the computations. The output should be trimmed and annotated. You do not need to include the raw data.

    All output should be labeled (e.g., Fig. 1, Table 10, or Exhibit 2) and referred to in the text (e.g., "see Exhibit 2"). Output should be summarized where possible or compressed (deleting uninformative portions), rather than just inserted whole, especially where it deals with subsidiary issues.

  7. Style. I do not insist on any specific formatting style, but here are some tips:
    • Use headings for each section and subsection consistently and effectively, to communicate the structure of your ideas as well as their content. You need not follow the headings suggested above slavishly.
    • Make it easy for the reader by referring to the specific page, table, figure, etc. in the text to indicate where the relevant information may be found.
    • Cite sources (where appropriate) in the text as Author (year), putting bibliographic sources in a References section at the end (before the Appendices).

Experiment Design and Analysis Projects