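Usage of and differences between CONTRAST, MEANS, and LSMEANS in SAS PROC GLM?

jiangshq · April 27, 2007

When doing analysis of variance we often need pairwise comparisons. Most of us know how to do pairwise comparisons in a one-way ANOVA. But for two-factor designs, such as factorial designs or repeated-measures ANOVA, how do we carry out pairwise comparisons that involve interaction effects? SAS provides the CONTRAST, MEANS, and LSMEANS statements, among others. How do these statements differ in use? I'd appreciate guidance from the experts.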

rtist · May 2, 2007

CONTRAST: tests any linear contrast, not necessarily a comparison of means

LSMEANS: estimates least squares means only, optionally with comparisons

MEANS: I haven't found it to be useful

http://cos.name/bbs/read.php?tid=5505
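To make "any linear contrast" concrete, here is a minimal Python sketch of the arithmetic behind a contrast in a one-way layout. It is not SAS code and not SAS output. The data, the function name `contrast_estimate`, and the coefficient sets are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical one-way data: three levels of a factor A (not from this thread).
groups = {
    1: [10.0, 12.0, 14.0],
    2: [20.0, 22.0, 24.0],
    3: [15.0, 17.0, 19.0],
}

def contrast_estimate(groups, coeffs):
    """Return (estimate, standard error, t statistic) for sum(c_i * mean_i)."""
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    # Pooled within-group variance (error mean square of the one-way model).
    sse = sum((y - means[g]) ** 2 for g, v in groups.items() for y in v)
    df_error = sum(len(v) for v in groups.values()) - len(groups)
    mse = sse / df_error
    estimate = sum(coeffs[g] * means[g] for g in groups)
    se = math.sqrt(mse * sum(coeffs[g] ** 2 / len(groups[g]) for g in groups))
    return estimate, se, estimate / se

# A pairwise comparison: level 1 versus level 2.
pairwise = contrast_estimate(groups, {1: 1, 2: -1, 3: 0})
# Not a pairwise comparison: level 1 versus the average of levels 2 and 3.
one_vs_rest = contrast_estimate(groups, {1: 1, 2: -0.5, 3: -0.5})

print("level 1 vs level 2:", pairwise)
print("level 1 vs mean of levels 2 and 3:", one_vs_rest)
```

The second coefficient set is the kind of comparison a plain pairwise procedure does not give you directly, which is where a contrast is useful.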

tsingkong · January 2, 2008

A common question asked about GLM is the difference between the MEANS and LSMEANS statements. In some cases they are equivalent and at other times LSMEANS are more appropriate. The definition of each is as follows:

MEANS -

These are what is usually meant by mean (average) and are computed by summing all the data points and dividing by the total number of points. They are also referred to as arithmetic means and they are based on the data only.

LSMEANS -

Least Squares Means can be defined as a linear combination (sum) of the estimated effects (means, etc) from a linear model. These means are based on the model used.

When the data contain no missing values (that is, the design is balanced), the MEANS and LSMEANS statements give identical results. When missing values do occur, the two can differ. In that case LSMEANS are preferred because they reflect the model being fit to the data. LSMEANS are also used when one or more covariates appear in the model, as in ANCOVA (see handout #4).
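For the covariate case just mentioned, the following Python sketch shows one way to picture a covariate-adjusted mean in a one-way ANCOVA with a common slope: each group's fitted line is evaluated at the overall covariate mean. The data and the function names (`mean`, `adjusted_means`) are hypothetical, and this mirrors only the arithmetic, not actual SAS output.

```python
# Hypothetical data: (covariate x, response y) pairs for two treatment groups.
data = {
    "A": [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
    "B": [(3.0, 8.0), (4.0, 10.0), (5.0, 12.0)],
}

def mean(values):
    return sum(values) / len(values)

def adjusted_means(data):
    """Covariate-adjusted group means for a one-way ANCOVA with a common slope."""
    x_bar = {g: mean([x for x, _ in obs]) for g, obs in data.items()}
    y_bar = {g: mean([y for _, y in obs]) for g, obs in data.items()}
    # Pooled within-group slope of y on x.
    sxy = sum((x - x_bar[g]) * (y - y_bar[g])
              for g, obs in data.items() for x, y in obs)
    sxx = sum((x - x_bar[g]) ** 2 for g, obs in data.items() for x, _ in obs)
    slope = sxy / sxx
    # Overall covariate mean across all observations.
    x_all = mean([x for obs in data.values() for x, _ in obs])
    # Evaluate each group's fitted line at the overall covariate mean.
    return {g: y_bar[g] - slope * (x_bar[g] - x_all) for g in data}

raw = {g: mean([y for _, y in obs]) for g, obs in data.items()}
adjusted = adjusted_means(data)

print("raw group means:     ", raw)
print("adjusted group means:", adjusted)
```

In this made-up data the raw means differ by 5, but most of that gap comes from group B having larger covariate values. After adjustment the gap shrinks to 1.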

The following example illustrates the similarity and difference between these two methods in balanced and unbalanced data.

EXAMPLE:

This data set has a factor A with 3 levels (1, 2, & 3) with 3 reps of each.

Means for the 3 levels of factor A (written a1, a2, a3 here) are given below each respective column.

A MEANS statement would calculate the overall mean of factor A by summing all 9 data points and dividing by 9: (sum of all 9 observations) / 9.

The LSMEANS statement would use a linear combination of the estimated factor A effects, which in this case are the factor A means: (a1 + a2 + a3) / 3.

Since the data were balanced the two methods produced the same result. If we delete a data point however, the results will change. Suppose the data were revised as follows:

Note that the second level of factor A at rep 1 is now deleted and hence the sums and means are updated to reflect this change.

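The MEANS statement now produces the sum of the 8 remaining data points divided by 8,

whereas LSMEANS still gives the unweighted average of the three factor A level means, with the second level's mean now based on only 2 observations.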

Thus, when the data include missing values, the average of all the data will no longer equal the average of the averages. LSMEANS is the proper choice here because it imposes the treatment structure of factor A on the calculated mean. There is no inherent structure implied by the MEANS statement. The exact difference between MEANS and LSMEANS becomes more obscure with increasingly complex treatment arrangements and experimental designs. When covariates are present in the model, the LSMEANS statement produces means that are adjusted to the average value of the specified covariate(s).
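The "average of all the data" versus "average of the averages" point can be checked with a few lines of Python. The numbers below are hypothetical stand-ins, not the values from the quoted example, and the two helper functions only mimic the arithmetic described above rather than reproducing SAS output.

```python
# Hypothetical data: factor A has 3 levels with 3 reps each.
balanced = {1: [4.0, 5.0, 6.0], 2: [7.0, 8.0, 9.0], 3: [1.0, 2.0, 3.0]}
# The same data with rep 1 of level 2 deleted.
unbalanced = {1: [4.0, 5.0, 6.0], 2: [8.0, 9.0], 3: [1.0, 2.0, 3.0]}

def overall_mean(groups):
    """Arithmetic mean of every observation, ignoring the level structure."""
    values = [y for v in groups.values() for y in v]
    return sum(values) / len(values)

def mean_of_level_means(groups):
    """Unweighted average of the level means (the least-squares-style mean)."""
    level_means = [sum(v) / len(v) for v in groups.values()]
    return sum(level_means) / len(level_means)

for name, groups in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, overall_mean(groups), mean_of_level_means(groups))
```

With the balanced data both functions return 5.0. After the deletion the overall mean is 38 / 8 = 4.75, while the average of the level means is (5 + 8.5 + 2) / 3, about 5.17.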

(The above is quoted from an online source.)

51RCode · November 5, 2013

Learned a lot. Thanks, experts!