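I want to fit a logistic regression of whether a person joins an insurance plan on their health status. Health status is split into 5 categories coded 1 to 5. How should I write the code?

Is it glm(join ~ health, family = binomial)

or glm(join ~ factor(health), family = binomial)?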

Reply to skyindeer (post #1): Since health is an ordered (ordinal) variable, I think the former is the better choice.

Reply to skyindeer (post #1): Your health variable is not continuous, so of course it should be a factor.

Reply to skyindeer (post #1): A correction: it should be as.factor(health).

I'm confused; an ordinary glm probably can't handle this. Take a look at the ordglm function in the gnlm package:

ordglm {gnlm}:

fits linear regression functions with logistic or probit link to ordinal response data by proportional odds.

7 days later

Reply to ming_uld (post #5): I guess ordglm is for an ordinal response, not for ordinal predictors.

Reply to skyindeer (post #1): Actually, both ways are possible. The second is more general (i.e., it covers more possibilities), but it also costs more parameters. If you have some prior belief about how health status affects the probability of joining an insurance plan, something like the first might be more powerful. This is similar to the choice between the trend test and the chi-square test.
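To make the trade-off concrete, here is a minimal sketch on simulated data. The data-generating values are assumptions chosen only for illustration; `join` and `health` stand in for the original poster's variables. Because the linear-trend model is nested inside the factor model, a likelihood-ratio test can check whether the extra parameters are worth it.

```r
# Simulated data (hypothetical): 500 people, health coded 1 to 5
set.seed(1)
n <- 500
health <- sample(1:5, n, replace = TRUE)
join <- rbinom(n, 1, plogis(-1 + 0.4 * health))

# Numeric coding: one slope, i.e. a linear trend on the logit scale
fit_trend <- glm(join ~ health, family = binomial)

# Factor coding: one coefficient per non-reference level (4 here)
fit_factor <- glm(join ~ factor(health), family = binomial)

# Likelihood-ratio test of the trend model against the factor model
lrt <- anova(fit_trend, fit_factor, test = "Chisq")
print(lrt)
```

A small p-value in this table would suggest the linear trend is too restrictive; otherwise the one-parameter model is usually the more powerful choice.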

1 year later

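Question:

I am fitting a logistic regression with glm, and one of the predictors is a factor rather than a continuous variable. The results differ depending on whether I code it with letters or with numbers. Coded with numbers, the output gives a single significance value for the predictor. Coded with letters, the output gives a significance value for every level of the predictor, but none for the predictor as a whole. Why is that?

If I want a significance value for the predictor itself, do I have to convert the factor into a numeric, continuous variable?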

> options(contrasts = c("contr.treatment", "contr.poly"))
> ldose <- rep(100:105, 2)
> numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
> sex <- factor(rep(c("M", "F"), c(6, 6)))
> SF <- cbind(numdead, numalive = 20 - numdead)
> budworm.lg <- glm(SF ~ sex*ldose, family = binomial)
> summary(budworm.lg, cor = F)

Call:
glm(formula = SF ~ sex * ldose, family = binomial)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.39849  -0.32094  -0.07592   0.38220   1.10375

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -93.5972    17.2140  -5.437 5.41e-08 ***
sexM        -35.1163    27.6925  -1.268    0.205
ldose         0.9060     0.1671   5.422 5.89e-08 ***
sexM:ldose    0.3529     0.2700   1.307    0.191
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 124.8756  on 11  degrees of freedom
Residual deviance:   4.9937  on  8  degrees of freedom
AIC: 43.104

Number of Fisher Scoring iterations: 4
> ldose <- rep(letters[1:6],2)
> budworm.lg <- glm(SF ~ sex*ldose, family = binomial)
Warning message:
In model.matrix.default(mt, mf, contrasts) :
  variable 'ldose' converted to a factor
> summary(budworm.lg, cor = F)

Call:
glm(formula = SF ~ sex * ldose, family = binomial)

Deviance Residuals:
 [1]  0  0  0  0  0  0  0  0  0  0  0  0

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   -25.752  52998.328   0.000        1
sexM           22.807  52998.328   0.000        1
ldoseb         23.555  52998.328   0.000        1
ldosec         24.904  52998.328   0.000        1
ldosed         25.752  52998.328   0.000        1
ldosee         26.157  52998.328   0.000        1
ldosef         27.138  52998.328   0.001        1
sexM:ldoseb   -21.996  52998.328   0.000        1
sexM:ldosec   -22.161  52998.328   0.000        1
sexM:ldosed   -22.188  52998.328   0.000        1
sexM:ldosee   -21.016  52998.328   0.000        1
sexM:ldosef     1.558  74950.923   0.000        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.2488e+02  on 11  degrees of freedom
Residual deviance: 5.2389e-10  on  0  degrees of freedom
AIC: 54.11

Number of Fisher Scoring iterations: 22
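A note on the two outputs above. With letters, R converts ldose to a factor with six levels, so summary() reports one Wald test per non-reference level, each comparing that level with the baseline level "a". In addition, sex * ldose with a six-level factor uses 12 parameters for 12 observations, so the model is saturated (0 residual degrees of freedom), which is likely why every standard error is enormous. One common way to get a single test for the factor as a whole is a likelihood-ratio test on the term. The sketch below assumes an additive model (no interaction) purely so that some residual degrees of freedom remain; `ldose_f`, `fit`, and `term_tests` are names introduced here for illustration.

```r
# Same data as in the thread, with dose coded as a factor
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex <- factor(rep(c("M", "F"), c(6, 6)))
ldose_f <- factor(rep(letters[1:6], 2))
SF <- cbind(numdead, numalive = 20 - numdead)

# Additive model (an assumption made for this sketch, not the poster's model)
fit <- glm(SF ~ sex + ldose_f, family = binomial)

# drop1() removes each term in turn and reports one
# likelihood-ratio test per term, not one per factor level
term_tests <- drop1(fit, test = "Chisq")
print(term_tests)
```

The ldose_f row carries 5 degrees of freedom and gives the significance of the predictor as a whole, so the factor does not need to be converted to a number just to obtain an overall p-value.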

How can I extract the Pr(>|z|) column from the Coefficients table of a glm result?

budworm.lg$coefficient does not contain this information.

> summary(budworm.lg, cor = F)

Call:
glm(formula = SF ~ sex * ldose, family = binomial)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.39849  -0.32094  -0.07592   0.38220   1.10375

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -93.5972    17.2140  -5.437 5.41e-08 ***
sexM        -35.1163    27.6925  -1.268    0.205
ldose         0.9060     0.1671   5.422 5.89e-08 ***
sexM:ldose    0.3529     0.2700   1.307    0.191

Reply to pengchy (post #9):

Found a way:

> bud.sum <- summary(budworm.lg, cor = F)
> bud.sum$coeff
              Estimate Std. Error    z value     Pr(>|z|)
(Intercept) -2.9935418  0.5526997 -5.4162175 6.087304e-08
sexM         0.1749868  0.7783100  0.2248292 8.221122e-01
ldose        0.9060364  0.1671016  5.4220678 5.891353e-08
sexM:ldose   0.3529130  0.2699902  1.3071324 1.911678e-01
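The same idea can be written without relying on partial matching of `$coeff`: `coef()` applied to the summary object returns the coefficient matrix, and the p-values are the column named "Pr(>|z|)". This is a small sketch that refits the numeric-dose model from earlier in the thread; `coefs` and `p_values` are names introduced here.

```r
# Refit the numeric-dose model from the thread
ldose <- rep(100:105, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex <- factor(rep(c("M", "F"), c(6, 6)))
SF <- cbind(numdead, numalive = 20 - numdead)
budworm.lg <- glm(SF ~ sex * ldose, family = binomial)

# Coefficient table as a numeric matrix
coefs <- coef(summary(budworm.lg))

# Named vector of p-values, one per coefficient
p_values <- coefs[, "Pr(>|z|)"]
print(p_values)
```

Indexing by column name keeps the code readable; for a family with an estimated dispersion (e.g. gaussian) the column is named "Pr(>|t|)" instead.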