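According to the help page for glm, the two ways of specifying a binomial response used below should be equivalent.
In practice, however, the fitted results differ. This suggests that the response variable means something different in the two formats, so where exactly does the difference lie?
The code is as follows: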
df1
    race dis sucess fail
1  58.40  17     61   49
2  92.40   7      0   15
3  18.28  15      4   38
4  59.38   0   1790   14
5  55.51   7      0   15
6  58.40  12     16   52
7  83.39   2     31   11
8  94.54   9     87   67
9  88.54   0    107   11
10 77.19   4     90   12
11 38.75   7      6   17
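library(dplyr)
# y = 1 when a row has more successes than failures, otherwise 0
df2 <- df1 %>%
  mutate(y = if_else(sucess > fail, 1, 0))
df2$y <- as.factor(df2$y)
# mod1: grouped response (successes, failures); mod2: a single 0/1 outcome per row
mod1 <- glm(cbind(sucess, fail) ~ dis + race, data = df2, family = 'binomial')
mod2 <- glm(y ~ dis + race, data = df2, family = 'binomial')
# Odds ratios with 95% profile-likelihood confidence intervals
res1 <- exp(cbind(OR = coef(mod1), confint(mod1)))
res2 <- exp(cbind(OR = coef(mod2), confint(mod2)))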
res1
                    OR      2.5 %      97.5 %
(Intercept) 68.1711286 38.2284603 122.6115988
dis          0.7469668  0.7276167   0.7659900
race         0.9868600  0.9793819   0.9944779
res2
                    OR        2.5 %    97.5 %
(Intercept) 0.06943074 2.541349e-05 31.938780
dis         0.92467344 6.551580e-01  1.226807
race        1.05313081 9.817307e-01  1.170621
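
One way to probe the difference is sketched below. It assumes the 11 rows printed above are the whole data set. The names `mod_grouped`, `long` and `mod_long` are illustrative and are not part of the original code. With `cbind(sucess, fail)`, each row contributes `sucess + fail` Bernoulli trials. The 0/1 response that matches this has one row per trial, whereas `y = if_else(sucess > fail, 1, 0)` reduces each row to a single trial. Expanding the counts into per-trial rows should therefore reproduce the grouped coefficients, up to numerical tolerance.

```r
# Rebuild the data frame printed in the question (column names kept as posted).
df1 <- data.frame(
  race   = c(58.40, 92.40, 18.28, 59.38, 55.51, 58.40, 83.39, 94.54, 88.54, 77.19, 38.75),
  dis    = c(17, 7, 15, 0, 7, 12, 2, 9, 0, 4, 7),
  sucess = c(61, 0, 4, 1790, 0, 16, 31, 87, 107, 90, 6),
  fail   = c(49, 15, 38, 14, 15, 52, 11, 67, 11, 12, 17)
)

# Grouped form: every row stands for sucess + fail Bernoulli trials.
mod_grouped <- glm(cbind(sucess, fail) ~ dis + race, data = df1, family = binomial)

# Per-trial form: repeat each row's covariates once per success (y = 1)
# and once per failure (y = 0). Rows with a zero count are repeated zero times.
long <- data.frame(
  race = c(rep(df1$race, times = df1$sucess), rep(df1$race, times = df1$fail)),
  dis  = c(rep(df1$dis,  times = df1$sucess), rep(df1$dis,  times = df1$fail)),
  y    = c(rep(1, sum(df1$sucess)), rep(0, sum(df1$fail)))
)
mod_long <- glm(y ~ dis + race, data = long, family = binomial)

# The per-trial data has one row per trial, far more than the 11 grouped rows.
nrow(long) == sum(df1$sucess + df1$fail)

# Compare the coefficients of the two fits (loose tolerance, since both are iterative).
all.equal(coef(mod_grouped), coef(mod_long), tolerance = 1e-6)
```

If the comparison holds, the gap between res1 and res2 would come from how `y` was built, with 11 majority-vote outcomes replacing the individual trials, and not from `glm` treating the two response formats differently.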