I have recently been working with a 1215 × 10 data set (1215 observations, 10 variables), using a GLM for binary classification:

trndata$y <- as.factor(trndata$y)
model.full <- glm(formula = y ~ ., family = binomial(), data = trndata, na.action = na.exclude,
                  control = list(epsilon = 0.0001, maxit = 500, trace = TRUE))

R had been running this since last night. This morning the output looked like this:
Deviance = 204.6490 Iterations - 1
Deviance = 70.89586 Iterations - 2
Deviance = 25.55842 Iterations - 3
Deviance = 9.33467 Iterations - 4
Deviance = 3.425006 Iterations - 5
Deviance = 1.258774 Iterations - 6
Deviance = 0.4629131 Iterations - 7
Deviance = 0.1702740 Iterations - 8
Deviance = 0.06263731 Iterations - 9
Deviance = 0.02304257 Iterations - 10
Deviance = 0.008476834 Iterations - 11
Deviance = 0.003118446 Iterations - 12
Deviance = 0.001147211 Iterations - 13
Deviance = 0.0004220352 Iterations - 14
Deviance = 0.0001552581 Iterations - 15
Deviance = 5.711625e-05 Iterations - 16
Deviance = 2.101189e-05 Iterations - 17
Deviance = 7.729843e-06 Iterations - 18
Deviance = 2.843651e-06 Iterations - 19
Error: cannot allocate vector of size 45.0 Mb
In addition: Warning messages:
1: Reached total allocation of 511Mb: see help(memory.size)
2: Reached total allocation of 511Mb: see help(memory.size)
3: Reached total allocation of 511Mb: see help(memory.size)
4: Reached total allocation of 511Mb: see help(memory.size)
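Is this problem caused by the sample size being too large, by the way the model is specified, or by multicollinearity in the data? (I don't think multicollinearity is the cause, since the same code runs in S-Plus.)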
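
For reference, a residual deviance that keeps shrinking toward zero, as in the trace above, is the pattern a logistic regression shows when the predictors separate the two classes perfectly. Below is a minimal sketch of how one might check for that, assuming a made-up data set rather than the real `trndata`. The names `toy`, `x`, `y`, `fit` and `p` are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data (NOT the trndata above): x separates the classes perfectly,
# because every negative x has y = 0 and every positive x has y = 1.
x <- c(seq(-5, -1, length.out = 50), seq(1, 5, length.out = 50))
y <- factor(as.integer(x > 0))
toy <- data.frame(x = x, y = y)

# Same control settings as in the post; trace = TRUE prints the deviance per iteration.
# R is expected to warn here that fitted probabilities are numerically 0 or 1.
fit <- glm(y ~ x, family = binomial(), data = toy,
           control = glm.control(epsilon = 1e-4, maxit = 500, trace = TRUE))

# Fitted probabilities pushed to (almost exactly) 0 or 1 indicate separation.
p <- fitted(fit)
print(range(p))

# The residual deviance ends up close to zero, as in the trace in the post.
print(deviance(fit))

# Very large coefficients and standard errors are another symptom.
print(summary(fit)$coefficients)
```

Running the same three checks (`range(fitted(...))`, `deviance(...)` and the coefficient table) on a model fitted to the real data would show whether separation is present there. This sketch does not reproduce or explain the memory allocation error itself.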