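I'm a complete beginner in bioinformatics. I've recently been trying to run the analysis described in the title on a demo dataset. The main goal is to see whether the selected genes (8 columns in total) can separate the two groups well, and how good the classification is. The data frame df looks like this: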
ID group gene1 gene2 gene3 gene4 gene5 gene6 gene7 gene8
1 hE 0 0 0 0.0000838 0.000353 0.0000623 0.0000138 0.002468529
2 hE 0 0 0 0.000106 0.0000269 0.0000582 0.00000405 0.000737
3 hE 0 0 0 0 0 0.000213 0 0.011905089
4 hE 2.83E-06 0 0 9.71E-05 0.00032227 1.51E-05 1.21E-06 0.00285351
5 hE 0 0 0 0.0000927 0.0000106 0.000838 0 0.000968
6 hE 0 0 0 0.000196058 3.63E-05 0 0 0.001628633
7 hE 0 0 0 4.04E-05 0 4.56E-05 0 0.004478178
8 hE 6.53E-06 0 0 0.000125141 2.06E-05 3.46E-05 2.11E-07 3.41E-05
9 hE 0 0 0 0.00010123 3.62E-05 0 0 0.005537776
10 hE 0 0 0 0.000163166 4.51E-05 9.94E-05 8.52E-05 0.000926297
15 PAA 0 0 0 0.000171725 5.29E-05 3.43E-05 8.32E-05 0.000124568
16 PAA 0 0.000658038 0.000810461 0.000152992 0.000667138 5.91E-05 4.55E-06 0.000188255
17 PAA 1.69E-05 0.00042295 0.000346181 0.000355866 0.001689648 0.000170041 0.000344387 0.000430484
18 PAA 7.61E-06 0 0 0.000747356 0.000195583 0.000220804 3.55E-05 0.001692674
19 PAA 7.49E-06 0 0 0.000157245 7.83E-06 2.96E-05 2.28E-05 0.000198428
20 PAA 5.56E-07 0 0 0.000110178 0.001436771 0.000156921 1.50E-05 1.11E-05
21 PAA 1.01E-05 0 0 0.000138107 8.50E-05 6.01E-05 3.36E-07 3.06E-05
22 PAA 0 0 0 0.000342315 2.50E-05 0.000123902 3.09E-05 0.000297557
23 PAA 1.12E-05 0 0 0.000120795 2.28E-05 6.19E-05 2.07E-05 0
24 PAA 5.75E-07 0 0 4.77E-05 0.000243906 0.000932244 1.24E-05 0.000424322
The code is adapted from a post that @Susannalsy shared earlier:
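library(caret)
library(randomForest)
library(pROC)
library(tidyverse)
# Read the whitespace-delimited demo table; the first row holds the column names
df <- read.table('./TRY.txt', header = TRUE, quote = "")
df$group <- factor(df$group)  # caret needs a factor outcome for classification
df$ID <- NULL                 # drop the sample ID so that group ~ . only uses the 8 genes
# Leave-one-out CV: each sample is held out once and predicted by a model fit on the rest
train_control <- trainControl(method = "LOOCV")
# The randomForest argument is spelled importance (the original "important" is not matched)
caret_rf <- train(group ~ ., data = df, method = "rf", trControl = train_control, importance = TRUE, proximity = TRUE)
# Held-out predictions for every sample and every mtry value that was tried
pred <- caret_rf[["pred"]]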
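The code above gives a table with the columns pred, obs, rowIndex and mtry. Is this already the final prediction result, or do I still need a for loop? And how should I go on to plot the ROC curve and compute the AUC?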
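One possible way forward, as a minimal sketch and not a definitive answer: as far as I understand, caret's LOOCV already refits the model once per held-out sample, so pred should hold one out-of-sample prediction per sample for each mtry value, and no extra for loop should be needed. A ROC curve needs class probabilities, not hard labels, so the sketch adds classProbs = TRUE, which was not in the original call. The names ctrl, fit, best and roc_obj, the seed value, and the choice of PAA as the "positive" class are illustrative assumptions. The table from the post is pasted inline only so that the snippet runs on its own; reading ./TRY.txt as before works the same way.

```r
library(caret)
library(randomForest)
library(pROC)

# Demo data copied from the post (inline so the sketch is self-contained)
df <- read.table(header = TRUE, quote = "", text = "
ID group gene1 gene2 gene3 gene4 gene5 gene6 gene7 gene8
1 hE 0 0 0 0.0000838 0.000353 0.0000623 0.0000138 0.002468529
2 hE 0 0 0 0.000106 0.0000269 0.0000582 0.00000405 0.000737
3 hE 0 0 0 0 0 0.000213 0 0.011905089
4 hE 2.83E-06 0 0 9.71E-05 0.00032227 1.51E-05 1.21E-06 0.00285351
5 hE 0 0 0 0.0000927 0.0000106 0.000838 0 0.000968
6 hE 0 0 0 0.000196058 3.63E-05 0 0 0.001628633
7 hE 0 0 0 4.04E-05 0 4.56E-05 0 0.004478178
8 hE 6.53E-06 0 0 0.000125141 2.06E-05 3.46E-05 2.11E-07 3.41E-05
9 hE 0 0 0 0.00010123 3.62E-05 0 0 0.005537776
10 hE 0 0 0 0.000163166 4.51E-05 9.94E-05 8.52E-05 0.000926297
15 PAA 0 0 0 0.000171725 5.29E-05 3.43E-05 8.32E-05 0.000124568
16 PAA 0 0.000658038 0.000810461 0.000152992 0.000667138 5.91E-05 4.55E-06 0.000188255
17 PAA 1.69E-05 0.00042295 0.000346181 0.000355866 0.001689648 0.000170041 0.000344387 0.000430484
18 PAA 7.61E-06 0 0 0.000747356 0.000195583 0.000220804 3.55E-05 0.001692674
19 PAA 7.49E-06 0 0 0.000157245 7.83E-06 2.96E-05 2.28E-05 0.000198428
20 PAA 5.56E-07 0 0 0.000110178 0.001436771 0.000156921 1.50E-05 1.11E-05
21 PAA 1.01E-05 0 0 0.000138107 8.50E-05 6.01E-05 3.36E-07 3.06E-05
22 PAA 0 0 0 0.000342315 2.50E-05 0.000123902 3.09E-05 0.000297557
23 PAA 1.12E-05 0 0 0.000120795 2.28E-05 6.19E-05 2.07E-05 0
24 PAA 5.75E-07 0 0 4.77E-05 0.000243906 0.000932244 1.24E-05 0.000424322
")

df$group <- factor(df$group, levels = c("hE", "PAA"))  # hE = reference, PAA = "positive" (assumption)
df$ID <- NULL                                          # the sample ID is not a feature

set.seed(123)  # arbitrary seed, only for reproducibility
# classProbs = TRUE asks caret to store per-class probabilities for each held-out sample
ctrl <- trainControl(method = "LOOCV", classProbs = TRUE)
fit <- train(group ~ ., data = df, method = "rf", trControl = ctrl, importance = TRUE)

# fit$pred has rows for every mtry tried; keep only the selected mtry
best <- fit$pred[fit$pred$mtry == fit$bestTune$mtry, ]

# ROC/AUC from the held-out probability of PAA versus the true labels
roc_obj <- roc(response = best$obs, predictor = best$PAA,
               levels = c("hE", "PAA"), direction = "<")
print(auc(roc_obj))
plot(roc_obj, print.auc = TRUE, legacy.axes = TRUE)

# Confusion matrix of the held-out hard predictions
print(confusionMatrix(best$pred, best$obs, positive = "PAA"))
```

With only 20 samples the AUC estimate will be noisy, so it may be worth looking at ci.auc(roc_obj) before reading much into the number.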