> wordfreq2[1:10,]
wordlist Freq
4 Abbreviated Injury Scale 2
12 Acceleration 3
17 Acceptance 2
20 Accident 37
21 Accident Analysis 12
23 Accident Causation 5
34 Accident Investigation 5
39 Accident Modification Function 3
41 Accident Pattern 2
45 Accident Prevention 8
> keymatrix[1:2,]
col_names
row_names keyword 1 keyword 2 keyword 3 keyword 4
1 "Hierarchical Bayes" "Spatial Random Effect" "Uncorrelated Random Effect" "Negative Binomial Model"
2 "Real Time Crash Prediction Model" "Random Multinomial Logistic" "Bayes Belief Net" "Basic Freeway Segments"
col_names
row_names keyword 5 keyword 6 keyword 7 keyword 8 keyword 9 keyword 10 keyword 11 keyword 12
1 "Conditional Autoregressive Distribution" "Mixed Effect" NA NA NA NA NA NA
2 NA NA NA NA NA NA NA NA
col_names
row_names keyword 13
1 NA
2 NA
> nkeyword2
[1] 920
> length(keymatrix[,1])
[1] 1526
> length(wordfreq2[,1])
[1] 920
> y5
[1] 422741
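The output above shows part of the input data. The code below takes far too long to run: it does not finish even after a full day, so I would like to optimize it. I know that apply can do something similar, but I cannot work out how to write the key statement. Any advice would be appreciated.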
> time=numeric(y5)
> weigh=numeric(y5)
> y5=1
> for(i in 1:(nkeyword2-1)){
+ for(j in (i+1):nkeyword2){
+ for(k in 1:nkey){
+ x1 =as.character(wordfreq2[i,1])==na.omit(keymatrix[k,])
+ x2 =as.character(wordfreq2[j,1])==na.omit(keymatrix[k,])
+ li =length(grep(TRUE,x1,fixed=TRUE))
+ lj =length(grep(TRUE,x2,fixed=TRUE))
+ if (li>0 && lj>0){
+ time[y5]= time[y5]+1 # count co-occurrences of the keyword pair
+ weigh[y5]= weigh[y5]+(1/choose(length(na.omit(keymatrix[k,])),2)) # accumulate the weight
+ }
+ }
+ y5=y5+1
+ }
+ }
Any guidance would be much appreciated. Thank you.
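
One possible way to avoid the triple loop, sketched here under some assumptions rather than offered as a definitive answer: build a 0/1 incidence matrix (rows of keymatrix by entries of wordlist) and get every pairwise count with a single matrix product. The sketch assumes that nkey is the number of rows of keymatrix, that the entries of wordlist are unique, and that keymatrix is a character matrix. The toy data and the names words, M, idx, keep, n_k, w, co_count and co_weight are made up for illustration and do not appear in the original code.

```r
# Toy stand-ins for the real inputs (hypothetical data, same layout as in the post).
wordfreq2 <- data.frame(wordlist = c("A", "B", "C"), Freq = c(3, 2, 2),
                        stringsAsFactors = FALSE)
keymatrix <- rbind(c("A", "B", "C"),
                   c("A", "B", NA),
                   c("C", NA,  NA))

words <- as.character(wordfreq2[, 1])

# Incidence matrix: M[k, i] is 1 when keyword i appears in row k of keymatrix.
M <- matrix(0, nrow = nrow(keymatrix), ncol = length(words))
idx <- match(keymatrix, words)      # NA for empty cells and for keywords not in wordlist
keep <- !is.na(idx)
M[cbind(row(keymatrix)[keep], idx[keep])] <- 1

# Per-row weight 1 / choose(n_k, 2), where n_k counts the non-NA keywords in row k.
# Rows with fewer than two keywords cannot contain a pair, so their weight is set to 0.
n_k <- rowSums(!is.na(keymatrix))
w <- ifelse(n_k >= 2, 1 / choose(n_k, 2), 0)

co_count  <- crossprod(M)           # t(M) %*% M: co-occurrence count for every pair
co_weight <- crossprod(M, M * w)    # same product with each row scaled by its weight

# Flatten the pairs i < j in the same order as the nested loops (i outer, j inner).
time  <- t(co_count)[lower.tri(co_count)]
weigh <- t(co_weight)[lower.tri(co_weight)]
```

With the real data, M would be a 1526 x 920 matrix, so the two products are small, and time and weigh should come out with one entry per pair i < j, in the same order that y5 walks through them in the loop.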