The data is in the attachment.

      name  Num  control  power  wisdom  political  charm  loyalty  coutry  origin  status
1   夏侯惇    1       94     96      62         56     78       99       1       1       1
2     许褚    2       83     97      26         16     68       89       1       1       1
3     荀攸    3       60     38      94         91     80       86       1       1       4
4     荀彧    4       64     35      97         90     84       80       1       1       4
5     张合    5       88     93      61         54     62       85       1       1       1
6     程昱    6       82     25      91         80     74       89       1       1       4
7     张辽    7       91     90      82         69     85       88       1       1       1
8     于禁    8       77     74      51         48     60       85       1       1       1
9     曹仁    9       79     83      61         58     68       95       1       1       1
10    曹洪   10       76     75      45         42     70       92       1       1       1



Data after conversion: kingdom3

    items                transactionID
1   {name=夏侯惇,
     control=统帅,
     power=超级,
     wisdom=普通,
     political=普通,
     charm=喜欢,
     loyalty=死士}        1
2   {name=许褚,
     control=将军,
     power=超级,
     wisdom=弱智,
     political=普通,
     charm=普通,
     loyalty=死士}        2
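
To see how a raw score ends up as one of these labels, here is a minimal sketch in base R. It uses the `power` break points and labels from the script in this post and three scores from the table (夏侯惇, 荀攸, 于禁); the variable names `power` and `power_level` are just for illustration.

```r
# Power scores for 夏侯惇, 荀攸 and 于禁, copied from the table in this post
power <- c(96, 38, 74)

# Same break points and labels as the script. cut() intervals are right-closed,
# so a score of exactly 60 would still be "普通" and 61 would be "强悍".
power_level <- cut(power,
                   breaks = c(0, 60, 80, 95, 110),
                   labels = c("普通", "强悍", "高手", "超级"),
                   ordered_result = TRUE)

print(power_level)
```

The first value lands in (95,110] and so matches the `power=超级` shown for 夏侯惇 in the converted data.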



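See the documentation for details on how each function is used.

library(arules)

# `king` is the attached data, assumed to be already read into a data frame.
# Discretize the numeric attributes into ordered levels (cut() intervals are right-closed)
king[["power"]]     <- ordered(cut(king[["power"]],     c(0, 60, 80, 95, 110)),  labels = c("普通", "强悍", "高手", "超级"))      # ordinary < strong < expert < super
king[["wisdom"]]    <- ordered(cut(king[["wisdom"]],    c(0, 50, 75, 90, 100)),  labels = c("弱智", "普通", "聪明", "智慧"))      # dim < ordinary < clever < wise
king[["control"]]   <- ordered(cut(king[["control"]],   c(0, 50, 75, 90, 100)),  labels = c("小兵", "校尉", "将军", "统帅"))      # soldier < captain < general < commander-in-chief
king[["political"]] <- ordered(cut(king[["political"]], c(0, 60, 78, 90, 105)),  labels = c("普通", "战术家", "政治家", "战略家"))  # ordinary < tactician < statesman < strategist
king[["charm"]]     <- ordered(cut(king[["charm"]],     c(0, 55, 75, 90, 100)),  labels = c("讨厌", "普通", "喜欢", "吸引"))      # disliked < ordinary < liked < charismatic
king[["loyalty"]]   <- ordered(cut(king[["loyalty"]],   c(-1, 55, 71, 85, 101)), labels = c("奸贼", "普通", "忠臣", "死士"))      # traitor < ordinary < loyal < diehard

# Keep the name plus the six discretized attributes (columns 3 to 8);
# factor() makes sure the name column can be coerced to items
kingdom <- data.frame(name = factor(king[, 1]), king[, 3:8])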

# Convert to the transactions structure
kingdom3 <- as(kingdom, "transactions")

# Plot the items whose relative support is at least 0.5
itemFrequencyPlot(kingdom3, support = 0.5, cex.names = 0.8)

image(kingdom3)                              # incidence matrix, transactions by items
itemFrequency(kingdom3, type = "relative")   # support of each item
itemFrequency(kingdom3, type = "absolute")   # raw counts of each item

# Mine with apriori() and with eclat()
rules <- apriori(kingdom3, parameter = list(support = 0.01, confidence = 0.6))
fsets <- eclat(kingdom3, parameter = list(support = 0.05), control = list(verbose = FALSE))

summary(rules)
summary(fsets)

# Rules whose right-hand side is control=统帅 or wisdom=智慧, filtered by lift
rulesControl <- subset(rules, subset = rhs %in% "control=统帅" & lift > 1.2)
rulesWisdom  <- subset(rules, subset = rhs %in% "wisdom=智慧" & lift > 2.2)
inspect(rulesControl)
inspect(rulesWisdom)
# Top three control rules by confidence (older arules releases spelled this SORT())
inspect(head(sort(rulesControl, by = "confidence"), n = 3))

# Frequent itemsets that contain exactly four items
fourItemSets <- fsets[size(items(fsets)) == 4]
inspect(fourItemSets)

# Export the rules as CSV (older arules releases spelled this WRITE())
write(rulesControl, file = "data.csv", sep = ",", col.names = NA)

> inspect(rulesControl)
   lhs                                     rhs               support  confidence       lift
1  {political=战略家, loyalty=普通}     => {control=统帅}  0.01333333   0.6666667   7.142857
2  {power=超级, charm=吸引}             => {control=统帅}  0.02000000   1.0000000  10.714286
3  {wisdom=智慧, charm=吸引}            => {control=统帅}  0.02000000   1.0000000  10.714286
4  {political=政治家, charm=吸引}       => {control=统帅}  0.02000000   0.6000000   6.428571
5  {charm=吸引, loyalty=死士}           => {control=统帅}  0.04000000   0.6666667   7.142857
6  {power=超级, wisdom=聪明}            => {control=统帅}  0.02000000   1.0000000  10.714286
7  {power=超级, charm=喜欢}             => {control=统帅}  0.02000000   1.0000000  10.714286
8  {power=超级, political=战术家}       => {control=统帅}  0.02000000   1.0000000  10.714286
9  {power=超级, loyalty=忠臣}           => {control=统帅}  0.01333333   1.0000000  10.714286
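It would be even better with some explanatory analysis added. Otherwise it is easy for readers to feel lost in a fog.
Run it yourself and you will pretty much see what it means~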



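I still have two questions:

1.  The output of inspect(rulesControl) should have one more column at the end, itemset. What does it mean?

2.  The write command can indeed write rulesControl out in data frame format, but I want to assign the result to a variable. Is there a command that turns it into a data frame?

     For example, with a linear regression:

         df.junk <- data.frame(aa = 1:10, bb = 11:20)
         lm.out  <- lm(bb ~ aa, data = df.junk)
         anova(lm.out)[1, 5]   # extracts the p-value of the linear term

     But how do I extract the first lhs from inspect(rulesControl)? as(items(rules), "list")[[1]]? That way it is hard to tell lhs and rhs apart.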



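With R 2.9.2 the arules package added a lot of data classes, and I am really not used to them.

Sometimes I do not know how to look up what an object contains.

For example, to view the contents of kingdom3 above you need LIST(kingdom3, decode = TRUE). Typing kingdom3 or kingdom3[1:i] on its own only prints a one-line summary, which is nowhere near as convenient as looking at a data frame, matrix, or vector!

Or is my level just not high enough to know the shortcut?

Experts, if you know, please speak up.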
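
For looking inside a transactions object, a minimal sketch follows. It assumes a reasonably recent arules release, and the `toy` data frame with its English labels is made up for illustration; it is not the attachment data.

```r
library(arules)

# Hypothetical stand-in for the discretized data (not the real attachment)
toy <- data.frame(
  power   = factor(c("super", "super", "ordinary", "super", "ordinary", "ordinary")),
  control = factor(c("chief", "chief", "soldier", "chief", "soldier", "general"))
)
trans <- as(toy, "transactions")

inspect(trans[1:2])                   # one entry per transaction, listing its items
first_two <- as(trans, "list")[1:2]   # plain R list of item labels
incidence <- as(trans, "matrix")      # logical matrix, transactions by items
dim(incidence)
```

The inspect() call prints in the same shape as the kingdom3 listing near the top of this thread, so `inspect(kingdom3[1:2])` is probably the closest thing to printing the first rows of a data frame.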



  
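itemsets means sets of items. Once you understand the apriori algorithm you will know what they are.

lhs and rhs should be data of type items or itemMatrix.

I am a beginner too and there is a lot I do not understand, so I cannot answer your questions.
I know it means itemset, but I do not know how to read the numbers under it. I tried several guesses and none of them was right.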
Sharing what I found~

LIST(lhs(rules), decode = TRUE)

LIST(rhs(rules), decode = TRUE)
Sorry, I forgot one more: labels(rules)

What it returns are association rules like {political=战略家, loyalty=普通} => {control=统帅}.
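
Building on the calls above, here is a rough sketch of one way to get the rules into an ordinary data frame with lhs and rhs kept apart. It assumes a reasonably recent arules release; the `toy` data, its English labels, and the thresholds are hypothetical and only exist so there are some rules to work with.

```r
library(arules)

# Hypothetical stand-in data (not the real attachment)
toy <- data.frame(
  power   = factor(c("super", "super", "ordinary", "super", "ordinary", "ordinary")),
  control = factor(c("chief", "chief", "soldier", "chief", "soldier", "general"))
)
trans <- as(toy, "transactions")
rules <- apriori(trans,
                 parameter = list(support = 0.3, confidence = 0.8, minlen = 2),
                 control = list(verbose = FALSE))

# labels() on each side gives one string per rule, so lhs and rhs stay separate;
# quality() returns the measures (support, confidence, lift, ...) as a data frame
rules_df <- data.frame(lhs = labels(lhs(rules)),
                       rhs = labels(rhs(rules)),
                       quality(rules))

rules_df[1, "lhs"]                      # the first left-hand side as plain text
combined <- as(rules, "data.frame")     # alternative: whole rule in a single column
```

After this, `rules_df` can be indexed like the `anova(lm.out)[1, 5]` example in the question.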
From the definition of the association rule mining problem we see that transaction databases and sets of associations have in common that they contain sets of items (itemsets) together with additional information. For example, a transaction in the database contains a transaction ID and an itemset. A rule in a set of mined association rules contains two itemsets, one for the LHS and one for the RHS, and additional quality information, e.g., values for various interest measures. Collections of itemsets used for transaction databases and sets of associations can be represented as binary incidence matrices with columns corresponding to the items and rows corresponding to the itemsets.
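Support: measures how important (or how widely applicable) an association rule is, as the fraction of all transactions that contain both X and Y. The larger the value, the more representative the rule.

Confidence: measures how strongly the consequent depends on the antecedent. It is a relative measure of the rule's accuracy: for X => Y, the larger the value, the more likely Y is to appear in a transaction that contains X.

Lift: a value greater than 1 means X and Y appear together more often than they would if they were independent, so the rule X => Y is more concentrated than chance.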
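
The three measures can also be worked out by hand. Below is a small base R sketch; the vectors `x` and `y` are made-up toy data (TRUE means the item is in that transaction), and the last part only re-derives numbers that already appear in the inspect(rulesControl) output earlier in the thread.

```r
# Toy data, hypothetical: one element per transaction
x <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
y <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE)

support_xy <- mean(x & y)            # share of transactions containing both X and Y
confidence <- support_xy / mean(x)   # share of X-transactions that also contain Y
lift       <- confidence / mean(y)   # confidence relative to the baseline share of Y

# Check against the thread's output: lift = confidence / support(rhs).
# Rule 2 has confidence 1 and lift 10.714286, so support(control=统帅) is about:
support_rhs <- 1.0 / 10.714286
# Rule 1 has confidence 0.6666667; its lift should then come out near 7.142857
lift_rule1 <- 0.6666667 / support_rhs
```

In the toy data X and Y co-occur in 3 of 10 transactions and each appears in 4, so the lift is well above 1, the same reading as for the control=统帅 rules.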
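4 months later

I came across this post by chance. I am a Three Kingdoms fan myself and used to play 《三国志》 (Romance of the Three Kingdoms) a lot. I wanted to add the original poster as a friend but could not figure out how.

I too once played the Three Kingdoms games of old.

Not anymore, though...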

qxde01@gmail.com

3 years later