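Hi everyone, sorry to bother you all again. I'm using Win10 + RStudio; R version 3.4.2; Platform: i386-w64-mingw32/i386 (32-bit).
I've recently been following online tutorials to learn text mining and word cloud plotting in R, but every word cloud I produce comes out completely garbled, and I can't work out why. I'd be grateful for any advice on how to fix this. Below are the code, the test text, and the garbled output: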
setwd("C:/Users/2015/Desktop/test")
library(tm)
library(tmcn)
# Build a corpus from every .txt file in the test directory
yuliaoku <- Corpus(DirSource("C:/Users/2015/Desktop/test", pattern = "*.txt"), readerControl = list(language = "UTF-8"))
yuliaoku <- tm_map(yuliaoku, stripWhitespace)
library(rJava)
library(Rwordseg)
# Segment the Chinese text into words
yuliaoku <- tm_map(yuliaoku, content_transformer(segmentCN), returnType = "tm")
# Keep terms of 1 to 5 characters and drop Chinese stopwords
control <- list(wordLengths = c(1, 5), stopwords = stopwordsCN())
mt <- TermDocumentMatrix(yuliaoku, control = control)
vmt <- as.matrix(mt)
# Total frequency of each term, most frequent first
val <- sort(rowSums(vmt), decreasing = TRUE)
df <- data.frame(word = names(val), freq = val)
library(wordcloud)
# Draw the word cloud for terms that occur at least 3 times
wordcloud(df$word, df$freq, min.freq = 3, random.order = FALSE, colors = rainbow(length(row.names(vmt))))
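One thing that may be worth ruling out is whether the .txt files are really UTF-8. In the script above, "UTF-8" is passed as `language` in `readerControl`, which tm treats as a language tag rather than a file encoding; `DirSource()` has a separate `encoding` argument for that. The sketch below is only a diagnostic under that assumption, not a confirmed fix: `is_utf8_file` is a hypothetical helper name, and the two temporary files are made-up stand-ins for the real test text. It uses base R only.

```r
# Hypothetical helper (not part of tm): TRUE if every line of the file is valid UTF-8.
is_utf8_file <- function(path) {
  lines <- readLines(path, warn = FALSE, encoding = "UTF-8")
  all(validUTF8(lines))
}

# A UTF-8 sample file, written byte-for-byte from Unicode escapes.
utf8_path <- tempfile(fileext = ".txt")
writeLines("\u8bcd\u4e91\u6d4b\u8bd5", utf8_path, useBytes = TRUE)

# A file holding bytes that are not valid UTF-8 (a GBK-style double-byte sequence).
other_path <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xb4, 0xca, 0xd4, 0xc6, 0x0a)), other_path)

utf8_ok  <- is_utf8_file(utf8_path)
other_ok <- is_utf8_file(other_path)
print(c(utf8 = utf8_ok, other = other_ok))
```

If the real files fail this check, they may have been saved in the Windows default encoding, and passing the matching `encoding` value to `DirSource()` could be tried. If they pass, the garbling more likely comes from a later step, such as the font used by the plotting device.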
Text link: https://pan.baidu.com/s/1cavXIA
Thanks, everyone.