Assignment III
Tool - R
Rationale - R is a scripting language that can be used quite effectively for word cloud
formation and analysis. We used version 3.2.1.
Procedure
We first installed the packages required to build a word cloud.
The following script was run to do so:
install.packages("ROAuth")
install.packages("bitops")
install.packages("digest")
install.packages("rjson")
install.packages("NLP")
install.packages("twitteR")
install.packages("stringr")
install.packages("ggplot2")
install.packages("tm")
install.packages("RColorBrewer")
install.packages("wordcloud")
install.packages("RCurl")
install.packages("httpuv")
install.packages("plyr")
install.packages("RJSONIO")
install.packages("httr")
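Running the script above reinstalls every package each time. A minimal sketch of one way to install only what is missing is given below; the helper name missing_packages is hypothetical and not part of the original script, and the install call is guarded so that it only runs in an interactive session.

```r
# Packages used in this assignment (same list as in the script above)
pkgs <- c("ROAuth", "bitops", "digest", "rjson", "NLP", "twitteR",
          "stringr", "ggplot2", "tm", "RColorBrewer", "wordcloud",
          "RCurl", "httpuv", "plyr", "RJSONIO", "httr")

# Hypothetical helper: return the names in `wanted` that are not in `installed`
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

to_install <- missing_packages(pkgs)

# Install only the missing packages, and only when run interactively
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```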
The next important step is to set up the API keys from my Twitter developer account. The
following code does just that:
# OAuth endpoint URLs for the Twitter API
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"
# Set the API key and API secret from the Twitter developer site.
# Generate them after creating the app on Twitter and replace with your own values;
# the values below do not work.
apiKey <- "sb0mWFVbEFNtJnBQO0fWRUcV"
apiSecret <- "7XRvv9FrrL77Z2mHcecF9pygon4GjHtRw49J5RQA3jHWBVpY7"
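The script as shown does not include the step that creates the tweets data frame used next. The sketch below shows one way it might be done with the twitteR package; it is an assumption, not the code that was actually run. The names fetch_tweets, accessToken and accessTokenSecret, the query "Android" and the tweet count are all hypothetical, and the call needs valid credentials and network access to return anything.

```r
# Sketch only: assumes the twitteR package is installed and the credentials are valid.
# accessToken and accessTokenSecret are hypothetical names for the two extra
# values issued by the Twitter developer site alongside the key and secret.
fetch_tweets <- function(apiKey, apiSecret, accessToken, accessTokenSecret,
                         query = "Android", n = 1000) {
  # Register the credentials for this R session
  twitteR::setup_twitter_oauth(apiKey, apiSecret, accessToken, accessTokenSecret)
  # Search for English-language tweets matching the query
  found <- twitteR::searchTwitter(query, n = n, lang = "en")
  # Convert the list of status objects to a data frame with a `text` column
  twitteR::twListToDF(found)
}

# Usage (commented out because it requires real credentials):
# tweets <- fetch_tweets(apiKey, apiSecret, accessToken, accessTokenSecret)
```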
dim(tweets)
#Building the corpus
corpus = Corpus(VectorSource(tweets$text))
corpus[[3]]
Before a word cloud can be built in R, the source text has to be prepared.
We prepared it by converting the text to lower case, removing punctuation and
stop words, and finally stemming the corpus.
# Lower Case
corpus = tm_map(corpus, content_transformer(tolower))
corpus[[1]]
#Remove punctuation
corpus = tm_map(corpus, removePunctuation)
corpus[[2]]
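#Remove Stop Words
# Peek at the first few English stop words
stopwords("english")[1:10]
# The corpus is already lower case, so the search term must be lower case too
corpus = tm_map(corpus, removeWords, c("android", stopwords("english")))
corpus[[1]]
#Stemming (stemDocument relies on the SnowballC package)
corpus = tm_map(corpus, stemDocument)
corpus[[1]]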
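To make the effect of the cleaning steps concrete, here is a small base R sketch that applies the same ideas to a single made-up tweet. The tweet text and the short stop word list are hypothetical stand-ins for the real data and for the tm stop word list, and stemming is left out because it needs an extra package.

```r
# Hypothetical tweet standing in for one document of the corpus
tweet <- "Android gaming is AWESOME!!! Check out the new games, it's great."

# Step 1: convert to lower case
lowered <- tolower(tweet)

# Step 2: remove punctuation
no_punct <- gsub("[[:punct:]]", "", lowered)

# Step 3: split into words and drop stop words
# (a tiny hypothetical list; the script above uses the tm list plus the search term)
stop_words <- c("android", "is", "out", "the", "its")
words <- strsplit(no_punct, "\\s+")[[1]]
kept <- words[!words %in% stop_words]

print(kept)
```

Note that the search term only matches the stop word list here because the text was lowercased first, which is why the order of the steps matters.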
The last and most important step is the word cloud formation.
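The plotting code is not shown in this report. A minimal sketch of how a word cloud could be drawn with the wordcloud and RColorBrewer packages is given below; the term vector is hypothetical toy data standing in for the cleaned corpus, and the palette and min.freq value are assumptions rather than the settings actually used.

```r
# Hypothetical stemmed terms standing in for the cleaned tweet corpus
terms <- c("androidgam", "gameinsight", "androidgam", "app", "gameinsight",
           "androidgam", "phone", "app", "androidgam")

# Count how often each term occurs, most frequent first
freq <- sort(table(terms), decreasing = TRUE)

# Draw the cloud only if the plotting packages are installed
if (requireNamespace("wordcloud", quietly = TRUE) &&
    requireNamespace("RColorBrewer", quietly = TRUE)) {
  wordcloud::wordcloud(words = names(freq), freq = as.numeric(freq),
                       min.freq = 1, random.order = FALSE,
                       colors = RColorBrewer::brewer.pal(8, "Dark2"))
}
```

With the real data, the frequencies would come from the corpus built earlier instead of a hand-written vector; words that occur more often are drawn larger.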
Output
CONCLUSION
We can conclude from the word cloud that the most talked-about topics concerning
Android on Twitter are "androidgam", which is probably the stemmed form of "Android
gaming", and something called Gameinsight.