Some months ago, I played with Un*x command-line tools to parse my tweets fetched from BackupMyTweets. Here is a more elegant way to do so with R.
Well, the code is rather simple and most of what we need is already available through the twitteR package:
library(twitteR)
library(stringr)
my.tweets <- userTimeline("chlalanne", n=1000)
Suppose I want to display the frequency of tags I use in my messages:
find.tag <- function(x) unlist(str_extract_all(x$getText(), "#[A-Za-z0-9]*"))
# a little test to see whether it works or not
# for (i in 1:20) cat(i, ":", find.tag(my.tweets[[i]]), "\n")
my.tags <- lapply(my.tweets, function(x) try(find.tag(x), silent=TRUE))
# spell out the argument name rather than relying on partial matching
sort(table(unlist(my.tags)), decreasing=TRUE)
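The extraction and counting steps can also be tried offline, without fetching anything from Twitter. The following is only a sketch: the `sample.texts` vector and the `extract.tags()` helper are made up for illustration, and base R's `regmatches()` and `gregexpr()` stand in for `str_extract_all()`. The pattern uses `+` rather than `*`, on the assumption that a lone "#" should not count as a tag.

```r
# Hypothetical sample messages standing in for the text of fetched tweets
sample.texts <- c("Reading about #rstats and #latex today",
                  "New post on #rstats",
                  "No tag in this one",
                  "#rstats #psychometrics notes")

# Hypothetical helper: base R counterpart of str_extract_all(),
# returning one character vector of matches per message
extract.tags <- function(txt) regmatches(txt, gregexpr("#[A-Za-z0-9]+", txt))

# Flatten the per-message matches and tabulate them, most frequent first
sample.tags <- unlist(extract.tags(sample.texts))
tag.freq <- sort(table(sample.tags), decreasing = TRUE)
print(tag.freq)
```

Messages without any tag simply contribute an empty vector, which `unlist()` drops silently.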
To get the number of records I have:
me <- getUser("@chlalanne")
me$statusesCount # or statusesCount(me)
(It works without the @ too.)
We can make a quick and dirty word cloud as follows:
library(snippets)
wcl <- table(unlist(my.tags))
names(wcl) <- str_replace_all(names(wcl), "#", "")
cloud(wcl[wcl > 5])
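The preparation of the table passed to cloud() can be checked separately from the plotting itself. Here is a small sketch under stated assumptions: the counts in `tag.counts` are invented, and `sub()` from base R replaces `str_replace_all()` so that nothing beyond base R is needed.

```r
# Hypothetical tag counts standing in for table(unlist(my.tags))
tag.counts <- table(c(rep("#rstats", 8), rep("#latex", 6), rep("#emacs", 2)))

# Drop the leading "#" so that only the tag text is displayed
names(tag.counts) <- sub("^#", "", names(tag.counts))

# Keep tags used more than 5 times, as in the cloud() call
frequent <- tag.counts[tag.counts > 5]
print(frequent)
```

Filtering on a threshold keeps the cloud readable when many tags were used only once or twice.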
Other random notes:
See help(registerTwitterOAuth), which I haven't explored much so far.
The question of using twitteR to update analysis status online was raised on Stack Overflow ("How to insert variables in R twitteR updates?"). Note that it uses the older R API, so commands like initSession() are no longer available.