Whilst I love text mining a classic work of Western literature, this time I decided to stay in the present century with the Twitter feeds of the leaders of the three major parties heading into a federal election Down Under.
Malcolm Turnbull is the prime minister and leads the Liberal Party; he’s in blue. Bill Shorten is the opposition leader and head of the Labor Party; he’s in red. Richard Di Natale is the leader of the Greens; he’s in green because I had no choice there.
The outcomes are pretty interesting: apparently AMP was A.Big.Deal. lately. “Jobs” and “growth” were the other words playing on repeat for the two major parties.
Each one has a distinct pattern, however. Turnbull is talking about AMP, plans, jobs, future, growth. Shorten is talking about labor, AMP, medicare, budget, schools and jobs. Di Natale is talking about the Greens, AMP (again), electricity, and auspol itself. Unlike the others, Di Natale was also particularly interested in farmers, warming and science.
The programming and associated sources are pretty much the same as for the Aeneid word cloud, except I used the excellent twitteR package you can find out about here. This tutorial on R Data mining was the basis of the project. The size of the corpora (that would be the plural of corpus, if you speak Latin) presented a problem and these resources here and here were particularly helpful.
For reference, I pulled the tweets from the leaders’ timelines on the evening of 23/05/16. The same code gave me 83 tweets from Bill Shorten, 59 from Malcolm Turnbull and 33 from Richard Di Natale. All three leaders are furious tweeters, so if anyone has any thoughts on why twitteR responded like that, I’d be grateful to hear.
The minimum frequency for a word to enter a word cloud was 3 for Shorten, whose timeline yielded more tweets, but only 2 for Di Natale and Turnbull, who had fewer words available.
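To make the threshold idea concrete, here is a minimal sketch of the filtering step. It is written in Python rather than the R used for this post, so treat it as an illustration of the idea and not the actual code behind the clouds. The stop-word list, the function names and the sample tweets are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "and", "to", "of", "for", "in", "is", "our", "we"}


def word_frequencies(tweets):
    """Count lower-cased alphabetic words across all tweets, skipping stop words."""
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z]+", tweet.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


def cloud_terms(tweets, min_freq):
    """Keep only the words that occur at least min_freq times."""
    return {w: n for w, n in word_frequencies(tweets).items() if n >= min_freq}


# Made-up tweets, not real data from any leader's timeline.
sample = [
    "Jobs and growth for the future",
    "Our plan is jobs and growth",
    "A plan for jobs",
]

print(cloud_terms(sample, 2))
print(cloud_terms(sample, 3))
```

With a small corpus, raising the threshold from 2 to 3 strips out most of the words, which is why the smaller timelines needed the lower cut-off.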
I’ll try this again later in the campaign and see what turns up.