The PoliTwi project: Detection of emerging political topics in Twitter
By Sven Rill, Hof University of Applied Science.
The impact of social networks like Twitter has increased over the past few years, especially in the area of politics. In particular, Twitter provides a platform on which discussions of various topics can be detected earlier than through other standard information channels. The following real-world example illustrates this. During a press conference held as part of US President Barack Obama's visit to Berlin, the German Federal Chancellor Angela Merkel said:
„Das Internet ist für uns alle Neuland.“ (The Internet is new territory for all of us.)
This statement was made on June 19, 2013 at 12:48 p.m. By 12:50 p.m., the first tweet with the hashtag „#neuland“ (#new territory) had already appeared on Twitter:
Figure: Tweet from @flueke: „#neuland. Ich glaub Angela Merkel hat jetzt ein Meme an der Backe.“ (#new territory. I think Angela Merkel is burdened by a meme now.)
Minutes later, many similar tweets followed, making „#neuland“ a new Top Topic on Twitter. This example demonstrates how quickly news spreads via Twitter.
We have designed a system called PoliTwi, which detects emerging political topics (Top Topics) on Twitter earlier than other standard information channels. The main focus is on fast detection based on a few tweets at an early stage of a discussion. We were able to show that new topics appearing on Twitter can be detected right after their occurrence, and that they can be observed in our system about one day earlier than in Google Trends.
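The idea of spotting a topic from only a few early tweets can be illustrated with a minimal sketch. This is not the PoliTwi algorithm, which is not specified here; it is a simple rate comparison under stated assumptions. The function name `emerging_hashtags` and the parameters `min_count` and `ratio` are hypothetical, and tweets are assumed to be plain strings.

```python
from collections import Counter


def emerging_hashtags(baseline_tweets, current_tweets, min_count=3, ratio=5.0):
    """Return (hashtag, score) pairs for hashtags that are much more frequent
    in the current window than in the baseline window.

    Illustrative assumption: a hashtag counts as "emerging" if it occurs in at
    least `min_count` current tweets and its per-tweet rate is at least
    `ratio` times its smoothed baseline rate.
    """

    def hashtag_counts(tweets):
        counts = Counter()
        for text in tweets:
            # Count each hashtag at most once per tweet, ignoring case and
            # trailing punctuation such as "#neuland."
            tags = {
                word.lower().rstrip(".,!?")
                for word in text.split()
                if word.startswith("#") and len(word) > 1
            }
            counts.update(tags)
        return counts

    base = hashtag_counts(baseline_tweets)
    cur = hashtag_counts(current_tweets)
    n_base = max(len(baseline_tweets), 1)
    n_cur = max(len(current_tweets), 1)

    result = []
    for tag, count in cur.items():
        if count < min_count:
            continue
        cur_rate = count / n_cur
        # Add-one smoothing so unseen hashtags do not cause division by zero.
        base_rate = (base.get(tag, 0) + 1) / (n_base + 1)
        score = cur_rate / base_rate
        if score >= ratio:
            result.append((tag, score))
    # Highest score first.
    return sorted(result, key=lambda item: item[1], reverse=True)
```

With a low `min_count`, a hashtag such as „#neuland“ would be flagged after only a handful of tweets, at the cost of more false alarms; the right trade-off depends on the tweet volume.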
The system was tested by analyzing over 4 million tweets before and immediately after the 2013 German parliamentary elections, from April 2013 until September 2013. We then implemented a second system, called “PoliTwi US”, for the English language and the US elections.
For our analysis, we have collected about 13,000,000 German and 58,000,000 English political tweets. At the start of the project we collected about 20,000 tweets per day in Germany and 440,000 tweets per day in the United States. These values increased continuously until the election days. On the respective election days we collected about 240,000 tweets in Germany and more than 1,000,000 tweets in the US. During the TV debate between the two German top candidates Angela Merkel and Peer Steinbrück, we measured the highest tweet rate with up to 1,200 tweets per minute.
All topics extracted since the start of the project are available on our project websites, http://www.politwi.de and http://us.politwi.de.
Figure: Political topics of the previous hour (left) and the last day (right). The topic points (tp) rank the Top Topics on a scale between 0 and 100.
Our system automatically generates its own tweets on a regular basis, announcing the current Top Topics. Two Twitter channels are used for this, see https://twitter.com/politwi and https://twitter.com/politwius. There we communicate as many Top Topics as fit into a single tweet, sorted by Topic Value. The Twitter channels turned out to be the most effective way of distributing the Top Topics.
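Packing ranked topics into a single tweet can be sketched as follows. This is only an illustration, not the code behind the PoliTwi channels: the function name `compose_tweet`, the prefix text, and the separator are hypothetical, and the default `limit` of 140 characters assumes the tweet length limit that applied in 2013.

```python
def compose_tweet(topics, prefix="Top Topics: ", limit=140, sep=" "):
    """Build one tweet that lists as many topics as fit, highest value first.

    `topics` maps a hashtag to its topic value (assumed: higher is more
    important). Packing stops at the first topic that does not fit, so the
    tweet always contains an unbroken top segment of the ranking.
    """
    ranked = sorted(topics.items(), key=lambda item: item[1], reverse=True)
    text = prefix
    included = 0
    for tag, _value in ranked:
        # The first topic follows the prefix directly; later ones need a separator.
        piece = tag if included == 0 else sep + tag
        if len(text) + len(piece) > limit:
            break
        text += piece
        included += 1
    return text
```

Stopping at the first topic that does not fit is a design choice: skipping it and trying shorter, lower-ranked topics would fill the tweet more completely, but readers could no longer assume that the listed topics are the top of the ranking.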
Furthermore, we have extended our system with a sentiment analysis component to detect the polarity of topics marked by hashtags. For this, we use special Twitter hashtags, called sentiment hashtags, which people use to tag their opinion about politicians or parties. Our idea is to build relation graphs for emerging political topics, enriched with information such as context and polarity. These graphs can later be used to extend an existing web ontology or a semantic network by a new dimension, which will help improve concept-level sentiment analysis methods that rely on such knowledge bases. We would like to answer questions like: What polarity does an emerging topic, e.g., the hashtag “#neuland”, carry in this political context? How does the use of this topic change the polarity of a tweet?
To evaluate the polarity, we used sentiment hashtags and examined different time periods, some involving the topic under investigation and some not. We isolated the period in which the new topic came up and defined two reference time periods, one before and one after the emergence of the new topic. Then we built the graph and analyzed how the polarity of adjacent vertices is changed by the presence of the new topic. To do so, we measured the polarity in the reference time periods and compared it to the polarity in the time period under investigation.
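The comparison between reference periods and the period under investigation can be sketched in a few lines. This is a simplified illustration under several assumptions, not the published method: the sentiment hashtags `#good` and `#bad` and their scores are placeholders, tweets are assumed to be `(timestamp, set_of_hashtags)` pairs, and an "adjacent vertex" is approximated by a hashtag that co-occurs with the new topic.

```python
from statistics import mean

# Hypothetical sentiment hashtags and scores; the real ones are not listed here.
SENTIMENT_HASHTAGS = {"#good": 1.0, "#bad": -1.0}


def polarity(tweets, tag, start, end, with_topic=None):
    """Mean sentiment-hashtag score of tweets in [start, end) that contain `tag`.

    If `with_topic` is given, only tweets that also contain that topic count.
    Returns None when no sentiment hashtag was observed.
    """
    scores = []
    for timestamp, tags in tweets:
        if not (start <= timestamp < end) or tag not in tags:
            continue
        if with_topic is not None and with_topic not in tags:
            continue
        scores.extend(SENTIMENT_HASHTAGS[t] for t in tags if t in SENTIMENT_HASHTAGS)
    return mean(scores) if scores else None


def polarity_shift(tweets, neighbor, topic, ref_before, period, ref_after):
    """Polarity of `neighbor` together with `topic` in `period`, minus its
    average polarity in the two reference periods. Each period is a
    (start, end) pair. Returns None if there is not enough data.
    """
    reference = [
        p
        for p in (
            polarity(tweets, neighbor, *ref_before),
            polarity(tweets, neighbor, *ref_after),
        )
        if p is not None
    ]
    observed = polarity(tweets, neighbor, *period, with_topic=topic)
    if not reference or observed is None:
        return None
    return observed - mean(reference)
```

A negative result suggests that the neighboring hashtag is discussed more negatively when it appears together with the new topic than in the reference periods; a positive result suggests the opposite.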
Planned extensions of the project include using sentiment analysis to investigate whether highly controversial topics emerge faster than others, exploiting additional meta-information such as geo-information to examine the spatial distribution of topics, and investigating whether a context can be derived from jointly occurring political topics.
References:
Sven Rill, Dirk Reinel, Jörg Scheidt, and Roberto V. Zicari. PoliTwi: Early detection of emerging political topics on Twitter and the impact on concept-level sentiment analysis. Knowledge-Based Systems, 69: 24–33, 2014. ISSN 0950-7051.
doi: http://dx.doi.org/10.1016/j.knosys.2014.05.008
URL http://www.sciencedirect.com/science/article/pii/S0950705114001920