Bringing tools for media monitoring to the public: outcomes of the ODYCCEUS summer school on ‘Democracy in the Age of Big Data and AI’ (FETFX Future Tech Week contribution)


Social science is undergoing a revolution. The rise of social media, smart cities, digital archives, the quantified self, and personal agents is providing an unprecedented amount of data about the social relations and behaviors of very large groups of people, creating an opportunity for a more empirically grounded social theory. But this opportunity can only be realized by using powerful tools: for analyzing textual data, for finding patterns in data, and for visualizing those patterns in a way that brings out their meaning. These tools increasingly rely on complex systems science and Artificial Intelligence.

At the same time, collective decision-making in contemporary societies is also undergoing profound change. Social media, Big Data, and AI are already having a major impact on political and social processes. They have empowered ‘citizen-driven’ political movements such as the Arab Spring or the French ‘Gilets Jaunes’ protests. They also provide new means to strategically influence public opinion, as was seen before the Brexit referendum and the US presidential elections in 2016. Similarly, they can lead to rapid polarization in political debates, such as those concerning migration or identity, and accelerate the spread of fake news or hate speech.

These dynamics were the focus of the recent ODYCCEUS summer school on ‘Democracy in the Age of Big Data and AI’. In honour of the FETFX Future Tech Week, this page presents some of the summer school’s outcomes, demonstrating the on-going development and potential applications of media-monitoring tools in the Penelope ecosystem.    

Building (social) media observatories

The ODYCCEUS summer school mixed theoretical lectures with practical hands-on sessions and ateliers to examine how novel tools of computational social science can help us understand these phenomena and possibly facilitate future democratic decision processes. It introduced social scientists and media researchers to the latest methods, tools, and techniques, and introduced AI researchers and complex systems scientists to the approaches and issues of social science, so that they can come up with new tools or refine existing ones.

More specifically, participants learned how they could build Opinion Observatories that tap into social media to collect information about how certain actors are trying to manipulate political opinion in elections, how fake news gets fabricated and spreads, or how opinions get polarized and shift. To this end, participants engaged in a series of ateliers guided by tutors, each of which addressed a case study. Case studies considered a specific contentious issue using specific data sources.

Case 1: Climate Change (tutored by Artificial Intelligence Lab, Vrije Universiteit Brussel)

In this workshop, participants used data from the Guardian news website, together with Penelope components, to develop a ‘science tracker’: a pipeline to reveal and track references to scientific publications and research institutions in news articles and comments related to climate change. Participants thus built components to trace, among other things, how scientific references in articles relate to those in reader comments, and how often scientific references occur in articles over time. In doing so, they laid a basis for further exploration of the role of scientific literature within the climate change debate.
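As an illustration of what one step of such a pipeline might look like, the following Python sketch extracts references from article text and counts them per month. It is not the workshop's implementation: the DOI pattern, the `INSTITUTIONS` watch-list, and the function names `find_references` and `references_per_month` are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed pattern: DOIs as they commonly appear in links or running text.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")

# Hypothetical watch-list; a real tracker would need a curated list of institutions.
INSTITUTIONS = ["IPCC", "NASA", "Met Office"]


def find_references(text):
    """Return the DOIs and watch-listed institutions mentioned in a text."""
    # Strip punctuation that tends to trail a DOI in running text.
    dois = [match.rstrip(".,;)") for match in DOI_PATTERN.findall(text)]
    institutions = [
        name for name in INSTITUTIONS
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    ]
    return {"dois": dois, "institutions": institutions}


def references_per_month(articles):
    """Count references per month; `articles` holds (iso_date, text) pairs."""
    counts = Counter()
    for iso_date, text in articles:
        refs = find_references(text)
        # The first seven characters of an ISO date are "YYYY-MM".
        counts[iso_date[:7]] += len(refs["dois"]) + len(refs["institutions"])
    return dict(counts)
```

Applied to both articles and their comments, the same two functions would give a first, rough view of where and when scientific sources enter the debate.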

Case 2: Antagonisms, coalitions and populism (tutored by Max Planck Institute for Mathematics in the Natural Sciences, Leipzig)

The participants of the workshop decided to focus on a very specific discourse, which took place in the shadow of the big Brexit topic dominating UK politics: the debate about badger culling in the UK, prompted by the high number of cases of bovine tuberculosis among cattle. The illness is partly spread by the badger population in the UK, and the Department for Environment, Food and Rural Affairs has hence initiated the culling of badgers to reduce the reservoir of infection in wildlife.

Two data sources were used in the workshop: Twitter posts from the week of the summer school, and all parliamentary speeches in the UK House of Commons since 2016. The aim was to produce an analysis that combines close and distant reading to gain a comprehensive view of the debate.

Starting off with a relatively close reading of the culling debate in the UK parliament, a new component was introduced and applied to the speeches: statement graphs. In a statement graph, sentences that include the same words are linked to each other. This yields an overview of the language and topic structure of speeches and can be used to visualize differences and similarities in the expressions and words used by different parties and politicians, or at different times.
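The linking rule behind a statement graph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the component developed at the workshop: the stop-word list, the `min_shared` threshold, and the function names are hypothetical.

```python
import re
from itertools import combinations

# Small illustrative stop-word list; an assumption, not the workshop's list.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are",
              "on", "for", "that", "we", "be", "must"}


def content_words(sentence):
    """Lower-case a sentence and return its set of non-stop-words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return {word for word in tokens if word not in STOP_WORDS}


def statement_graph(sentences, min_shared=1):
    """Link every pair of sentences sharing at least `min_shared` content words.

    Returns a dict mapping (i, j) sentence-index pairs to the sorted shared words.
    """
    words = [content_words(sentence) for sentence in sentences]
    edges = {}
    for i, j in combinations(range(len(sentences)), 2):
        shared = words[i] & words[j]
        if len(shared) >= min_shared:
            edges[(i, j)] = sorted(shared)
    return edges
```

Keeping the shared words on each edge makes it possible to label links in a visualization, and tagging each sentence with its speaker or party would allow the comparison across parties described above.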

The participants then analyzed the language of the different parties in the UK parliament, and found that the two major parties of the UK talk differently about the cullings: Labour politicians in a more detailed and emotional way, Tories in more abstract and rational terms.

Twitter was used to get an impression of the discourse on social media. Traffic on Twitter was accelerated by the announcement on September 11 that the culling would be extended to eleven new areas. Analysis of the retweet network of posts including ‘badger’ and ‘cull’ showed that, to a large extent, only the voices opposing the culling measures were present and shared on Twitter. Statements in favour of the culling were largely missing. An interesting follow-up question hence arose: how do we deal with data that is not there?
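A retweet network of this kind can be represented as a weighted directed graph from retweeting accounts to the accounts they retweet. The Python sketch below assumes the tweets have already been collected and reduced to (retweeter, original author) pairs, and that accounts have been hand-labelled by stance; both the input format and the function names are assumptions for illustration.

```python
from collections import Counter, defaultdict


def retweet_network(retweets):
    """Build a weighted directed graph from (retweeter, original_author) pairs."""
    graph = defaultdict(Counter)
    for retweeter, author in retweets:
        graph[retweeter][author] += 1
    return graph


def most_retweeted(graph, top_n=3):
    """Rank accounts by the number of retweets they received."""
    received = Counter()
    for targets in graph.values():
        received.update(targets)
    return received.most_common(top_n)


def stance_share(graph, stance_by_account):
    """Share of all retweets by the hand-labelled stance of the retweeted account."""
    totals = Counter()
    for targets in graph.values():
        for author, count in targets.items():
            # Accounts without a label are grouped under "unknown".
            totals[stance_by_account.get(author, "unknown")] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {stance: count / grand_total for stance, count in totals.items()}
```

A strongly one-sided result from `stance_share` only describes what was shared in the collected sample; it says nothing about opinions that were never posted, which is exactly the follow-up question raised above.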

The results showed that even for a ‘minor’ political topic (overshadowed by the very heated Brexit debate going on at the same time), the methods and pipelines the participants used and developed provided valuable insights at different scales of data aggregation and visualization, from close to distant reading. This approach should be pursued further in the ongoing development of Penelope. The simple visualization techniques of the statement and retweet graphs turned out to provide a valuable systematic perspective on the data. More qualitatively oriented researchers can also profit from these tools, which we plan to add to Penelope soon.

Click here for the full presentation

Click here for a demonstrator with the statement graph

Case 3: Hate and excitable speech (tutored by the University of Amsterdam (Digital Methods Initiative))  

(More content coming soon)

Case 4: Migration and borders (tutored by the University of Paris, 7)

(More content coming soon)

Tom Willaert
