Interdisciplinary Approaches to Social Media Analysis
Social Media Analyses (SMA) are used both, in academia and in professional settings. Depending on the research agenda, different methodologies may be applied (Kanthawala et al. 2022; Rejeb et al. 2022). In our course, we focus on the academic exploration of Social Media. We place particular emphasis on questions related to media, politics, and society. This represents a confluence of communication science and political science, intertwined with computational methods.
Cultural Analytics
Cultural analytics, as explained in the introductory chapter of the book “Cultural Analytics” by Lev Manovich, is a field that uses computers to analyze and understand large amounts of cultural information or “big cultural data”. This might include exploring big collections of images, videos, or other media data to see patterns and trends that are happening in digital culture. Manovich talks about some key questions and challenges in cultural analytics. For example, one big question is whether we should focus on finding common themes and patterns in our data, or whether we should pay more attention to things that are unusual or rare. Also, while cultural analytics can be a powerful tool for understanding aspects of culture, especially in the digital world, Manovich tells us to be aware of its limits. He says that computers and data analysis can tell us a lot, but they can’t understand culture in the rich and deep way that humans can, especially when it comes to understanding things like aesthetics (beauty, style, etc.). So, while cultural analytics can help us see large scale patterns and trends in culture, Manovich advises us to also appreciate and be aware of what it can’t see or understand. The field of cultural analytics then becomes a space where we use computational tools to explore and question culture, while also being mindful of the limitations and challenges of using these tools (Manovich 2020).
Digital Methods
“Digital Methods,” as introduced by Rogers (2013), proposes a paradigm wherein the internet is both a site and a source for research, especially for social media studies. Unlike conventional research approaches that see the internet merely as a tool or data source, Rogers advocates for a methodology that is intrinsically web-centric, understanding and employing the unique dynamics and mechanics of the digital medium itself. An example for a digital methods research project is understanding algorithmic operations, especially of search engines like Google, and comprehending their impact on digital culture, information accessibility, and user engagement. This perspective is important to explore the foundations of how information is organized, ranked, and accessed online. Studying the digital medium itself means to study web-native phenomena such as hyperlink networks, search engine behaviors, and social media activities to uncover patterns, tendencies, and hierarchical structures within digital cultures and societies.
The concepts of cultural analytics and digital methods will guide us through our semester and our projects: We borrow the idea to use computational methods in order to understand “big cultural data” form Manovich and the concept of studying the digital medium itself from Rogers. Throughout the semester will enrich our projects through your own literature and theory based on the research interests. Beyond these foundations, we will borrow from i.e. the Computational Social Sciences (Lazer et al. 2009), the concept of Distant Viewing (Arnold and Tilton 2019), or Grammars of Action (Agre 1994; Gerlitz and Rieder 2018; Bainotti, Caliandro, and Gandini 2020; Omena, Rabello, and Mintz 2020), and Platform Vernaculars (Gibbs et al. 2015).
Legal & Ethical Challenges
This subchapter scratches the surface. Recommended reading: Haim (2023) pp. 62–69; 126–128.
When working with social media data, we’re dealing with personal information. As such we need to take into account legal and ethical considerations. From the legal perspective we need to focus on two aspects: The ownership of the data, and – when dealing with personal data – the GDPR. For the latter we need to take into account consent and should think about pseudonymisation or anonymisation of our data (Haim 2023). Further, the German Urheberrecht, the equivalent of the anglo-saxon copyright law (there are important differences, see Bundeszentrale für politische Bildung for a synopsis), defines exceptions for scientific research: I recommend the publication by Rat für Sozial- und Wirtschaftsdaten (RatSWD) (2019) which takes a closer look at the database law and provides some practical guidance (more in our slides).
The importance of the legal perspective social media research grew recently: Following the Cambridge Analytica scandal Meta platforms (like Instagram) started closing down on APIs, which would have offered a legal and accepted (by the plattform) point of access for researchers. I recommend to read McCrow-Young’s (2021) article, as she demonstrates how academic research may be interrupted by platform changes, like the closure of the Instagram-API in the wake of above incident. Post-API social media research found creative ways to access the data: Bainotti, Caliandro, and Gandini (2020), for example, took a unique approach for data collection by capturing Instagram content through YouTube videos. Recent publications on Instagram analyses, and most approaches in our future session, rely on crawling and scraping. Venturini and Rogers (2019) see a chance in the API-closure and argue that these techniques are “more than a ‘necessary evil’”, as it might force researchers to come back to (digital) field work.
Finally a word about reserach ethics. While the GDPR provides a rigid legal framework for dealing with personal information, I’d like to recommend the article “But the Data is Already Public” by Zimmer (2010). The article documents how, in a matter of days, an anonymous dataset of 1700 facebook profiles became (partly) deanonymized. Based on this case study, the author compiles ethical concerns for future research, which we should also incorporate into our work.
Methodology
In this chapter we are going to take a look at different methods for use with social media research, and particularly, with our projects. We are going to use (Visual) Content Analysis to understand the content of posts and stories. The concept of Plattform Affordances will help us understand these posts and stories as embedded in the platform and its available functions and options. Finally, the idea of Platform Vernaculars & Grammars serves as a guide to wire everything up, to discover patterns and trends in how users communicate and engage on these platforms.
(Visual) Content Analysis
We are going to apply quantitative content analyses to our corpora. For a quantitative approach we are going to operationalize our theory-based interests and questions using formal and / or content features. Next, we need to apply the operationalization to the documents, in form of human annotations or computational coding (see Döring and Bortz 2016). Döring and Bortz (2016) outline a general approach to content analysis, Rose (2016) in contrast concentrates on visual content analyses. She suggests four steps:
- “Finding your Images.
- Devising your categories for coding.
- Coding the images.
- Analysing the results.” – (Rose 2016 ch. 5)
The challenge of the first step is the sampling: Even with computational approaches, is it feasible to collect everything? The cultural analytics approach suggests such a goal, e.g. in order to obtain data and traces of subcultures. Due to practical limitations also Manovich’s works use an approach to break the large amount of available data into a smaller portion (see Hochman and Manovich 2013). This approach is called sampling, Rose (2016) introduces several sampling approaches like random, stratified, systematic, or cluster sampling. Döring and Bortz (2016) provide a deeper look into sampling strategies.
The codes, for the second step, may be devised from a qualitative exploration of the data or theories and related work. In context of our projects we are going to use both approaches: We will annotate a subset of our data as ground-truth while coding the total data using computational approaches. On code development there exists another large body of literature, like the Grounded Theory (e.g. Corbin and Strauss 2008) and Ethnic Coding Approach (Altheide 1987).
For the final analysis we are going to apply statistical data analyses. For an initial understanding of our data we will start with some exploratory analyses, e.g. plotting the data. In combination with the two approaches below, the platform affordances and platform vernaculars & grammars we may discover patterns of social media use. In most cases, our projects will compare different groups: These groups might be different user types (e.g. Politician Accounts vs. Party Accounts), or different Posts types (e.g. Posts vs. Stories), or different platforms (e.g. Instagram vs. TikTok).
Platform Affordances
Bossetta (2018) provides an overview of the concept of affordances and their application in social media analyses. He traces the term back to boyd and Papacharissi & Yuan who argued “that digital communication tech- nologies provide structural affordances to agents” (p. 473 Bossetta 2018). There are two important take-aways from his work: 1) The concept of affordances is not used consistently, and 2) the platforms shape affordances and thereby how users interact with the platform. Bainotti, Caliandro, and Gandini (2020) used the “Instagram-specific digital objects” as codes for their analysis of stories, linking the concept of affordances in the context of Instagram to the use of stickers.
In the context of our seminar we might consider the following elements as platform affordances:
TikTok | IG – Posts | IG – Stories |
---|---|---|
Likes | Likes | Sliders |
Comments | Comments | Votes |
Shares | Views | Questions |
Music | Mentions | Mentions |
Hashtags | Hashtags | Hashtags |
… | … | Locations |
… |
Did you spot the difference between some of the listed affordances? Likes and comments, for instance, are reactions to posts. Would you consider these features as affordances? Let’s discuss this is in class!
Platform Vernaculars & Grammars
Previous studies have looked into ‘grammars’ in Instagram stories. Originally linked to research on privacy (Agre 1994), grammars classify activities using specific types, making data collection and analysis easier. This uncovers patterns in user behavior, beneficial for purposes such as advertising. To the best of my knowledge, this concept was first used for social media data by Gerlitz and Rieder (2018) in a Twitter study.
Omena, Rabello, and Mintz (2020) discussed a “grammar of hashtags”, referring to the rules of hashtag use and how they’re organized on platforms. They suggest that hashtags, content visibility, and the nature of the content itself are essential in understanding hashtag use. Meanwhile, Bainotti, Caliandro, and Gandini (2020) used grammars to understand Instagram Stories, focusing on visual elements and their cultural meanings.
Lastly, Gibbs et al. (2015) examined the unique styles and logics of social media, termed “platform vernaculars”. These are influenced both by platform features and user habits.
Summary
In this chapter we have positioned ourselves between several disciplines: The computational social science, computational communication science, and digital humanities. In this position, we see social media data as trace data of human and social behaviour. The digitalness of our subject is, however, just one side of the coin: Follwing the theoretical frameworks of Digital Methods and Cultural Analytics, we want to conduct our analyses computationally with the aim to uncover patterns and trends of user behaviour on social media plattforms. Methodologically we can draw from quantitative content analysis, and the concept of platform affordances as features, and apply the concept of platform vernaculars and grammars to make sense of these features.
Additional Resources
Conferences
- International Conference on Social Media & Society
- IC²S² 2022
- ICWSM
- AoIR
- WebSci
- International Conference on CMC and Social Media Corpora for the Humanities
Journals
- New Media & Society
- Big Data & Society
Textbooks
Online Resources
Do you know of any ressources to be added to this list? Drop me a line: michael.achmann@ur.de.
References
Reuse
Citation
@online{achmann-denkler2023,
author = {Achmann-Denkler, Michael},
title = {Interdisciplinary {Approaches} to {Social} {Media}
{Analysis}},
date = {2023-10-23},
url = {https://social-media-lab.net/getting-started/theory.html},
doi = {10.5281/zenodo.10039756},
langid = {en}
}
Social Media Analyses in different contexts
Bridging this discussion, there are several disciplines pivotal to the academic analysis of social media data at this intersection: Lazer et al. (2009) outlined in an influencial article computational social science as an emerging field that built on the ability to collect and analyze vast amounts of data. The goal of the computational social science, according to this article, is to reveal patterns in human interactions, benefiting from various data sources such as emails, phone records, online social networks, and other digital traces left by individuals. We are going to concentrate on social media data, a type of data described by Quan-Haase and Sloan (2022a) as incidental, since the data exists and is being created, no matter the researchers observing them – or not. One special type of data, Instagram stories, even have an ephemeral character. 24 hours after posting the story expires – becoming invisible for followers and researchers alike (see also Leaver, Highfield, and Abidin 2020 on the importance of stories). Atteveldt and Peng (2018) noted a surge in the use of computational methods in communication science, attributing it to three primary factors: the availability of digital data, sophisticated data analysis tools, and the emergence of cost-effective, potent processing capabilities complemented by accessible computing infrastructure. Building on this perspective, Haim (2023) sees the computational communication science as a sub-discipline of communication science that addresses digitally altered objects of research, which require computational approaches to tackle to amount and complexity of this special type of data.
In the realm of digital humanities, computational approaches to text analysis have a long history, influenced by concepts such as distant reading (Moretti 2000) and macroanalysis (Jockers 2013). Manovich picks up these concepts in his cultural analytics, see below. Lately also distant viewing has been outlined, as “a methodological and theoretical framework for the study of large collections of visual materials” (Arnold and Tilton 2019). I see potential in integrating approaches and methods from the digital humanities into social media analysis. Vice versa, there’s also potential in utilizing methods used for social media analysis to address questions in the humanities.
Challenges for social media analyses have been outlined by Quan-Haase and Sloan (2022a): the role of theory, representativeness of data, scale, multimodality, data accessability, and legal and ethical considerations. Through our semester we are going to work on several of those challenges: In the Operationalization session we are going to talk about data-driven approaches (bearing in mind Anderson et al. 2008), as well as theories as basis for your research questions and operationalizations. The representativeness of data will be the challenge for our data collection sessions: We will not just answer how to collect data, but also what data to collect. The two challenges left are at the centre of our seminar: Our answer for the challenge of scale is to apply computational methods for data analysis, to process data at scale. Multimodality is another key issues of this seminar: We want to computationally process visual (or multimodal) data. We will talk about accessability problems throughout our data collection classes, and talk about legal and ethical issues on this page.
Keeping these introductory considerations in mind, we immerse into a short outline of two theories: Cultural Analytics and Digital Methods, as foundational elements for social media research. Subsequently, we’ll address the ethical and legal challenges associated with analyzing social media. We’ll conclude the chapter by presenting an array of methodologies. In the related work chapter, you’ll find an overview of research on Instagram and TikTok content, even extending beyond our primary topics of interest.
The intent of this article is to provide a brief introduction to the field of computational social media analysis, tailored for my Winter 2023/24 seminar. It offers only a cursory glance at various theories and methodologies. As such, please do not regard the content of this page as a definitive scientific piece. Instead, view it as a compass to guide and inspire your own research endeavors. For a deeper dive into the theory of Digital Media in Politics and Society see the lecture by Prof. Jungherr.