Interdisciplinary Approaches to Social Media Analysis

Author

Michael Achmann-Denkler

Published

October 23, 2023

Social Media Analyses (SMA) are used both in academia and in professional settings. Depending on the research agenda, different methodologies may be applied (Kanthawala et al. 2022; Rejeb et al. 2022). In our course, we focus on the academic exploration of social media. We place particular emphasis on questions related to media, politics, and society. This represents a confluence of communication science and political science, intertwined with computational methods.

Social Media Analyses in different contexts

Bridging this discussion, several disciplines are pivotal to the academic analysis of social media data at this intersection. In an influential article, Lazer et al. (2009) outlined computational social science as an emerging field built on the ability to collect and analyze vast amounts of data. The goal of computational social science, according to this article, is to reveal patterns in human interactions, benefiting from various data sources such as emails, phone records, online social networks, and other digital traces left by individuals. We are going to concentrate on social media data, a type of data described by Quan-Haase and Sloan (2022a) as incidental, since the data exists and keeps being created whether or not researchers observe it. One special type of data, Instagram stories, even has an ephemeral character: 24 hours after posting, a story expires and becomes invisible to followers and researchers alike (see also Leaver, Highfield, and Abidin 2020 on the importance of stories). Atteveldt and Peng (2018) noted a surge in the use of computational methods in communication science, attributing it to three primary factors: the availability of digital data, sophisticated data analysis tools, and the emergence of cost-effective, potent processing capabilities complemented by accessible computing infrastructure. Building on this perspective, Haim (2023) sees computational communication science as a sub-discipline of communication science that addresses digitally altered objects of research, which require computational approaches to tackle the amount and complexity of this special type of data.

In the realm of digital humanities, computational approaches to text analysis have a long history, influenced by concepts such as distant reading (Moretti 2000) and macroanalysis (Jockers 2013). Manovich picks up these concepts in his cultural analytics (see below). More recently, distant viewing has been outlined as “a methodological and theoretical framework for the study of large collections of visual materials” (Arnold and Tilton 2019). I see potential in integrating approaches and methods from the digital humanities into social media analysis. Conversely, there is also potential in using methods from social media analysis to address questions in the humanities.

Challenges for social media analyses have been outlined by Quan-Haase and Sloan (2022a): the role of theory, representativeness of data, scale, multimodality, data accessibility, and legal and ethical considerations. Throughout our semester we are going to work on several of those challenges. In the Operationalization session we are going to talk about data-driven approaches (bearing in mind Anderson et al. 2008), as well as theories as the basis for your research questions and operationalizations. The representativeness of data will be the challenge for our data collection sessions: We will not just answer how to collect data, but also what data to collect. Two further challenges are at the centre of our seminar. Our answer to the challenge of scale is to apply computational methods in order to process and analyze data at scale. Multimodality is another key issue of this seminar: We want to computationally process visual (or multimodal) data. We will talk about accessibility problems throughout our data collection classes, and about legal and ethical issues on this page.

Keeping these introductory considerations in mind, we turn to a short outline of two theories, Cultural Analytics and Digital Methods, as foundational elements for social media research. Subsequently, we’ll address the ethical and legal challenges associated with analyzing social media. We’ll conclude the chapter by presenting an array of methodologies. In the related work chapter, you’ll find an overview of research on Instagram and TikTok content, even extending beyond our primary topics of interest.

Note

The intent of this article is to provide a brief introduction to the field of computational social media analysis, tailored for my Winter 2023/24 seminar. It offers only a cursory glance at various theories and methodologies. As such, please do not regard the content of this page as a definitive scientific piece. Instead, view it as a compass to guide and inspire your own research endeavors. For a deeper dive into the theory of Digital Media in Politics and Society see the lecture by Prof. Jungherr.

Cultural Analytics

Cultural analytics, as explained in the introductory chapter of the book “Cultural Analytics” by Lev Manovich, is a field that uses computers to analyze and understand large amounts of cultural information or “big cultural data”. This might include exploring big collections of images, videos, or other media data to see patterns and trends that are happening in digital culture. Manovich talks about some key questions and challenges in cultural analytics. For example, one big question is whether we should focus on finding common themes and patterns in our data, or whether we should pay more attention to things that are unusual or rare. Also, while cultural analytics can be a powerful tool for understanding aspects of culture, especially in the digital world, Manovich tells us to be aware of its limits. He says that computers and data analysis can tell us a lot, but they can’t understand culture in the rich and deep way that humans can, especially when it comes to understanding things like aesthetics (beauty, style, etc.). So, while cultural analytics can help us see large scale patterns and trends in culture, Manovich advises us to also appreciate and be aware of what it can’t see or understand. The field of cultural analytics then becomes a space where we use computational tools to explore and question culture, while also being mindful of the limitations and challenges of using these tools (Manovich 2020).

Digital Methods

“Digital Methods,” as introduced by Rogers (2013), proposes a paradigm wherein the internet is both a site and a source for research, especially for social media studies. Unlike conventional research approaches that see the internet merely as a tool or data source, Rogers advocates for a methodology that is intrinsically web-centric, understanding and employing the unique dynamics and mechanics of the digital medium itself. An example of a digital methods research project is the study of algorithmic operations, especially those of search engines like Google, and of their impact on digital culture, information accessibility, and user engagement. This perspective is important for exploring the foundations of how information is organized, ranked, and accessed online. Studying the digital medium itself means studying web-native phenomena such as hyperlink networks, search engine behaviors, and social media activities to uncover patterns, tendencies, and hierarchical structures within digital cultures and societies.

The concepts of cultural analytics and digital methods will guide us through our semester and our projects: We borrow the idea of using computational methods to understand “big cultural data” from Manovich and the concept of studying the digital medium itself from Rogers. Throughout the semester, we will enrich our projects with your own literature and theory, based on your research interests. Beyond these foundations, we will borrow from, e.g., Computational Social Science (Lazer et al. 2009), the concept of Distant Viewing (Arnold and Tilton 2019), Grammars of Action (Agre 1994; Gerlitz and Rieder 2018; Bainotti, Caliandro, and Gandini 2020; Omena, Rabello, and Mintz 2020), and Platform Vernaculars (Gibbs et al. 2015).

Legal & Ethical Challenges

Warning

This subchapter scratches the surface. Recommended reading: Haim (2023) pp. 62–69; 126–128.

When working with social media data, we’re dealing with personal information. As such, we need to take into account legal and ethical considerations. From the legal perspective we need to focus on two aspects: the ownership of the data and, when dealing with personal data, the GDPR. For the latter we need to take consent into account and should think about pseudonymisation or anonymisation of our data (Haim 2023). Further, the German Urheberrecht, the equivalent of Anglo-Saxon copyright law (there are important differences, see Bundeszentrale für politische Bildung for a synopsis), defines exceptions for scientific research: I recommend the publication by Rat für Sozial- und Wirtschaftsdaten (RatSWD) (2019), which takes a closer look at the database law and provides some practical guidance (more in our slides).
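To make the idea of pseudonymisation more tangible, here is a minimal sketch, not a legally vetted procedure. It assumes that each collected post is a dictionary with a `username` field (a hypothetical data layout) and replaces the username with a keyed hash, so the same account always receives the same pseudonym. The key, the field names, and the `user_` prefix are illustrative assumptions.

```python
import hashlib
import hmac

# Hypothetical secret key; it should be stored separately from the data.
SECRET_KEY = b"replace-with-a-project-specific-secret"


def pseudonymise(username: str, key: bytes = SECRET_KEY) -> str:
    """Map a username to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    # Normalise first so that trivial spelling variants map to the same pseudonym.
    normalised = username.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalised, hashlib.sha256).hexdigest()
    return "user_" + digest[:12]


# Hypothetical toy records standing in for collected posts.
posts = [
    {"username": "Example_Account", "caption": "Hello"},
    {"username": "example_account ", "caption": "Again"},
    {"username": "another_account", "caption": "Hi"},
]

# Keep the content, drop the original username.
pseudonymised_posts = [
    {"author": pseudonymise(post["username"]), "caption": post["caption"]}
    for post in posts
]

for post in pseudonymised_posts:
    print(post["author"], post["caption"])
```

Note that this only replaces one identifier. Captions, images, or mentions may still reveal who posted something, so pseudonymised data should still be treated as personal data.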

The legal perspective has recently become more important for social media research: Following the Cambridge Analytica scandal, Meta platforms (like Instagram) started closing down APIs, which would have offered a legal and platform-accepted point of access for researchers. I recommend reading McCrow-Young’s (2021) article, as she demonstrates how academic research may be interrupted by platform changes, like the closure of the Instagram API in the wake of the above incident. Post-API social media research found creative ways to access the data: Bainotti, Caliandro, and Gandini (2020), for example, took a unique approach to data collection by capturing Instagram content through YouTube videos. Recent publications on Instagram analyses, and most approaches in our future sessions, rely on crawling and scraping. Venturini and Rogers (2019) see a chance in the API closure and argue that these techniques are “more than a ‘necessary evil’”, as they might force researchers to come back to (digital) field work.

Finally, a word about research ethics. While the GDPR provides a rigid legal framework for dealing with personal information, I’d like to recommend the article “But the Data is Already Public” by Zimmer (2010). The article documents how, in a matter of days, an anonymous dataset of 1700 Facebook profiles became (partly) deanonymized. Based on this case study, the author compiles ethical concerns for future research, which we should also incorporate into our work.

Methodology

In this chapter we are going to take a look at different methods for use in social media research, and particularly in our projects. We are going to use (Visual) Content Analysis to understand the content of posts and stories. The concept of Platform Affordances will help us understand these posts and stories as embedded in the platform and its available functions and options. Finally, the idea of Platform Vernaculars & Grammars serves as a guide to wire everything up, to discover patterns and trends in how users communicate and engage on these platforms.

(Visual) Content Analysis

We are going to apply quantitative content analyses to our corpora. For a quantitative approach we are going to operationalize our theory-based interests and questions using formal and/or content features. Next, we need to apply the operationalization to the documents, in the form of human annotations or computational coding (see Döring and Bortz 2016). Döring and Bortz (2016) outline a general approach to content analysis, whereas Rose (2016) concentrates on visual content analyses. She suggests four steps:

“Finding your Images.

Devising your categories for coding.

Coding the images.

Analysing the results.” – (Rose 2016 ch. 5)

The challenge of the first step is the sampling: Even with computational approaches, is it feasible to collect everything? The cultural analytics approach suggests such a goal, e.g. in order to obtain data and traces of subcultures. Due to practical limitations, Manovich’s works also break the large amount of available data into a smaller portion (see Hochman and Manovich 2013). This approach is called sampling. Rose (2016) introduces several sampling approaches, such as random, stratified, systematic, or cluster sampling. Döring and Bortz (2016) provide a deeper look into sampling strategies.
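The four sampling approaches named above can be sketched in a few lines of Python. This is a simplified illustration using only the standard library. The toy corpus, the `account` field, and all function names are hypothetical and not taken from Rose (2016) or Döring and Bortz (2016).

```python
import random
from collections import defaultdict


def simple_random_sample(items, n, seed=0):
    """Draw n items at random, each item having the same chance."""
    return random.Random(seed).sample(items, n)


def systematic_sample(items, step, start=0):
    """Take every step-th item, beginning at position start."""
    return items[start::step]


def group_by(items, key):
    """Group items into lists by the value returned by key(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return groups


def stratified_sample(items, key, n_per_stratum, seed=0):
    """Draw up to n_per_stratum random items from every stratum."""
    rng = random.Random(seed)
    strata = group_by(items, key)
    sample = []
    for name in sorted(strata):
        group = strata[name]
        sample.extend(rng.sample(group, min(n_per_stratum, len(group))))
    return sample


def cluster_sample(items, key, n_clusters, seed=0):
    """Pick n_clusters whole clusters at random and keep all their items."""
    rng = random.Random(seed)
    clusters = group_by(items, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]


# Hypothetical toy corpus: 100 posts spread evenly over 4 accounts.
corpus = [{"id": i, "account": f"account_{i % 4}"} for i in range(100)]


def by_account(post):
    return post["account"]


print(len(simple_random_sample(corpus, 10)))
print(len(systematic_sample(corpus, 10)))
print(len(stratified_sample(corpus, by_account, 5)))
print(len(cluster_sample(corpus, by_account, 2)))
```

The fixed `seed` makes a draw reproducible, which helps when a sample needs to be documented in a project report.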

The codes for the second step may be derived from a qualitative exploration of the data or from theories and related work. In the context of our projects we are going to use both approaches: We will annotate a subset of our data as ground truth while coding the total data using computational approaches. There is a large body of literature on code development, such as Grounded Theory (e.g. Corbin and Strauss 2008) and Ethnographic Content Analysis (Altheide 1987).
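One way to use such a ground-truth subset is to compare the human annotations with the computational codes for the same documents. The sketch below computes the raw agreement and Cohen's kappa, a chance-corrected agreement measure. Choosing kappa is an assumption made for illustration, since this page does not prescribe a specific measure, and the labels are invented toy data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both label lists must be non-empty and of equal length.")
    n = len(labels_a)
    # Share of items on which both coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, based on each coder's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one and the same single category
    return (observed - expected) / (1 - expected)


# Hypothetical codes for eight images: human annotation vs. computational coding.
human = ["selfie", "selfie", "text", "text", "group", "selfie", "text", "group"]
model = ["selfie", "text", "text", "text", "group", "selfie", "text", "selfie"]

agreement = sum(h == m for h, m in zip(human, model)) / len(human)
kappa = cohens_kappa(human, model)
print(f"raw agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

In this toy example the two codings agree on six of eight images, which gives a raw agreement of 0.75 and a kappa of about 0.61.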

For the final analysis we are going to apply statistical data analyses. For an initial understanding of our data we will start with some exploratory analyses, e.g. plotting the data. In combination with the two approaches below, platform affordances and platform vernaculars & grammars, we may discover patterns of social media use. In most cases, our projects will compare different groups: These groups might be different user types (e.g. politician accounts vs. party accounts), different post types (e.g. posts vs. stories), or different platforms (e.g. Instagram vs. TikTok).
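As a minimal sketch of such a group comparison, assume every document has been coded with a single categorical code and belongs to one group. The example builds a contingency table and computes the chi-square statistic by hand. The chi-square test is just one common option and is not prescribed by this page, and the counts are invented for illustration.

```python
from collections import Counter


def crosstab(records):
    """Build a contingency table from (group, code) pairs."""
    counts = Counter(records)
    groups = sorted({group for group, _ in records})
    codes = sorted({code for _, code in records})
    table = [[counts[(group, code)] for code in codes] for group in groups]
    return groups, codes, table


def chi_square(table):
    """Return the chi-square statistic and the degrees of freedom of a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if group and code were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical coded data: does a post or story show a person?
records = (
    [("post", "person")] * 30
    + [("post", "no_person")] * 20
    + [("story", "person")] * 10
    + [("story", "no_person")] * 40
)

groups, codes, table = crosstab(records)
statistic, dof = chi_square(table)
print(groups, codes, table)
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
```

The statistic still has to be compared against the chi-square distribution with the given degrees of freedom, for example with a statistics library, to obtain a p-value.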

Platform Affordances

Bossetta (2018) provides an overview of the concept of affordances and their application in social media analyses. He traces the term back to boyd and to Papacharissi & Yuan, who argued “that digital communication technologies provide structural affordances to agents” (Bossetta 2018, p. 473). There are two important takeaways from his work: 1) The concept of affordances is not used consistently, and 2) the platforms shape affordances and thereby how users interact with the platform. Bainotti, Caliandro, and Gandini (2020) used the “Instagram-specific digital objects” as codes for their analysis of stories, linking the concept of affordances in the context of Instagram to the use of stickers.

In the context of our seminar we might consider the following elements as platform affordances:

TikTok	IG – Posts	IG – Stories
Likes	Likes	Sliders
Comments	Comments	Votes
Shares	Views	Questions
Music	Mentions	Mentions
Hashtags	Hashtags	Hashtags
…	…	Locations
		…
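For a computational project it can help to write such a table down as data. The following sketch encodes the three columns above as Python sets and derives which elements are shared and which appear for one platform only. The table is explicitly open-ended (note the “…” rows), so the result only describes the listed elements, and the dictionary layout is an assumption made for illustration.

```python
# Elements copied from the table above; the lists are not exhaustive.
AFFORDANCES = {
    "TikTok": {"Likes", "Comments", "Shares", "Music", "Hashtags"},
    "IG Posts": {"Likes", "Comments", "Views", "Mentions", "Hashtags"},
    "IG Stories": {"Sliders", "Votes", "Questions", "Mentions", "Hashtags", "Locations"},
}

# Elements listed for every platform or format.
shared = set.intersection(*AFFORDANCES.values())

# Elements listed for exactly one platform or format.
unique = {}
for platform, elements in AFFORDANCES.items():
    others = set.union(*(e for p, e in AFFORDANCES.items() if p != platform))
    unique[platform] = elements - others

print("shared:", sorted(shared))
for platform, elements in unique.items():
    print(platform, "only:", sorted(elements))
```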

Question

Did you spot the difference between some of the listed affordances? Likes and comments, for instance, are reactions to posts. Would you consider these features as affordances? Let’s discuss this in class!

Platform Vernaculars & Grammars

Previous studies have looked into ‘grammars’ in Instagram stories. Originally linked to research on privacy (Agre 1994), grammars classify activities using specific types, making data collection and analysis easier. This uncovers patterns in user behavior, beneficial for purposes such as advertising. To the best of my knowledge, this concept was first used for social media data by Gerlitz and Rieder (2018) in a Twitter study.

Omena, Rabello, and Mintz (2020) discussed a “grammar of hashtags”, referring to the rules of hashtag use and how they’re organized on platforms. They suggest that hashtags, content visibility, and the nature of the content itself are essential in understanding hashtag use. Meanwhile, Bainotti, Caliandro, and Gandini (2020) used grammars to understand Instagram Stories, focusing on visual elements and their cultural meanings.
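A first computational step towards studying hashtag use is to extract hashtags from captions and count how often they occur, alone and together. The sketch below is a strong simplification and not the method of Omena, Rabello, and Mintz (2020). The captions are invented, and the regular expression is an assumption that treats any run of word characters after `#` as a hashtag.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a hashtag is "#" followed by one or more word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(caption):
    """Return the distinct, lower-cased hashtags of a caption in sorted order."""
    return sorted({tag.lower() for tag in HASHTAG_PATTERN.findall(caption)})


# Hypothetical captions standing in for collected posts.
captions = [
    "Campaign stop in Regensburg #election #bavaria",
    "Get out and vote! #Election #vote",
    "Behind the scenes #bavaria #election",
]

tag_counts = Counter()
pair_counts = Counter()
for caption in captions:
    tags = extract_hashtags(caption)
    tag_counts.update(tags)
    # Sorted tags make ("bavaria", "election") and its reverse count as one pair.
    pair_counts.update(combinations(tags, 2))

print(tag_counts.most_common())
print(pair_counts.most_common())
```

Counts like these only describe which hashtags appear. The visibility of the content and the content itself, which the authors also consider essential, would need additional data.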

Lastly, Gibbs et al. (2015) examined the unique styles and logics of social media, termed “platform vernaculars”. These are influenced both by platform features and user habits.

Summary

In this chapter we have positioned ourselves between several disciplines: computational social science, computational communication science, and digital humanities. In this position, we see social media data as trace data of human and social behaviour. The digitalness of our subject is, however, just one side of the coin: Following the theoretical frameworks of Digital Methods and Cultural Analytics, we want to conduct our analyses computationally with the aim of uncovering patterns and trends of user behaviour on social media platforms. Methodologically we can draw on quantitative content analysis and on the concept of platform affordances as features, and apply the concept of platform vernaculars and grammars to make sense of these features.

Additional Resources

Conferences

International Conference on Social Media & Society
IC²S² 2022
ICWSM
AoIR
WebSci
International Conference on CMC and Social Media Corpora for the Humanities

Journals

New Media & Society
Big Data & Society

Textbooks

Rose (2016): Visual Methodologies: An Introduction to Researching with Visual Materials.
Haim (2023): Computational Communication Science: Eine Einführung.
Quan-Haase and Sloan (2022b): The SAGE handbook of social media research methods.

Online Resources

Richard Rogers: Social Media Research with Digital Methods (YouTube)

Note

Do you know of any resources to be added to this list? Drop me a line: michael.achmann@ur.de.

References

Agre, Philip E. 1994. “Surveillance and capture: Two models of privacy.” The Information Society 10 (2): 101–27. https://doi.org/10.1080/01972243.1994.9960162.

Altheide, David L. 1987. “Reflections: Ethnographic content analysis.” Qualitative Sociology 10 (1): 65–77. https://doi.org/10.1007/BF00988269.

Anderson, Chris, Medea Giordano, Matt Jancer, Philip Ball, Will Knight, Sassafras Lowrey, and Laurence Scott. 2008. “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete.” Wired, June. https://www.wired.com/2008/06/pb-theory/.

Arnold, Taylor, and Lauren Tilton. 2019. “Distant viewing: analyzing large visual corpora.” Digital Scholarship in the Humanities 34 (Supplement_1): i3–16. https://doi.org/10.1093/llc/fqz013.

Atteveldt, Wouter van, and Tai-Quan Peng. 2018. “When Communication Meets Computation: Opportunities, Challenges, and Pitfalls in Computational Communication Science.” Communication Methods and Measures 12 (2-3): 81–92. https://doi.org/10.1080/19312458.2018.1458084.

Bainotti, Lucia, Alessandro Caliandro, and Alessandro Gandini. 2020. “From archive cultures to ephemeral content, and back: Studying Instagram Stories with digital methods.” New Media & Society, September, 1461444820960071. https://doi.org/10.1177/1461444820960071.

Bossetta, Michael. 2018. “The Digital Architectures of Social Media: Comparing Political Campaigning on Facebook, Twitter, Instagram, and Snapchat in the 2016 U.S. Election.” Journalism & Mass Communication Quarterly 95 (2): 471–96. https://doi.org/10.1177/1077699018763307.

Corbin, Juliet M, and Anselm L Strauss. 2008. Basics of qualitative research: techniques and procedures for developing grounded theory. Sage Publications, Inc.

Döring, Nicola, and Jürgen Bortz. 2016. Forschungsmethoden und Evaluation in den Sozial- und Humanwissenschaften. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41089-5.

Gerlitz, Carolin, and Bernhard Rieder. 2018. “Tweets are not created equal: Investigating Twitter’s client ecosystem.” International Journal of Communication, no. 12: 528–47. https://pure.uva.nl/ws/files/23266519/5974_30096_2_PB.pdf.

Gibbs, Martin, James Meese, Michael Arnold, Bjorn Nansen, and Marcus Carter. 2015. “#Funeral and Instagram: death, social media, and platform vernacular.” Information, Communication and Society 18 (3): 255–68. https://doi.org/10.1080/1369118X.2014.987152.

Haim, Mario. 2023. Computational Communication Science: Eine Einführung. Springer Fachmedien Wiesbaden.

Hochman, Nadav, and Lev Manovich. 2013. “Zooming into an Instagram City: Reading the local through social media.” First Monday, June. https://doi.org/10.5210/fm.v18i7.4711.

Jockers, Matthew L. 2013. Macroanalysis: Digital Methods and Literary History. University of Illinois Press.

Kanthawala, Shaheen, Kelley Cotter, Kali Foyle, and J R Decook. 2022. In Proceedings of the 55th Hawaii International Conference on System Sciences. https://doi.org/10.24251/hicss.2022.000.

Lazer, David, Alex Pentland, Lada Adamic, Sinan Aral, Albert-Laszlo Barabasi, Devon Brewer, Nicholas Christakis, et al. 2009. “Social science. Computational social science.” Science 323 (5915): 721–23. https://doi.org/10.1126/science.1167742.

Leaver, Tama, Tim Highfield, and Crystal Abidin. 2020. Instagram: Visual Social Media Cultures. John Wiley & Sons.

Manovich, Lev. 2020. Cultural Analytics. MIT Press.

McCrow-Young, Ally. 2021. “Approaching Instagram data: reflections on accessing, archiving and anonymising visual social media.” Communication Research and Practice 7 (1): 21–34. https://doi.org/10.1080/22041451.2020.1847820.

Moretti, Franco. 2000. “Conjectures on World Literature.” New Left Review II (1): 54–68. https://newleftreview.org/issues/ii1/articles/franco-moretti-conjectures-on-world-literature.

Omena, Janna Joceli, Elaine Teixeira Rabello, and André Goes Mintz. 2020. “Digital Methods for Hashtag Engagement Research.” Social Media + Society 6 (3): 2056305120940697. https://doi.org/10.1177/2056305120940697.

Quan-Haase, Anabel, and Luke Sloan. 2022a. “Chapter 1: Introduction.” In The SAGE handbook of social media research methods, edited by Anabel Quan-Haase and Luke Sloan, 2nd ed., 1–9. London, England: SAGE Publications. https://doi.org/10.4135/9781529782943.

———. 2022b. The SAGE handbook of social media research methods. Edited by Anabel Quan-Haase and Luke Sloan. 2nd ed. London, England: SAGE Publications. https://doi.org/10.4135/9781529782943.

Rat für Sozial- und Wirtschaftsdaten (RatSWD). 2019. “Big Data in den Sozial-, Verhaltens- und Wirtschaftswissenschaften: Datenzugang und Forschungsdatenmanagement - Mit Gutachten "Web Scraping in der unabhängigen wissenschaftlichen Forschung".” RatSWD Output. German Data Forum (RatSWD). https://doi.org/10.17620/02671.39.

Rejeb, Abderahman, Karim Rejeb, Alireza Abdollahi, and Horst Treiblmaier. 2022. “The big picture on Instagram research: Insights from a bibliometric analysis.” Telematics and Informatics 73 (September): 101876. https://doi.org/10.1016/j.tele.2022.101876.

Rogers, Richard. 2013. Digital Methods. MIT Press.

Rose, Gillian. 2016. Visual Methodologies: An Introduction to Researching with Visual Materials. SAGE Publications.

Venturini, Tommaso, and Richard Rogers. 2019. “‘API-Based Research’ or How can Digital Sociology and Journalism Studies Learn from the Facebook and Cambridge Analytica Data Breach.” Digital Journalism 7 (4): 532–40. https://doi.org/10.1080/21670811.2019.1591927.

Zimmer, Michael. 2010. “"But the Data is Already Public": On the Ethics of Research in Facebook.” Ethics and Information Technology 12 (4): 313–25. https://doi.org/10.1007/s10676-010-9227-5.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{achmann-denkler2023,
  author = {Achmann-Denkler, Michael},
  title = {Interdisciplinary {Approaches} to {Social} {Media}
    {Analysis}},
  date = {2023-10-23},
  url = {https://social-media-lab.net/getting-started/theory.html},
  doi = {10.5281/zenodo.10039756},
  langid = {en}
}

For attribution, please cite this work as:

Achmann-Denkler, Michael. 2023. “Interdisciplinary Approaches to Social Media Analysis.” October 23, 2023. https://doi.org/10.5281/zenodo.10039756.