For this notebook we use a 4CAT corpus collected from TikTok about the 2024 Farmers' Protest in Germany. Let's take a look at all relevant columns. We're mostly dealing with the image_file column. Additionally, the image files should be extracted to the /content/media/images/ path. (See the linked notebook for the conversion from the original 4CAT files.)
Let's first install bertopic including the vision extensions.
Note
The following code has been taken from the BERTopic documentation and was only slightly changed.
In [4]:
!pip install bertopic[vision]
Images Only
Next, we prepare the pipeline for an image-only model: we want to fit the topic model on the image content only. We follow the BERTopic multimodal documentation and generate image captions using the nlpconnect/vit-gpt2-image-captioning model. The documentation offers several other options: we can incorporate textual content into the topic modeling, or fit the model on textual information only and then look for the best-matching images for each cluster and display them.
In our example we focus on image-only topic models.
In [6]:
from bertopic.representation import KeyBERTInspired, VisualRepresentation
from bertopic.backend import MultiModalBackend

# Image embedding model
embedding_model = MultiModalBackend('clip-ViT-B-32', batch_size=32)

# Image to text representation model
representation_model = {
    "Visual_Aspect": VisualRepresentation(
        image_to_text_model="nlpconnect/vit-gpt2-image-captioning"
    )
}
Next, select the column with the paths of your image files, in my example image_file, and convert it to a Python list.
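A minimal sketch of that conversion with pandas, using a small stand-in DataFrame for illustration (in the notebook you would load your 4CAT CSV export instead, e.g. with pd.read_csv; the file names below are assumptions):

```python
import pandas as pd

# Stand-in for the loaded 4CAT export; replace with your own DataFrame.
df = pd.DataFrame({"image_file": ["img_001.jpg", "img_002.jpg", None]})

# Drop rows without an image and prefix each file name with the
# extraction folder used above, then convert to a plain Python list.
images = ("/content/media/images/" + df["image_file"].dropna()).tolist()
print(images)
```

Dropping missing values first matters: posts without an image would otherwise produce invalid paths and break the embedding step later.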