INFO 230: Mini-Project 3

Project Materials

Hi Professor Tim! The full analysis, code, and write-up live in the notebook. The README has setup instructions and repo structure if helpful.

📓 View Notebook: Full analysis, code, and write-up

📋 View README: Setup instructions and repo structure

Summary

Memes are not trivial internet content. They assign blame, invert the tone of political speech, and circulate ideological positions in a format designed for rapid sharing. This project treats them as genuine cultural artifacts and applies an eight-step multimodal pipeline across three datasets to find out what they are actually doing.

The approach combines text analysis of 5,552 meme captions, image analysis of 16 sample PNGs, text analysis of 1,081 campaign speech transcripts, and acoustic feature extraction from 17 presidential debate audio clips. It then asks whether the patterns in each modality point to the same underlying story.

Key Findings

01. Trump dominates the meme data. He is the top villain in both the political and the COVID memes, and he also appears as hero and victim, depending on who made the meme.

02. Meme culture inverts political speech. Every politician in the speech data speaks more positively than the memes about them suggest. Memes strip out the optimism and replace it with blame.

03. Blame vocabulary is sourced from real speeches. Trump's framing of China in his 2020 speeches maps directly onto the meme villain data, where China is the second most common villain tag.

04. Acoustic energy tracks villain framing. Trump's higher zero crossing rate (how often the audio waveform changes sign, used here as a rough proxy for speech energy) aligns with his dramatically higher villain count in memes. This is an association, not evidence that one causes the other.
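For readers unfamiliar with the zero crossing rate in finding 04, here is a minimal sketch of how it can be computed. It is not the notebook's code: the function names `zero_crossing_rate` and `sine_wave`, the 8 kHz sample rate, and the synthetic test tones are all assumptions made for illustration. The sketch only shows the definition, which is the fraction of adjacent samples whose signs differ.

```python
import math


def zero_crossing_rate(samples):
    """Return the fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1
        for a, b in zip(samples, samples[1:])
        if (a >= 0) != (b >= 0)  # sign changed between neighbours
    )
    return crossings / (len(samples) - 1)


def sine_wave(freq_hz, sample_rate=8000, duration_s=0.1):
    """Generate a synthetic tone (illustrative stand-in for real audio)."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


# A higher-pitched tone crosses zero more often than a lower-pitched one.
low = zero_crossing_rate(sine_wave(100))
high = zero_crossing_rate(sine_wave(1000))
print(f"100 Hz ZCR: {low:.3f}, 1000 Hz ZCR: {high:.3f}")
```

On real debate audio the same function would be applied per frame to the decoded samples of each speaker's utterances and then averaged. Because the rate responds to high-frequency and noisy content as well as to loudness, it is best read as a rough indicator rather than a direct energy measurement.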

Datasets

Memes Images: OCR Data

5,552 political and COVID memes with OCR-extracted text and crowd-annotated entity tags (hero, villain, victim) plus 16 sample PNG images. Kaggle, yogesh239.
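The villain rankings in the findings come from tallying these entity tags. A minimal sketch of that kind of tally is shown below. The row layout, the field names (`topic`, `hero`, `villain`, `victim`), and the placeholder entities are hypothetical; the real dataset's columns and tag format may differ, and the toy rows are not taken from the data.

```python
from collections import Counter

# Toy rows with placeholder entities (hypothetical schema, not real data).
memes = [
    {"topic": "politics", "villain": ["Person A"], "hero": [], "victim": ["Group C"]},
    {"topic": "covid", "villain": ["Person A", "Country B"], "hero": ["Person D"], "victim": []},
    {"topic": "politics", "villain": ["Country B"], "hero": ["Person A"], "victim": []},
    {"topic": "covid", "villain": ["Person A"], "hero": [], "victim": ["Person A"]},
]


def top_entities(rows, role, n=3):
    """Count how often each entity is tagged with the given role."""
    counts = Counter(entity for row in rows for entity in row.get(role, []))
    return counts.most_common(n)


def top_entities_by_topic(rows, role, topic, n=3):
    """Same tally, restricted to memes of one topic."""
    return top_entities([r for r in rows if r["topic"] == topic], role, n)


print(top_entities(memes, "villain"))
print(top_entities_by_topic(memes, "villain", "covid"))
```

Running the tally separately per topic is what allows a claim such as "top villain in both political and COVID memes" to be checked, and running it per role shows when the same entity appears as hero, villain, and victim.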

2020 US Presidential Election Campaign Speeches

1,081 cleaned campaign speeches from Trump, Biden, Pence, and Harris spanning January 2019 through January 2021. Chalkiadakis et al., Scientific Data, 2025.

M-Arg Multimodal Argumentation Dataset

17 MP3 audio clips from the 5 US 2020 presidential debates with force-aligned utterance timestamps and speaker labels. Mestre et al., ACL, 2021.

Beyond the Joke: A Multimodal Analysis of Blame and Framing in Political Memes