TY - GEN
T1 - MemeGraphs: Linking Memes to Knowledge Graphs
T2 - 17th International Conference on Document Analysis and Recognition, ICDAR 2023
AU - Kougia, Vasiliki
AU - Fetzel, Simon
AU - Kirchmair, Thomas
AU - Çano, Erion
AU - Baharlou, Sina Moayed
AU - Sharifzadeh, Sahand
AU - Roth, Benjamin
N1 - Funding Information:
Acknowledgements. This research was funded by the Deutsche Forschungsgemein-schaft (DFG, German Research Foundation) - RO 5127/2-1 and the Vienna Science and Technology Fund (WWTF)[10.47379/VRG19008]. We thank Christos Bintsis for participating in the manual augmentation. We also thank Matthias Aßenmacher and the anonymous reviewers for their valuable feedback.
Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Memes are a popular form of communicating trends and ideas in social media and on the internet in general, combining the modalities of images and text. They can express humor and sarcasm but can also have offensive content. Analyzing and classifying memes automatically is challenging since their interpretation relies on the understanding of visual elements, language, and background knowledge. Thus, it is important to meaningfully represent these sources and the interaction between them in order to classify a meme as a whole. In this work, we propose to use scene graphs, which express images in terms of objects and their visual relations, and knowledge graphs as structured representations for meme classification with a Transformer-based architecture. We compare our approach with ImgBERT, a multimodal model that uses only learned (instead of structured) representations of the meme, and observe consistent improvements. We further provide a dataset with human graph annotations that we compare to automatically generated graphs and entity linking. Analysis shows that automatic methods link more entities than human annotators and that automatically generated graphs are better suited for hatefulness classification in memes.
AB - Memes are a popular form of communicating trends and ideas in social media and on the internet in general, combining the modalities of images and text. They can express humor and sarcasm but can also have offensive content. Analyzing and classifying memes automatically is challenging since their interpretation relies on the understanding of visual elements, language, and background knowledge. Thus, it is important to meaningfully represent these sources and the interaction between them in order to classify a meme as a whole. In this work, we propose to use scene graphs, which express images in terms of objects and their visual relations, and knowledge graphs as structured representations for meme classification with a Transformer-based architecture. We compare our approach with ImgBERT, a multimodal model that uses only learned (instead of structured) representations of the meme, and observe consistent improvements. We further provide a dataset with human graph annotations that we compare to automatically generated graphs and entity linking. Analysis shows that automatic methods link more entities than human annotators and that automatically generated graphs are better suited for hatefulness classification in memes.
KW - hate speech
KW - internet memes
KW - knowledge graphs
KW - multimodal representations
UR - http://www.scopus.com/inward/record.url?scp=85172248687&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-41676-7_31
DO - 10.1007/978-3-031-41676-7_31
M3 - Contribution to proceedings
AN - SCOPUS:85172248687
SN - 9783031416750
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 534
EP - 551
BT - Document Analysis and Recognition – ICDAR 2023
A2 - Fink, Gernot A.
A2 - Jain, Rajiv
A2 - Kise, Koichi
A2 - Zanibbi, Richard
PB - Springer Science and Business Media Deutschland GmbH
CY - Cham
Y2 - 21 August 2023 through 26 August 2023
ER -