Concept-Grounded Detection of Vaccine Misinformation in Multimodal Content Using Interpretable Vision-Language Models

Thapa, Laxmi ORCID: https://orcid.org/0009-0003-9563-8604, Jain, Aryaman, Koduru, Lakshmojee, Adhikari, Surabhi, Rashid, Junaid, Kim, Jungeun, Thapa, Surendrabikram and Naseem, Usman (2026) Concept-Grounded Detection of Vaccine Misinformation in Multimodal Content Using Interpretable Vision-Language Models. In: 35th ACM Web Conference, WWW Companion 2026, 29 June 2026 - 3 July 2026, Dubai.

Concept-Grounded Detection of Vaccine Misinformation.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

Vaccine misinformation poses a persistent public health challenge, particularly in visual formats such as memes and infographics that combine text, imagery, and rhetorical cues. While textual misinformation has been widely studied, image-based vaccine misinformation remains comparatively underexplored due to the difficulty of interpreting multimodal signals at scale. In this work, we evaluate how effectively multimodal Large Vision-Language Models (LVLMs) can (i) directly classify vaccination stance from images and (ii) extract interpretable concept-level representations that support more reliable and transparent prediction. Using the VaxMeme dataset of 10,244 annotated images, we compare direct zero-shot LVLM inference against a hybrid framework in which classical machine learning models are trained on LVLM-extracted binary concept features. Our results show that grounding stance prediction in structured concept representations consistently outperforms direct LVLM classification, yielding accuracy improvements of approximately 10–17% while enabling explicit inspection of the visual and rhetorical cues driving model decisions. These findings highlight the value of concept-grounded, neuro-symbolic approaches for interpretable multimodal misinformation detection.
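
The hybrid framework described in the abstract (a classical model trained on binary concept features) can be illustrated with a minimal sketch. This is not the authors' implementation: the concept names, the toy rows, and the choice of Bernoulli naive Bayes are all assumptions made for illustration, and the LVLM extraction step is replaced by hard-coded 0/1 vectors. The record does not list the paper's actual concept set or classifiers.

```python
import math
from collections import defaultdict

# Hypothetical concept vocabulary (an assumption; not the paper's concept set).
# In the hybrid setup, an LVLM would answer one yes/no question per concept.
CONCEPTS = ["shows_syringe", "fear_imagery", "cites_health_authority", "sarcastic_tone"]

# Synthetic toy rows, NOT drawn from VaxMeme: (binary concept vector, stance label).
TRAIN = [
    ([1, 1, 0, 1], "anti"),
    ([1, 1, 0, 0], "anti"),
    ([0, 1, 0, 1], "anti"),
    ([1, 0, 1, 0], "pro"),
    ([0, 0, 1, 0], "pro"),
    ([1, 0, 1, 1], "pro"),
]


def fit_bernoulli_nb(rows, alpha=1.0):
    """Fit Bernoulli naive Bayes with Laplace smoothing on binary features."""
    counts = defaultdict(int)
    ones = defaultdict(lambda: [0] * len(CONCEPTS))
    for x, y in rows:
        counts[y] += 1
        for j, v in enumerate(x):
            ones[y][j] += v
    total = sum(counts.values())
    model = {}
    for y, n in counts.items():
        log_prior = math.log(n / total)
        # P(concept present | class), smoothed so no probability is 0 or 1.
        probs = [(ones[y][j] + alpha) / (n + 2 * alpha) for j in range(len(CONCEPTS))]
        model[y] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the most likely stance and the per-class log scores."""
    scores = {}
    for y, (log_prior, probs) in model.items():
        s = log_prior
        for v, p in zip(x, probs):
            s += math.log(p if v else 1.0 - p)
        scores[y] = s
    return max(scores, key=scores.get), scores


def concept_evidence(model, a, b):
    """Log-ratio of P(concept present | a) to P(concept present | b), per concept."""
    pa, pb = model[a][1], model[b][1]
    return {c: math.log(pa[j] / pb[j]) for j, c in enumerate(CONCEPTS)}


model = fit_bernoulli_nb(TRAIN)
label, _ = predict(model, [0, 1, 0, 1])
evidence = concept_evidence(model, "anti", "pro")
```

Because every input feature is a named concept, the per-concept log-ratios in `evidence` can be read directly as the cues pushing a prediction toward one stance, which is the kind of inspection a direct zero-shot LVLM label does not offer.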

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Interpretable AI | Multimodal misinformation detection | Vaccine misinformation | Vision-language models | Visual memes
Subjects: Physical, Life and Health Sciences > Computer Science; Physical, Life and Health Sciences > Engineering and Technology
Depositing User: Mr. Syed Anas Ali
Date Deposited: 30 Jun 2026 09:37
Last Modified: 30 Jun 2026 09:37
Official URL: https://doi.org/10.1145/3774905.3795453
URI: https://pure.jgu.edu.in/id/eprint/11866
