Concept-Grounded Detection of Vaccine Misinformation in Multimodal Content Using Interpretable Vision-Language Models

Thapa, Laxmi ORCID: https://orcid.org/0009-0003-9563-8604, Jain, Aryaman, Koduru, Lakshmojee, Adhikari, Surabhi, Rashid, Junaid, Kim, Jungeun, Thapa, Surendrabikram and Naseem, Usman (2026) Concept-Grounded Detection of Vaccine Misinformation in Multimodal Content Using Interpretable Vision-Language Models. In: 35th ACM Web Conference, WWW Companion 2026, 29 June 2026 - 3 July 2026, Dubai. Available at: https://doi.org/10.1145/3774905.3795453

Concept-Grounded Detection of Vaccine Misinformation.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Vaccine misinformation poses a persistent public health challenge, particularly in visual formats such as memes and infographics that combine text, imagery, and rhetorical cues. While textual misinformation has been widely studied, image-based vaccine misinformation remains comparatively underexplored due to the difficulty of interpreting multimodal signals at scale. In this work, we evaluate how effectively multimodal Large Vision-Language Models (LVLMs) can (i) directly classify vaccination stance from images and (ii) extract interpretable concept-level representations that support more reliable and transparent prediction. Using the VaxMeme dataset of 10,244 annotated images, we compare direct zero-shot LVLM inference against a hybrid framework in which classical machine learning models are trained on LVLM-extracted binary concept features. Our results show that grounding stance prediction in structured concept representations consistently outperforms direct LVLM classification, yielding accuracy improvements of approximately 10–17% while enabling explicit inspection of the visual and rhetorical cues driving model decisions. These findings highlight the value of concept-grounded, neuro-symbolic approaches for interpretable multimodal misinformation detection.
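
The hybrid framework described in the abstract can be illustrated with a minimal sketch: each image is reduced to a binary vector of concept indicators, and a classical classifier is trained on those vectors so that every prediction can be traced back to individual concepts. Everything specific below is an assumption made for illustration and is not taken from the paper: the concept names, the toy rows standing in for LVLM outputs, the two stance labels, and the choice of Bernoulli naive Bayes as the classical model (the abstract says only "classical machine learning models").

```python
import math

# Hypothetical concept vocabulary; the paper's actual concept set is not given here.
CONCEPTS = ["shows_syringe", "fear_appeal", "cites_health_agency", "conspiracy_claim"]

# Invented toy rows standing in for LVLM answers (1 = concept present).
# They are NOT drawn from VaxMeme; labels "anti"/"pro" are also assumed.
TRAIN = [
    ([1, 1, 0, 1], "anti"),
    ([0, 1, 0, 1], "anti"),
    ([1, 1, 0, 0], "anti"),
    ([1, 0, 1, 0], "pro"),
    ([0, 0, 1, 0], "pro"),
    ([1, 0, 1, 0], "pro"),
]


def fit_bernoulli_nb(rows, alpha=1.0):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing."""
    labels = sorted({label for _, label in rows})
    model = {}
    for label in labels:
        vectors = [x for x, y in rows if y == label]
        n = len(vectors)
        # Smoothed P(concept present | label); never exactly 0 or 1.
        probs = [
            (sum(x[j] for x in vectors) + alpha) / (n + 2 * alpha)
            for j in range(len(CONCEPTS))
        ]
        model[label] = {"log_prior": math.log(n / len(rows)), "probs": probs}
    return model


def log_scores(model, x):
    """Return the unnormalised log-posterior of each label for vector x."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for present, p in zip(x, params["probs"]):
            score += math.log(p if present else 1.0 - p)
        scores[label] = score
    return scores


def predict(model, x):
    """Pick the label with the highest log-posterior."""
    scores = log_scores(model, x)
    return max(scores, key=scores.get)


def concept_evidence(model, x, label_a="anti", label_b="pro"):
    """Per-concept log-likelihood ratio; positive values favour label_a."""
    evidence = {}
    pairs = zip(CONCEPTS, x, model[label_a]["probs"], model[label_b]["probs"])
    for name, present, pa, pb in pairs:
        like_a = pa if present else 1.0 - pa
        like_b = pb if present else 1.0 - pb
        evidence[name] = math.log(like_a / like_b)
    return evidence


model = fit_bernoulli_nb(TRAIN)
example = [1, 1, 0, 1]  # concept vector for one hypothetical image
prediction = predict(model, example)
evidence = concept_evidence(model, example)

print(prediction)
for name, value in evidence.items():
    print(f"{name}: {value:+.2f}")
```

Because the classifier sees only named binary concepts, the per-concept log-likelihood ratios give a direct reading of which cues pushed a prediction in which direction, which is the kind of inspection the abstract attributes to concept grounding. In this toy setup a concept that is equally common in both classes contributes no evidence either way.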

Item Type:	Conference or Workshop Item (Paper)
Uncontrolled Keywords:	Interpretable AI | Multimodal misinformation detection | Vaccine misinformation | Vision-language models | Visual memes
Subjects:	Physical, Life and Health Sciences > Computer Science; Physical, Life and Health Sciences > Engineering and Technology
Depositing User:	Mr. Syed Anas Ali
Date Deposited:	30 Jun 2026 09:37
Last Modified:	30 Jun 2026 09:37
Official URL:	https://doi.org/10.1145/3774905.3795453
URI:	https://pure.jgu.edu.in/id/eprint/11866
