Nath, Tanusree, Gupta, Vedika (ORCID: https://orcid.org/0000-0002-8109-498X) and Gupta, Manjari (2026) Do Memes Speak Hate? A Residual-Adapter Approach for Bengali Memes. In: Intelligent Human Computer Interaction: 17th International Conference, IHCI 2025, Jaipur, India, November 14–16, 2025, Revised Selected Papers, Part I Conference proceedings. Lecture Notes in Computer Science. Springer Science and Business Media, Berlin, pp. 225-237. ISBN 9783032263483
Abstract
The widespread presence of social media in everyday life has led to a surge in online interactions. While it offers numerous benefits, social media is frequently misused as a platform to propagate hate and negativity, which can have detrimental effects on users’ mental well-being. Although several tools exist to detect harmful content, low-resource languages like Bengali remain significantly underserved by such efforts. Given that contemporary social media content often spans multiple modalities like text and images, a multimodal approach offers a more robust solution for automatic content moderation. This study presents a multimodal hateful meme detection model tailored for Bengali, which simultaneously processes textual and visual information. The Res-AFFNet (Residual-Adapter Feature-level Fusion Network) architecture employs MuRIL (Multilingual Representations for Indian Languages) and ViT (Vision Transformer) to process text and images, respectively. Each modality’s embeddings are passed through lightweight adapter units, and a residual fusion (80% original + 20% adapted) is applied. The adapter outputs are fused and passed through a linear projection and classifier to predict whether the input is ‘hate’ or ‘not-hate’. The multimodal approach outperforms unimodal approaches in terms of accuracy. Trained and evaluated on the MUTE dataset, the model achieves an improvement of 1.27% over the existing baseline. Additionally, ablation studies highlight the effectiveness of the multimodal framework compared to unimodal and reduced-component variants. Multimodal Bengali hate speech detection can be used for automatic content moderation on social networking websites and can help support low-resource communities. As future work, the framework could be extended to other low-resource languages. Fine-grained classification of hate speech could also be incorporated by expanding the dataset annotations.
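The residual-adapter fusion described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the bottleneck adapter shape, the ReLU activations, fusion by concatenation, the sigmoid output, all layer sizes, and every function name below are assumptions made for illustration, and random weights and constant vectors stand in for trained parameters and for MuRIL and ViT embeddings. Only the 80%/20% residual blend and the order adapter, fusion, linear projection, classifier come from the abstract.

```python
import math
import random


def init_linear(rng, n_in, n_out, scale=0.1):
    """Random dense layer as (weights, bias); weights has one row per output unit."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, weights, bias):
    """Apply a dense layer to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def adapter(x, down, up):
    """Assumed bottleneck adapter: project down, apply ReLU, project back up."""
    return linear(relu(linear(x, *down)), *up)


def residual_mix(original, adapted, keep=0.8):
    """Residual blend from the abstract: keep * original + (1 - keep) * adapted."""
    return [keep * o + (1.0 - keep) * a for o, a in zip(original, adapted)]


def build_params(rng, text_dim, image_dim, bottleneck=2, hidden=4):
    """Hypothetical parameter layout; all sizes are illustrative."""
    return {
        "text_adapter": (init_linear(rng, text_dim, bottleneck),
                         init_linear(rng, bottleneck, text_dim)),
        "image_adapter": (init_linear(rng, image_dim, bottleneck),
                          init_linear(rng, bottleneck, image_dim)),
        "projection": init_linear(rng, text_dim + image_dim, hidden),
        "classifier": init_linear(rng, hidden, 1),
    }


def predict(text_emb, image_emb, params):
    """Return an assumed probability of the 'hate' class for one meme."""
    text_out = residual_mix(text_emb, adapter(text_emb, *params["text_adapter"]))
    image_out = residual_mix(image_emb, adapter(image_emb, *params["image_adapter"]))
    fused = text_out + image_out  # assumed fusion: concatenate the two vectors
    hidden = relu(linear(fused, *params["projection"]))
    logit = linear(hidden, *params["classifier"])[0]
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    rng = random.Random(0)
    params = build_params(rng, text_dim=8, image_dim=8)
    # Placeholder vectors standing in for MuRIL (text) and ViT (image) embeddings.
    print(predict([0.1] * 8, [0.2] * 8, params))
```

With `keep=0.8`, each modality retains most of its pretrained representation while the adapter contributes a small learned correction, which is one plausible reading of the "80% original + 20% adapted" blend.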
| Item Type: | Book Section |
|---|---|
| Uncontrolled Keywords: | Bengali Hateful Memes; Low-Resource Language NLP; Multimodal Hate Speech Detection |
| Subjects: | Physical, Life and Health Sciences > Computer Science |
| Depositing User: | Mr. Syed Anas Ali |
| Date Deposited: | 01 Jul 2026 04:47 |
| Last Modified: | 01 Jul 2026 04:47 |
| Official URL: | https://doi.org/10.1007/978-3-032-26349-0_19 |
| URI: | https://pure.jgu.edu.in/id/eprint/11880 |