Hasija, Udit, Gupta, Vedika (ORCID: https://orcid.org/0000-0002-8109-498X), Nath, Tanusree and Vashishtha, Srishti (2026) Finding the Signal: A Deep Dive into Data Augmentation and Transformer Performance for Hope Speech Detection. In: Advanced Network Technologies and Intelligent Computing: 5th International Conference, ANTIC 2025, Gwalior, India, December 21–23, 2025, Proceedings, Part II. Communications in Computer and Information Science. Springer, Cham, pp. 237-250. ISBN 9783032271174
Abstract
Identifying hope speech, text that conveys positivity and empathy online, is vital to building healthier digital communities. This study focuses on the English subset of the HopeEDI dataset, where hope speech accounts for only 8.6% of the data, posing a strong class imbalance challenge. To mitigate this, we apply a synonym replacement–based data augmentation technique and evaluate its effect across ten models, including five classical machine learning classifiers and five transformer-based architectures. Our results show that transformers, especially with data augmentation, substantially outperform traditional classifiers, achieving notable gains in recall and F1-score. These findings highlight that lightweight lexical augmentation can effectively enhance minority class recognition in limited datasets. The study contributes practical insights for developing inclusive, positive-content detection systems for online platforms.
| Item Type: | Book Section |
|---|---|
| Uncontrolled Keywords: | Hope Speech; Binary Classification; Class Imbalance; Data Augmentation; Transformer Models; Machine Learning; Social Media |
| Subjects: | Social Sciences and humanities > Business, Management and Accounting > Management of Technology and Innovation; Physical, Life and Health Sciences > Computer Science |
| Depositing User: | Mr. Syed Anas Ali |
| Date Deposited: | 02 Jul 2026 05:10 |
| Last Modified: | 02 Jul 2026 05:10 |
| Official URL: | https://doi.org/10.1007/978-3-032-27117-4_13 |
| URI: | https://pure.jgu.edu.in/id/eprint/11901 |