Analyzing Adversarial Strategies and Countermeasures for Cyberbullying Detection
  • Juarez, Maddie
  • Abdukhamidov, Eldor
  • Sandoval, Manuel
  • Nazari, Mujtaba
  • Hall, Deborah
  • ... Abuhmed, Tamer
  • and 3 others

Abstract

Cyberbullying on social networking sites has become more prevalent. Most cyberbullying detection models do not account for adversarial threats, leaving them vulnerable. This study evaluates the resilience of text-based cyberbullying detection models, constrained by limited available datasets, against word-level substitutions and character-level perturbations. We consider well-established ML techniques with real-world data and more recent LLM-based approaches to uncover model weaknesses. The results reveal that adversarial attacks can significantly reduce detection accuracy, e.g., most models are vulnerable to word- and character-level attacks with success rates up to 88% and 44%, respectively. We also find that LLM-based models such as CyberBERT are more resistant to both types of attack while maintaining strong detection performance. We show that model architecture and text vectorization choices significantly impact attack resistance and that adversarial training can help improve robustness, with tailored combinations of models and vectorizers showing the best results. These findings can guide the development of safer online platforms, as tailored strategies can make cyberbullying detection models more resilient and effective.

Keywords

Adversarial attack; Countermeasures; Cyberbullying
Title
Analyzing Adversarial Strategies and Countermeasures for Cyberbullying Detection
Authors
Juarez, Maddie; Abdukhamidov, Eldor; Sandoval, Manuel; Nazari, Mujtaba; Hall, Deborah; Thiruvathukal, George K.; Abuhmed, Tamer; Silva, Yasin N.; Abuhamad, Mohammed
DOI
10.1007/978-3-032-07715-8_9
Publication date
2026
Type
Proceedings Paper
Journal
Lecture Notes in Computer Science
16127 LNCS
Pages
86-95