- Juarez, Maddie;
- Abdukhamidov, Eldor;
- Sandoval, Manuel;
- Nazari, Mujtaba;
- Hall, Deborah;
- ... Abuhmed, Tamer;
- and 3 others
Abstract
Cyberbullying on social networking sites has become more prevalent. Most cyberbullying detection models do not account for adversarial threats, leaving them vulnerable. This study evaluates the resilience of text-based cyberbullying detection models, constrained by limited available datasets, against word-level substitutions and character-level perturbations. We consider well-established ML techniques with real-world data and more recent LLM-based approaches to uncover model weaknesses. The results reveal that adversarial attacks can significantly reduce detection accuracy; for example, most models are vulnerable to word- and character-level attacks, with success rates of up to 88% and 44%, respectively. We also find that LLM-based models such as CyberBERT are more resistant to both types of attack while maintaining strong detection performance. We show that model architecture and text vectorization choices significantly impact attack resistance and that adversarial training can help improve robustness, with tailored combinations of models and vectorizers showing the best results. These findings can guide the development of safer online platforms, as tailored strategies can make cyberbullying detection models more resilient and effective.
Keywords
- Title
- Analyzing Adversarial Strategies and Countermeasures for Cyberbullying Detection
- Authors
- Juarez, Maddie; Abdukhamidov, Eldor; Sandoval, Manuel; Nazari, Mujtaba; Hall, Deborah; Thiruvathukal, George K.; Abuhmed, Tamer; Silva, Yasin N.; Abuhamad, Mohammed
- Publication date
- 2026
- Type
- Proceedings Paper
- Volume
- 16127 LNCS
- Pages
- 86-95