Detecting Bias in Large Language Models: Fine-Tuned KcBERT
  • Lee, Jun Koo
  • Chung, Tai-Myoung
Citations: Web of Science 0; Scopus 1
Abstract

The rapid advancement of large language models (LLMs) has brought their natural language processing capabilities to a level comparable to human performance, and these models are now used extensively in societal domains such as education and healthcare. Despite this versatility, LLMs can produce subjective and normative language that may lead to discriminatory outcomes among social groups, particularly through the spread of offensive language online. In this paper, we define such phenomena as societal bias and assess ethnic, gender, and racial biases in a model obtained by fine-tuning Korean comments Bidirectional Encoder Representations from Transformers (KcBERT) on Korean Language Open Data (KOLD), using template-based Masked Language Modeling (MLM). To evaluate these biases quantitatively, we use the Language Pattern Bias Score (LPBS) and Contextual Bias Score (CBS) metrics. Our results show that, compared to KcBERT, the fine-tuned model exhibits a reduction in ethnic bias but significant alterations in gender and racial biases. To mitigate these societal biases, we propose two methods. The first is a data balancing approach for the pre-training phase that makes the data more uniform by aligning the distribution of specific word occurrences and converting surrounding harmful words into non-harmful alternatives. The second, applied during the in-training phase, is Debiasing Regularization, which adjusts dropout and regularization parameters and results in decreased training loss. Our work highlights the existence of societal biases in Korean language models and demonstrates the importance of language-dependent characteristics in these models.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
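
The abstract does not state how the template-based scores are computed. As a rough illustration only, the sketch below assumes a log-probability comparison of the kind often used with masked-language-model templates: the probability a model assigns to a group term in an attribute template is normalized by the probability it assigns when the attribute is also masked, and two groups are then compared. This may differ from the paper's LPBS and CBS definitions. The function names and all probability values are hypothetical placeholders, not results from the paper.

```python
import math


def normalized_log_prob(p_target: float, p_prior: float) -> float:
    """Log-ratio of the template fill probability to a prior probability.

    p_target: probability of the group term in the attribute template
              (e.g. "[MASK] is a criminal").
    p_prior:  probability of the same term when the attribute is masked too
              (e.g. "[MASK] is a [MASK]").
    Both inputs are assumed to come from a masked language model.
    """
    return math.log(p_target / p_prior)


def log_prob_bias(p_target_a: float, p_prior_a: float,
                  p_target_b: float, p_prior_b: float) -> float:
    """Difference of normalized log-probabilities between groups A and B.

    A value near 0 means the template treats both groups alike; a positive
    value means the attribute is more strongly associated with group A.
    """
    return (normalized_log_prob(p_target_a, p_prior_a)
            - normalized_log_prob(p_target_b, p_prior_b))


# Hypothetical probabilities, chosen only to exercise the arithmetic.
score = log_prob_bias(0.20, 0.10, 0.05, 0.10)
balanced = log_prob_bias(0.10, 0.10, 0.10, 0.10)
```

With the placeholder values above, `score` is positive (the attribute leans toward group A) and `balanced` is zero. In practice the probabilities would be read from the model's masked-token distribution for each template.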

Keywords

Artificial Intelligence; KcBERT; Large Language Model; Natural Language Processing; Social bias
Title
Detecting Bias in Large Language Models: Fine-Tuned KcBERT
Authors
Lee, Jun Koo; Chung, Tai-Myoung
DOI
10.1007/978-981-97-8705-0_6
Publication date
2025-02
Type
Proceedings Paper
Journal
Lecture Notes in Computer Science
14893 LNCS
Pages
76-90