Jailbreaking LLMs Through Cross-Cultural Prompts
- Kim, Damin;
- Hur, Minseok;
- Lee, Jeongin;
- Min, Moohong
Abstract
We examine how linguistic and cultural framing affect jailbreak success in three commercial LLMs (GPT-4, Claude 3, Gemini), using semantically equivalent prompts in direct, indirect, and metaphorical styles across four high-resource languages. Indirect prompts most effectively bypassed filters, with framing and style significantly influencing alignment. GPT-4 was especially vulnerable to indirect framing, Claude 3 remained consistently robust, and Gemini showed high sensitivity to cultural and linguistic variation. Our findings highlight the need for alignment strategies resilient to diverse expression styles and cultural contexts.
Keywords
alignment; cross-cultural prompts; jailbreaking; large language models
- Title: Jailbreaking LLMs Through Cross-Cultural Prompts
- Authors: Kim, Damin; Hur, Minseok; Lee, Jeongin; Min, Moohong
- Publication date: 2025
- Type: Proceedings Paper
- Journal: CIKM 2025 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management
- Pages: 4874-4878