Poster: Insecure Coding Habits Die Hard. Can PEFT Really Turn LLMs into Secure Coders?
  • Chae, Sangjun
  • Choi, Jangseop
  • Kim, Taeyang
  • Jung, Eun
  • Oh, Sanghak
  • Kim, Hyoungshick

Abstract

Large language models (LLMs) have advanced automated code generation but often produce code with critical security flaws, including buffer overflows, memory leaks, and unsafe file handling. While prior work emphasizes post-hoc vulnerability detection, we introduce a framework for secure-by-construction code generation via parameter-efficient fine-tuning (PEFT). We construct a secure training dataset by automatically fixing 7 high-impact vulnerability types in 37,540 C code samples from CodeNet, achieving 95.36% CWE reduction. We then apply prompt and prefix tuning to four open-source models (CodeGen-16B/6B-multi and StarCoder2-7B/3B), updating fewer than 1% of the parameters. On the LLMSecEval benchmark, our approach increases secure code generations from 20 to 36 for StarCoder2-3B and from 10 to 27 for CodeGen-6B. These results demonstrate that PEFT can substantially improve code security without full model retraining.

Keywords

Large language models; Secure code generation
Title
Poster: Insecure Coding Habits Die Hard. Can PEFT Really Turn LLMs into Secure Coders?
Authors
Chae, Sangjun; Choi, Jangseop; Kim, Taeyang; Jung, Eun; Oh, Sanghak; Kim, Hyoungshick
DOI
10.1145/3719027.3760716
Publication date
2025
Type
Proceedings Paper
Journal
CCS 2025 - Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security
Pages
4758-4760