PoP-ECC: Robust and Flexible Error Correction against Multi-Bit Upsets in DNN Accelerators
  • Park, Taewon
  • Gorgin, Saeid
  • Kim, Dongwhee
  • Shin, Jaeho
  • Sullivan, Michael B.
  • Kim, Jungrae

Abstract

Deep Neural Networks (DNNs) in safety-critical systems require high reliability. Many systems deploy Error Correction Codes (ECCs) to protect DNNs from memory errors. However, continuous process scaling increases memory errors in severity and frequency, necessitating strong protection against Multi-Bit Upsets (MBUs). This paper proposes Parities of Parities ECC (PoP-ECC), a novel two-tier memory protection scheme designed to provide robust, efficient, and flexible protection against MBUs. PoP-ECC generates Virtual Parities (VPs), which are used to compute second-level parities called Parities of Parities (PPs). This two-level ECC structure allows for dynamic error correction tailored to varying error patterns, ensuring system reliability with minimal memory overhead. Our evaluation demonstrates that PoP-ECC can tolerate significantly higher MBU ratios compared to state-of-the-art solutions, with negligible delay, area, and power overhead.
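
The two-level idea in the abstract can be illustrated with a toy sketch. This is not the paper's construction: the abstract does not specify the codes, word sizes, or correction procedure, so everything below is an assumption made for illustration. Here each "virtual parity" is assumed to be a 4-bit interleaved parity of a 16-bit word, computed on the fly and never stored, and the "parity of parities" is assumed to be the XOR of those virtual parities over a group of words. The names `VP_BITS`, `virtual_parity`, `parity_of_parities`, and `syndrome` are hypothetical, and the sketch only detects errors; it does not correct them.

```python
from functools import reduce

VP_BITS = 4  # assumed width of each virtual parity (illustrative only)


def virtual_parity(word: int, width: int = 16) -> int:
    """First level: interleaved parity of one word.

    Bit i of the result is the XOR of all word bits whose position
    is congruent to i modulo VP_BITS. The result is not stored.
    """
    vp = 0
    for pos in range(width):
        if (word >> pos) & 1:
            vp ^= 1 << (pos % VP_BITS)
    return vp


def parity_of_parities(words) -> int:
    """Second level: XOR of the virtual parities of a group of words.

    Only this value would be stored, so the cost is VP_BITS bits per
    group instead of VP_BITS bits per word.
    """
    return reduce(lambda acc, w: acc ^ virtual_parity(w), words, 0)


def syndrome(words, stored_pp: int) -> int:
    """Recompute the second-level parity and compare with the stored one.

    A nonzero result means the group no longer matches its stored parity.
    """
    return parity_of_parities(words) ^ stored_pp


if __name__ == "__main__":
    group = [0x1234, 0xBEEF, 0x0F0F, 0xA5A5]
    pp = parity_of_parities(group)

    corrupted = list(group)
    corrupted[1] ^= 0b111 << 5  # 3 adjacent bits flipped in one word

    print(syndrome(group, pp))      # clean group
    print(syndrome(corrupted, pp))  # group with a multi-bit upset
```

In this toy setting, a burst of up to `VP_BITS` adjacent bit flips within one word lands on distinct interleave positions, so the recomputed value always differs from the stored one. Locating and correcting the affected bits, which is the focus of the paper, needs additional machinery that this sketch does not attempt.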

Keywords

Deep Neural Network; ECC; Reliability; SRAM
Title
PoP-ECC: Robust and Flexible Error Correction against Multi-Bit Upsets in DNN Accelerators
Authors
Park, Taewon; Gorgin, Saeid; Kim, Dongwhee; Shin, Jaeho; Sullivan, Michael B.; Kim, Jungrae
DOI
10.1109/DAC63849.2025.11133373
Publication date
2025-06
Type
Proceedings Paper
Journal
Proceedings - Design Automation Conference