ROSE: Reliability-Optimized OD-ECC and S-ECC Enhancements for HBM3

Abstract

High Bandwidth Memory (HBM) incorporates both On-Die Error Correction Codes (OD-ECC) and System-level ECC (S-ECC) to detect and correct errors. However, this combined approach is inefficient: the reliability it gains is disproportionately small relative to the resources it consumes. To address this limitation, this paper introduces ROSE, an optimized memory protection scheme that enhances both OD-ECC and S-ECC. The OD-ECC in ROSE extends traditional OD-ECC with support for both single-symbol and double-bit error correction, covering a wider range of error patterns without additional redundancy or performance overhead. The S-ECC in ROSE is strengthened by integrating 32-bit metadata, enabling effective detection of Silent Data Corruption (SDC). ROSE achieves 100% correction of double-bit errors and offers robust detection for more severe error patterns, providing a more resource-efficient solution for HBM. © 2025 IEEE.

Keywords

HBM3; On-Die ECC; reliability; System-level ECC
Title
ROSE: Reliability-Optimized OD-ECC and S-ECC Enhancements for HBM3
Authors
Shin, Jaeho; Kim, Jungrae
DOI
10.1109/ICEIC64972.2025.10879768
Publication date
2025-01
Type
Conference paper
Journal
2025 International Conference on Electronics, Information, and Communication, ICEIC 2025