Scaling Out Chip Interconnect Networks with Implicit Sequence Numbers

Abstract

As AI models outpace the capabilities of single processors, interconnects across chips have become a critical enabler for scalable computing. These processors exchange massive amounts of data at cache-line granularity, prompting the adoption of new interconnect protocols like CXL, NVLink, and UALink, designed for high bandwidth and small payloads. However, the increasing transfer rates of these protocols heighten susceptibility to errors. While mechanisms like Cyclic Redundancy Check (CRC) and Forward Error Correction (FEC) are standard for reliable data transmission, scaling chip interconnects to multi-node configurations introduces new challenges, particularly in managing silently dropped flits in switching devices. This paper introduces Implicit Sequence Number (ISN), a novel mechanism that ensures precise flit drop detection and in-order delivery without adding header overhead. Additionally, we propose Reliability Extended Link (RXL), an extension of CXL that incorporates ISN to support scalable, reliable multi-node interconnects while maintaining compatibility with the existing flit structure. By elevating CRC to a transport-layer mechanism for end-to-end data and sequence integrity, and relying on FEC for link-layer error correction and detection, RXL delivers robust reliability and scalability without compromising bandwidth efficiency.
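
The abstract does not spell out how ISN works, so the following is only a minimal sketch of one plausible reading: the sequence number is never transmitted, but both endpoints fold their local counter into the CRC computation, so a silently dropped flit shows up as a CRC mismatch at the receiver. The names `SEQ_BITS`, `crc_with_seq`, `Sender`, and `Receiver`, the 10-bit counter width, and the use of CRC-32 from `zlib` are all assumptions made for illustration and are not taken from the paper or from the CXL specification.

```python
import zlib

SEQ_BITS = 10              # hypothetical counter width, not from the paper
SEQ_MOD = 1 << SEQ_BITS


def crc_with_seq(payload: bytes, seq: int) -> int:
    """CRC over (sequence number + payload); the sequence number itself is never sent."""
    return zlib.crc32(seq.to_bytes(2, "big") + payload)


class Sender:
    def __init__(self) -> None:
        self.next_seq = 0

    def send(self, payload: bytes) -> tuple[bytes, int]:
        # The flit carries only payload and CRC, so there is no extra header field.
        flit = (payload, crc_with_seq(payload, self.next_seq))
        self.next_seq = (self.next_seq + 1) % SEQ_MOD
        return flit


class Receiver:
    def __init__(self) -> None:
        self.expected_seq = 0

    def receive(self, flit: tuple[bytes, int]) -> bool:
        payload, crc = flit
        # Recompute the CRC with the locally expected sequence number.
        if crc_with_seq(payload, self.expected_seq) != crc:
            # Corrupted data or a sequence gap (e.g. a dropped flit);
            # a real design would trigger a retry here.
            return False
        self.expected_seq = (self.expected_seq + 1) % SEQ_MOD
        return True
```

In this sketch, if a switch drops the second of three flits, the receiver accepts the first and rejects the third, because the third flit's CRC was computed with a sequence number the receiver does not yet expect. Retransmitting the missing flit then restores in-order delivery.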

Keywords

CRC; CXL; FEC; Implicit Sequence Number; Interconnect; Reliability
Title
Scaling Out Chip Interconnect Networks with Implicit Sequence Numbers
Authors
Jung, Giyong; Gorgin, Saeid; Kim, John; Kim, Jungrae
DOI
10.1145/3712285.3759843
Publication date
2025-12
Type
Proceedings Paper
Journal
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
Pages
1240-1251