Efficient Vision Transformer Inference via UDP for Edge-Cloud Collaboration: An Adaptive Loss Detection Approach

Abstract

Vision transformers (ViTs) deliver exceptional performance in computer vision tasks but pose significant computational challenges for edge devices. We present an efficient vision transformer inference framework (EViTIF), an edge-cloud collaborative framework that uses the User Datagram Protocol (UDP) to achieve low-latency communication by strategically partitioning ViT models between edge and cloud environments. To mitigate UDP's inherent unreliability, we introduce the Packet Error Rate Adaptive Loss Detection Network (PALDN), which dynamically recovers lost data without requiring extensive model retraining. Our experiments, conducted on an NVIDIA Jetson Xavier NX edge device and an A100 GPU-equipped cloud server, demonstrate that EViTIF reduces inference latency by up to 57× compared with traditional Transmission Control Protocol (TCP)-based methods. Even with up to 60% packet loss, PALDN keeps accuracy degradation below 2%, outperforming existing super-resolution-based recovery approaches. Moreover, EViTIF generalizes across different ViT variants and scales effectively to larger datasets such as ImageNet. This framework enables real-time, high-performance vision applications in edge computing by balancing computational efficiency with robustness against network imperfections.
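
The abstract does not describe the wire format or the internals of PALDN, so the following is only a minimal sketch of the general idea behind detecting loss on an unreliable transport: if each datagram carries a sequence number, the receiver can tell which chunks of a serialized intermediate feature map never arrived and zero-fill them for a downstream recovery step. The header layout (`HEADER`), the function names, and the zero-fill policy are all assumptions made for illustration, not the authors' method, and no real sockets are used.

```python
import random
import struct

# Hypothetical header: 4-byte sequence number + 2-byte payload length,
# in network byte order. This layout is an assumption, not from the paper.
HEADER = struct.Struct("!IH")


def packetize(data, chunk_size):
    """Split serialized feature bytes into sequence-numbered payloads."""
    packets = []
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        packets.append(HEADER.pack(seq, len(chunk)) + chunk)
    return packets


def drop_packets(packets, loss_rate, seed=0):
    """Simulate an unreliable channel by dropping packets at random."""
    rng = random.Random(seed)
    return [p for p in packets if rng.random() >= loss_rate]


def reassemble(received, total_len, chunk_size):
    """Rebuild the buffer; missing chunks stay zero and are flagged as lost."""
    buf = bytearray(total_len)  # zero-filled by default
    n_chunks = -(-total_len // chunk_size)  # ceiling division
    lost = [True] * n_chunks
    for packet in received:
        seq, length = HEADER.unpack_from(packet)
        start = seq * chunk_size
        buf[start:start + length] = packet[HEADER.size:HEADER.size + length]
        lost[seq] = False
    return bytes(buf), lost
```

In this sketch the `lost` mask plays the role of the loss information that a recovery model would consume on the cloud side; how PALDN actually detects and recovers missing data is described in the paper itself, not here.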

Keywords

collaborative intelligence; vision transformer; User Datagram Protocol (UDP) transfer; edge-cloud; real-time system
Authors
Kim, Hyochan; Ko, Jong Hwan
DOI
10.1007/s11390-025-5171-z
Publication date
2026-01-21
Type
Article; Early Access
Journal
Journal of Computer Science and Technology