Embodied AI-Enhanced Vehicular Networks: An Integrated Vision Language Models and Reinforcement Learning Method
  • Zhang, Ruichen
  • Zhao, Changyuan
  • Du, Hongyang
  • Niyato, Dusit
  • Wang, Jiacheng
  • and 3 others
Citations
  • Web of Science: 22
  • Scopus: 24

Abstract

This paper investigates adaptive transmission strategies in embodied AI-enhanced vehicular networks by integrating vision language models (VLMs) for semantic information extraction and deep reinforcement learning (DRL) for decision-making. The proposed framework aims to optimize both data transmission efficiency and decision accuracy by formulating an optimization problem that incorporates the Weber-Fechner law, which serves as a metric for balancing bandwidth utilization and quality of experience (QoE). Specifically, we employ the large language and vision assistant (LLAVA) model to extract critical semantic information from raw image data captured by embodied AI agents (i.e., vehicles), reducing transmission data size by more than 90% while retaining essential content for vehicular communication and decision-making. In the dynamic vehicular environment, we employ a generalized advantage estimation-based proximal policy optimization (GAE-PPO) method to stabilize decision-making under uncertainty. Simulation results show that attention maps from LLAVA highlight the model's focus on relevant image regions, enhancing semantic representation accuracy. Additionally, our proposed transmission strategy improves QoE by up to 36% compared to DDPG and accelerates convergence by reducing required steps by up to 47% compared to pure PPO. Further analysis indicates that adapting semantic symbol length provides an effective trade-off between transmission quality and bandwidth, achieving up to a 61.4% improvement in QoE when scaling from 4 to 8 vehicles. © 2025 IEEE.
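
The abstract names generalized advantage estimation (GAE) as the stabilizing component of the PPO method. As a minimal sketch of the standard GAE recursion, not the authors' implementation, the advantages for one trajectory can be computed as follows. The function name `gae_advantages`, the default values of `gamma` and `lam`, and the choice to ignore episode termination inside the trajectory are all assumptions made for illustration.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Standard GAE over a single trajectory (illustrative sketch).

    rewards: list of T per-step rewards.
    values:  list of T + 1 value estimates; the last entry is the
             bootstrap value of the state after the final step.
    gamma, lam: discount and GAE smoothing factors (assumed defaults).
    Assumes no episode boundary occurs inside the trajectory.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the accumulated future advantage.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

Setting `lam=0` reduces the estimate to the one-step TD error, while `lam=1` recovers the discounted return minus the value baseline. Intermediate values trade bias against variance.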

Keywords

Embodied AI, LLAVA, LLM, PPO, QoE, vehicular networks, VLM
Title
Embodied AI-Enhanced Vehicular Networks: An Integrated Vision Language Models and Reinforcement Learning Method
Authors
Zhang, Ruichen; Zhao, Changyuan; Du, Hongyang; Niyato, Dusit; Wang, Jiacheng; Sawadsitang, Suttinee; Shen, Xuemin; Kim, Dong In
DOI
10.1109/TMC.2025.3582864
Publication date
2025-11
Type
Article
Journal
IEEE Transactions on Mobile Computing
Volume 24, Issue 11
Pages
1 ~ 16