Temporal Distance-aware Subgoal Generation for Offline Hierarchical Reinforcement Learning
  • Park, Taegeon
  • Baek, Seungho
  • Park, Jongchan
  • Oh, Seungjun
  • Kim, Yusung

Abstract

Efficient subgoal generation is essential in offline Hierarchical Reinforcement Learning (HRL) for tackling long-horizon and sparse-reward tasks. Existing approaches often struggle with redundant and inefficient subgoal candidates and fail to maintain meaningful temporal relationships due to fixed-step subgoal sampling. To address these issues, we propose Temporal Distance-Aware Subgoal Generation (TDSG), a novel framework leveraging pre-trained Temporal Distance (TD) representations. TDSG identifies a compact set of anchor states in the TD representation space. These states, evenly spaced at consistent temporal distance intervals and collectively covering all states in the dataset while comprising less than 1% of the entire dataset, serve as the training targets for subgoal generation. This ensures efficient and temporally consistent high-level policy learning. Furthermore, the low-level policy leverages intrinsic rewards derived from the alignment between current states and subgoals in the TD representation space, enabling effective learning even under sparse-reward conditions. Experimental results demonstrate that TDSG achieves consistent performance improvement over prior offline HRL methods across numeric and visual environments. Our code is available at https://github.com/Ptaegeon/TDSG.git.
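
The anchor-state idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that). It assumes, purely for illustration, that states are already embedded as vectors whose Euclidean distance approximates temporal distance, that anchors are picked by a simple greedy covering rule, and that the intrinsic reward is the reduction in embedding distance to the subgoal. The names `select_anchors`, `nearest_anchor`, `intrinsic_reward`, and the `radius` parameter are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two embedding vectors (assumed TD proxy)."""
    return math.dist(a, b)


def select_anchors(embeddings, radius):
    """Greedy cover (an assumption, not the paper's procedure).

    A state becomes an anchor only if every existing anchor is farther
    than `radius` away, so each state ends up within `radius` of some anchor.
    """
    anchors = []
    for z in embeddings:
        if all(distance(z, a) > radius for a in anchors):
            anchors.append(z)
    return anchors


def nearest_anchor(z, anchors):
    """Return the anchor closest to embedding `z`."""
    return min(anchors, key=lambda a: distance(z, a))


def intrinsic_reward(z_state, z_next, z_subgoal):
    """Hypothetical reward: how much closer the transition moved to the subgoal."""
    return distance(z_state, z_subgoal) - distance(z_next, z_subgoal)


# Toy dataset: 101 states along a line, one unit of "temporal distance" apart.
trajectory = [(float(i), 0.0) for i in range(101)]
anchors = select_anchors(trajectory, radius=10.0)

# Every state is covered by an anchor within the chosen radius.
covered = all(
    distance(z, nearest_anchor(z, anchors)) <= 10.0 for z in trajectory
)
```

In this toy setting the anchors form a small subset of the states while still covering the whole trajectory; the actual anchor ratio, spacing rule, and reward definition used by TDSG are given in the paper and code, not here.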

Keywords

offline hierarchical reinforcement learning; subgoal generation
Title
Temporal Distance-aware Subgoal Generation for Offline Hierarchical Reinforcement Learning
Authors
Park, Taegeon; Baek, Seungho; Park, Jongchan; Oh, Seungjun; Kim, Yusung
DOI
10.1145/3746252.3761326
Publication date
2025
Type
Proceedings Paper
Journal
CIKM 2025 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management
Pages
2305-2314