CurConMix: A Curriculum Contrastive Learning Framework for Enhancing Surgical Action Triplet Recognition
  • Jeon, Yongjun
  • Shin, Jongmin
  • Park, Seonmin
  • Kim, Bogeun
  • Park, Kanggil
  • Oh, Namkee
  • Jung, Kyu-Hwan

Abstract

Accurately recognizing surgical action triplets in surgical videos is crucial for advancing context-aware systems that deliver real-time feedback, enhancing surgical safety and efficiency. However, recognizing surgical action triplets ⟨instrument, verb, target⟩ is challenging due to subtle variations, complex interdependencies, and severe class imbalance. Most existing approaches focus on individual triplet components while overlooking their interdependencies and the inherent class imbalance in triplet distributions. To address these challenges, we propose a novel framework, Curriculum Contrastive learning with feature Mixup (CurConMix). During pre-training, we employ curriculum contrastive learning, which progressively captures relationships among triplet components and distinguishes fine-grained variations through hard pair sampling and synthetic hard negative generation. In the fine-tuning stage, we further refine the model using self-distillation and mixup strategies to alleviate class imbalance. We evaluate our framework on the CholecT45 dataset using 5-fold cross-validation. Experimental results demonstrate that our approach surpasses existing methods across various model sizes and input resolutions. Moreover, our findings underscore the importance of capturing interdependency among triplet components, highlighting the effectiveness of our proposed framework in addressing key challenges in surgical action recognition. The official implementation is available at https://github.com/MIDAS-SurgAI/CurConMix.

Keywords

Class Imbalance; Contrastive Learning; Curriculum Learning; Endoscopic Surgery; Surgical Action Triplet Recognition; Surgical Video
Title
CurConMix: A Curriculum Contrastive Learning Framework for Enhancing Surgical Action Triplet Recognition
Authors
Jeon, Yongjun; Shin, Jongmin; Park, Seonmin; Kim, Bogeun; Park, Kanggil; Oh, Namkee; Jung, Kyu-Hwan
DOI
10.1007/978-3-032-05114-1_15
Publication date
2025-09
Type
Proceedings Paper
Journal
Lecture Notes in Computer Science
LNCS, vol. 15968
Pages
149-158