- Jeon, Yongjun;
- Shin, Jongmin;
- Park, Seonmin;
- Kim, Bogeun;
- Park, Kanggil;
- Oh, Namkee;
- Jung, Kyu-Hwan
Web of Science citations: 0
Scopus citations: 0
Abstract
Accurately recognizing surgical action triplets in surgical videos is crucial for advancing context-aware systems that deliver real-time feedback, enhancing surgical safety and efficiency. However, recognizing surgical action triplets ⟨instrument, verb, target⟩ is challenging due to subtle variations, complex interdependencies, and severe class imbalance. Most existing approaches focus on individual triplet components while overlooking their interdependencies and the inherent class imbalance in triplet distributions. To address these challenges, we propose a novel framework, Curriculum Contrastive learning with feature Mixup (CurConMix). During pre-training, we employ curriculum contrastive learning, which progressively captures relationships among triplet components and distinguishes fine-grained variations through hard pair sampling and synthetic hard negative generation. In the fine-tuning stage, we further refine the model using self-distillation and mixup strategies to alleviate class imbalance. We evaluate our framework on the CholecT45 dataset using 5-fold cross-validation. Experimental results demonstrate that our approach surpasses existing methods across various model sizes and input resolutions. Moreover, our findings underscore the importance of capturing interdependency among triplet components, highlighting the effectiveness of our proposed framework in addressing key challenges in surgical action recognition. The official implementation is available at https://github.com/MIDAS-SurgAI/CurConMix.
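The mixup step mentioned for the fine-tuning stage can be illustrated with a minimal sketch. It shows generic feature-level mixup: a convex combination of two feature vectors and their multi-hot triplet labels, weighted by a coefficient drawn from a Beta distribution. The function name `mixup_features`, the `alpha` value, and the toy vectors are assumptions made for illustration. They are not taken from the paper or its official implementation, which may pair samples and mix features differently.

```python
import random


def mixup_features(feat_a, feat_b, label_a, label_b, alpha=0.4, rng=None):
    """Generic mixup on feature vectors (hypothetical helper, not the paper's code).

    Returns the mixed feature, the mixed soft label, and the coefficient lam.
    """
    rng = rng or random.Random()
    # Mixing coefficient in [0, 1]; alpha < 1 favours values near 0 or 1.
    lam = rng.betavariate(alpha, alpha)
    # Same convex combination applied to features and to labels.
    mixed_feat = [lam * a + (1.0 - lam) * b for a, b in zip(feat_a, feat_b)]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_feat, mixed_label, lam


# Toy example: two 2-d features, each with a one-hot label over 3 triplet classes.
feat, label, lam = mixup_features(
    [1.0, 0.0], [0.0, 1.0],
    [1, 0, 0], [0, 1, 0],
    rng=random.Random(0),
)
print(lam, feat, label)
```

Because labels are mixed with the same coefficient as the features, a sample from a rare triplet class contributes a soft target whenever it is paired with a frequent one, which is one plausible reason mixup is used against class imbalance.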
- Title
- CurConMix: A Curriculum Contrastive Learning Framework for Enhancing Surgical Action Triplet Recognition
- Authors
- Jeon, Yongjun; Shin, Jongmin; Park, Seonmin; Kim, Bogeun; Park, Kanggil; Oh, Namkee; Jung, Kyu-Hwan
- Publication date
- 2025-09
- Type
- Proceedings Paper
- Volume
- 15968 LNCS
- Pages
- 149-158