Efficient Recurrent Optical Flow Refinement Using Mamba and Multi-Scale Loss
Citations (Scopus): 0

Abstract

Optical flow estimation plays a critical role in computer vision tasks such as video understanding and autonomous driving. Recent models such as RAFT and FlowFormer refine flow predictions iteratively using recurrent modules based on the Convolutional Gated Recurrent Unit (ConvGRU). However, ConvGRU is limited in modeling long-range dependencies and requires a large number of parameters for decoder refinement. In this paper, we propose replacing the ConvGRU module in FlowFormer's decoder with Mamba, a state space sequence model optimized for efficient and expressive temporal modeling. Additionally, we introduce a multi-scale loss structure that incorporates low-resolution supervision to encourage global motion consistency and improve training stability. Our method keeps FlowFormer's original input structure while improving both temporal modeling and multi-scale learning. Experiments on the KITTI benchmark show that our Mamba-based decoder improves on the original FlowFormer, reducing average end-point error (AEPE) by 5.81% and F1-All by 13.41%, while also reducing decoder parameters by 32.65% and FLOPs by 22.88%. These results demonstrate that Mamba, combined with multi-scale loss, is a strong and lightweight alternative to ConvGRU for optical flow refinement.

Keywords

mamba; multi-scale loss; optical flow
Title
Efficient Recurrent Optical Flow Refinement Using Mamba and Multi-Scale Loss
Authors
Park, Minseon; Shin, Jitae
DOI
10.1109/ITC-CSCC66376.2025.11137675
Publication date
2025
Type
Conference Paper
Source title
2025 International Technical Conference on Circuits/Systems, Computers, and Communications, ITC-CSCC 2025