Abstract
This study investigates the efficacy of distance metrics as bivariate technical indicators for enhancing machine learning-based financial return forecasting. Using a dataset of 14 commodities, metals, and equity index exchange-traded funds spanning 2010 to 2022, we compute 10 distance metrics (including normalized information distance, Canberra distance, Euclidean distance, and cosine distance) between all asset pairs within rolling windows of 20, 60, 120, and 240 trading days. These distance-based features are integrated into five machine learning models (CatBoost, Decision Tree, LightGBM, Support Vector Regression, and XGBoost) and evaluated against a linear regression benchmark across 840,000 experimental configurations. Our results demonstrate that distance-based features reduce both the mean absolute error and the root mean squared error relative to models trained on the original dataset alone, with normalized information distance exhibiting the most consistent improvements across targets and temporal configurations ((Formula presented) in the majority of comparisons). Feature importance analysis using Shapley Additive Explanations and permutation importance reveals economically interpretable cross-asset dependencies, such as the linkage between crude oil and natural gas and the interconnectedness within the metals complex. These findings contribute to a practical and data-efficient framework for multivariate feature generation in financial forecasting.
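As an illustration of the rolling pairwise distance features described in the abstract, the following is a minimal sketch, not the authors' implementation. It covers only three of the ten metrics (Euclidean, Canberra, and cosine distance). The function names, the NaN padding before the first full window, and the zero-denominator handling in the Canberra distance are all assumptions made for this example.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def canberra(x, y):
    """Canberra distance; terms with a zero denominator are skipped
    (an assumed convention, treating 0/0 as 0)."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def cosine_distance(x, y):
    """One minus cosine similarity; NaN if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return float("nan")
    return 1.0 - dot / (norm_x * norm_y)


def rolling_distance(x, y, window, metric):
    """Apply `metric` to the trailing `window` observations of two series.

    Positions before the first full window are filled with NaN, so the
    output has the same length as the inputs (hypothetical helper).
    """
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    out = [float("nan")] * len(x)
    for t in range(window - 1, len(x)):
        start = t - window + 1
        out[t] = metric(x[start:t + 1], y[start:t + 1])
    return out


# Toy usage with made-up return series for two assets and a 3-day window.
asset_a = [0.01, -0.02, 0.015, 0.0, 0.005]
asset_b = [0.012, -0.018, 0.01, 0.002, -0.004]
feature = rolling_distance(asset_a, asset_b, 3, euclidean)
```

In a setting like the one the abstract describes, a helper of this kind would be called once per asset pair, metric, and window length (20, 60, 120, or 240 days), with each resulting series used as one candidate feature column.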
Keywords
- Title
- Metric-based technical indicators for yield forecasting
- Authors
- Choi, Insu; Lim, Soyeong; Kim, Seoyeon; Choi, Yeona; Han, Subin; Kim, Woo Chang
- Publication date
- 2026-04
- Type
- Article
- Volume
- 98