Overparametrized linear regression with noisy and missing data

Abstract

In this paper, we investigate the performance of overparametrized linear regression models under three forms of data imperfection: noisy observations, data missing at random, and response outliers. We propose a Corrected Minimum Norm Interpolation (C-MNI) estimator to handle errors-in-variables (EIV) and missing data, and a Robust Minimum Norm Interpolation (Robust MNI) estimator to address contamination of the response by outliers. Our work extends the arguments of Bartlett et al. (2020) to cases where the observed data are imperfect, so that observed values differ from the true values. The main contribution is the derivation of theoretical risk bounds for the proposed estimators, which generalize the bound established by Bartlett et al. (2020). Simulation studies across various settings support the theoretical analysis and indicate that the proposed methods achieve better predictive accuracy and robustness than conventional estimators.
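
For orientation, the following is a minimal sketch of the standard minimum norm interpolator that the abstract's estimators build on. It is not the C-MNI or Robust MNI estimator, whose definitions are not given in this record. The function name `min_norm_interpolator`, the dimensions, and the simulated Gaussian data are assumptions chosen purely for illustration.

```python
import numpy as np


def min_norm_interpolator(X, y):
    """Return the minimum l2-norm solution theta of X @ theta = y.

    When n < p and X has full row rank, pinv(X) @ y coincides with
    X.T @ inv(X @ X.T) @ y, the usual closed form.
    """
    return np.linalg.pinv(X) @ y


# Hypothetical overparametrized setting: far more features than samples.
rng = np.random.default_rng(0)
n, p = 20, 200
X = rng.standard_normal((n, p))
theta_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ theta_true + 0.1 * rng.standard_normal(n)

theta_hat = min_norm_interpolator(X, y)

# The estimator interpolates: the training residual is numerically zero.
train_residual = np.max(np.abs(X @ theta_hat - y))

# Closed form for comparison (valid because X @ X.T is invertible here).
closed_form = X.T @ np.linalg.solve(X @ X.T, y)

# Any other interpolating solution differs by a null-space vector of X
# and therefore has a larger l2 norm.
v = rng.standard_normal(p)
null_component = v - np.linalg.pinv(X) @ (X @ v)
other_solution = theta_hat + null_component
```

Here `other_solution` also fits the training data exactly but has a larger norm than `theta_hat`, which is what "minimum norm" refers to. How the paper corrects this estimator for noisy or missing covariates, or robustifies it against response outliers, is described in the article itself.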

Keywords

Overfitting; Overparametrized model; Errors in variables; Missing at random; Outlier contamination; Non-sparse models; High dimension
Title
Overparametrized linear regression with noisy and missing data
Authors
Park, Seyoung; Lee, Eun Ryung
DOI
10.1007/s42952-025-00352-0
Publication date
2025-12-02
Type
Article; Early Access
Journal
Journal of the Korean Statistical Society