Authors
Jafar A Khan, Stefan Van Aelst, Ruben H Zamar
Publication date
2007/12/1
Journal
Journal of the American Statistical Association
Volume
102
Issue
480
Pages
1289-1299
Publisher
Taylor & Francis
Description
In this article we consider the problem of building a linear prediction model when the number of candidate predictors is large and the data possibly contain anomalies that are difficult to visualize and clean. We want to predict the nonoutlying cases; therefore, we need a method that is simultaneously robust and scalable. We consider the stepwise least angle regression (LARS) algorithm which is computationally very efficient but sensitive to outliers. We introduce two different approaches to robustify LARS. The plug-in approach replaces the classical correlations in LARS by robust correlation estimates. The cleaning approach first transforms the data set by shrinking the outliers toward the bulk of the data (which we call multivariate Winsorization) and then applies LARS to the transformed data. We show that the plug-in approach is time-efficient and scalable and that the bootstrap can be used to stabilize its results. We …
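The plug-in idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a simple coordinate-wise Winsorization (clipping each variable at its median plus or minus a multiple of the scaled MAD) in place of the multivariate Winsorization and robust correlation estimates of the article, and it covers only the first LARS step, where the predictor with the largest absolute correlation with the response enters the model. The function names and the tuning constant `c = 2.0` are assumptions made for illustration.

```python
import numpy as np


def winsorize_columns(X, c=2.0):
    """Clip each column at median +/- c * scaled MAD (coordinate-wise, an assumption)."""
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=0)
    # 1.4826 makes the MAD consistent with the standard deviation under normality.
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against columns with zero spread
    return np.clip(X, med - c * mad, med + c * mad)


def robust_correlations(X, y, c=2.0):
    """Correlation of each predictor with y, computed on Winsorized data."""
    Z = winsorize_columns(np.column_stack([X, y]), c)
    R = np.corrcoef(Z, rowvar=False)
    return R[:-1, -1]  # last column holds the correlations with the response


def first_entering_predictor(X, y, c=2.0):
    """Index of the predictor that a LARS-type procedure would select first."""
    return int(np.argmax(np.abs(robust_correlations(X, y, c))))
```

With a handful of cases that are outlying in both one predictor and the response, the classical correlations tend to favor that contaminated predictor, while the Winsorized version limits its influence. Later LARS steps would reuse the same robust correlation matrix in place of the classical one.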
Total citations
[Bar chart: citations per year, 2006-2021]
Google Scholar articles
JA Khan, S Van Aelst, RH Zamar - Journal of the American Statistical Association, 2007