Weighted Least Squares Regression with the Best Robustness and High Computability
Abstract
1. Introduction
2. A Class of Weighted Least Squares
2.1. A Class of Weight Functions
- P1: The weight function is twice differentiable and . When , it is asymptotically equivalent to for some positive constants and .
- P2: If , then , where is the median of .
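The inline formulas in P1 and P2 did not survive extraction above. Purely as a hedged illustration of what a weight function with the smoothness required by P1 (twice differentiable, decaying for large arguments) and the median centering referenced in P2 might look like, here is a sketch in R. The triweight form, the tuning constant `c`, and the MAD scaling are assumptions for illustration, not the authors' actual weight function.

```r
# Illustrative sketch only: a smooth weight in the spirit of P1-P2.
# This is a "triweight"-style function, NOT the authors' exact w; the
# tuning constant c is an assumption.
w_triweight <- function(r, c = 3) {
  u <- r / c
  # (1 - u^2)^3 on [-1, 1], 0 outside; twice differentiable at |u| = 1
  ifelse(abs(u) <= 1, (1 - u^2)^3, 0)
}

# Hypothetical residual-based weights, centered at the residual median
# (P2 references the median of the residuals); MAD scaling is an assumption:
make_weights <- function(res) {
  s <- mad(res)                  # robust scale estimate (assumption)
  w_triweight((res - median(res)) / s)
}
```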
2.2. Weighted Least Squares Estimators
3. Properties of the WLS
3.1. Existence
3.2. Equivariance
3.3. Robustness
4. Computation of the WLS
- (i)
- (ii)
- Step 2. For ,
  - (a) Set , where is the minimizer of with respect to (using backtracking line search; see page 464 of [14]), or set . (An R sketch of this iterative step follows the algorithm.)
  - (b) Compute ; if {return }.
  - (c) If () {break}; else set , where

  end for loop.
- (iii)
- Step 3. Replace by and go to step 1.
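As a hedged illustration of Steps 2–3 (descent with backtracking line search, cf. page 464 of [14]), the R sketch below uses a stand-in weighted least squares objective, since the exact criterion and update formulas were lost in extraction. The objective `obj`, the finite-difference gradient, and all tolerances are assumptions, not the paper's exact scheme.

```r
# Hedged sketch of Steps 2-3: gradient-style descent with backtracking
# line search (cf. [14], p. 464). obj() is a stand-in weighted LS
# criterion; w_fun, step parameters, and tolerances are assumptions.
wls_descent <- function(X, y, w_fun, beta0, max_iter = 100, tol = 1e-8,
                        alpha = 0.25, rho = 0.5) {
  obj <- function(b) {
    res <- as.vector(y - X %*% b)
    sum(w_fun(res) * res^2)            # stand-in weighted LS objective
  }
  num_grad <- function(b, h = 1e-6) {  # central finite differences
    sapply(seq_along(b), function(j) {
      e <- replace(numeric(length(b)), j, h)
      (obj(b + e) - obj(b - e)) / (2 * h)
    })
  }
  beta <- beta0
  for (k in seq_len(max_iter)) {
    g <- num_grad(beta)
    if (sqrt(sum(g^2)) < tol) break    # converged: gradient nearly zero
    # Step 2(a): backtracking line search along -g (see [14], p. 464)
    t <- 1; f0 <- obj(beta)
    while (t > 1e-12 && obj(beta - t * g) > f0 - alpha * t * sum(g^2)) {
      t <- rho * t
    }
    beta_new <- beta - t * g
    # Step 2(c): stop when the objective no longer improves
    if (f0 - obj(beta_new) < tol) { beta <- beta_new; break }
    beta <- beta_new                   # Step 3: update and repeat
  }
  beta
}

# Hypothetical usage with the illustrative weights from Section 2.1:
# fit <- wls_descent(X, y, w_fun = make_weights,
#                    beta0 = lm.fit(X, y)$coefficients)
```

Starting from the classical LS fit, as in the commented usage line, is one natural choice of initial value; the paper's actual initialization in Step 1 was lost in extraction.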
5. Examples and Comparison
5.1. Performance Criteria
- Empirical mean squared error (EMSE). For a general regression estimator $T$, we calculate $\text{EMSE} = \sum_{i=1}^{R} \|T_i - \beta_0\|^2 / R$, the empirical mean squared error for $T$. If $T$ is regression equivariant, then we can assume (w.l.o.g.) that the true parameter $\beta_0 = \mathbf{0}$ (see p. 124 of [9]). Here, $T_i$ is the realization of $T$ obtained from the $i$-th sample with size $n$ and dimension $p$, and the replication number $R$ is usually set to 1000.
- Total time consumed for all replications in the simulation (TT). This criterion measures the speed of a procedure: the faster and the more accurate, the better. One possible issue is the fairness of comparing different procedures, since different programming languages (e.g., C, Rcpp, Fortran, and R) are employed by different procedures.
- Finite sample relative efficiency (FSRE). In the following, we investigate via simulation studies the finite-sample relative efficiency of the different robust alternatives to the LS with respect to the benchmark, the classical least squares line/hyperplane. The latter is the best linear unbiased estimator by the Gauss–Markov theorem and is fully efficient under normal models. We generate samples from the linear regression model $y = \mathbf{x}^\top \boldsymbol{\beta} + e$ with various sample sizes $n$ and dimensions $p$, where $e$ is normally distributed. The finite sample RE of a procedure is the percentage ratio of the EMSE of the LS to the EMSE of the procedure. (An R sketch of these criteria follows this list.)
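A minimal R sketch of the three criteria, assuming a generic estimator interface and normal data generation; `est_fun`, `Rreps`, and the data model here are placeholder assumptions, not the paper's exact simulation code.

```r
# Hedged sketch of the criteria in Section 5.1. est_fun is any regression
# estimator taking a design matrix X (intercept column included) and a
# response y, returning a coefficient vector. beta0 = 0 w.l.o.g. for
# regression-equivariant estimators (p. 124 of [9]).
simulate_criteria <- function(est_fun, Rreps = 1000, n = 50, p = 5) {
  beta0 <- rep(0, p)
  t0 <- proc.time()[["elapsed"]]
  sq_err <- replicate(Rreps, {
    X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))
    y <- as.vector(X %*% beta0 + rnorm(n))   # normal errors (assumption)
    sum((est_fun(X, y) - beta0)^2)           # squared error, one replication
  })
  TT <- proc.time()[["elapsed"]] - t0        # total time, all replications
  c(EMSE = mean(sq_err), TT = TT)
}

# Finite-sample RE of a procedure = EMSE(LS) / EMSE(procedure),
# often reported as a percentage:
ls_fit <- function(X, y) lm.fit(X, y)$coefficients   # classical LS
# re <- simulate_criteria(ls_fit)["EMSE"] /
#       simulate_criteria(other_fit)["EMSE"]
```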
5.2. Examples
6. Concluding Remarks
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Proofs of Main Results
- (I) and is finite, or
- (II)
References
- Huber, P.J. Robust estimation of a location parameter. Ann. Math. Statist. 1964, 35, 73–101.
- Rousseeuw, P.J. Least median of squares regression. J. Amer. Statist. Assoc. 1984, 79, 871–880.
- Rousseeuw, P.J.; Yohai, V.J. Robust regression by means of S-estimators. In Robust and Nonlinear Time Series Analysis; Lecture Notes in Statistics; Springer: New York, NY, USA, 1984; Volume 26, pp. 256–272.
- Yohai, V.J. High breakdown-point and high efficiency estimates for regression. Ann. Statist. 1987, 15, 642–656.
- Yohai, V.J.; Zamar, R.H. High breakdown estimates of regression by means of the minimization of an efficient scale. J. Am. Stat. Assoc. 1988, 83, 406–413.
- Rousseeuw, P.J.; Hubert, M. Regression depth (with discussion). J. Am. Stat. Assoc. 1999, 94, 388–433.
- Zuo, Y. On general notions of depth for regression. Stat. Sci. 2021, 36, 142–157.
- Zuo, Y.; Zuo, H. Least sum of squares of trimmed residuals regression. Electron. J. Stat. 2023, 17, 2416–2446.
- Rousseeuw, P.J.; Leroy, A. Robust Regression and Outlier Detection; Wiley: New York, NY, USA, 1987.
- Maronna, R.A.; Martin, R.D.; Yohai, V.J. Robust Statistics: Theory and Methods; John Wiley & Sons: Hoboken, NJ, USA, 2006.
- Müller, C. Redescending M-estimators in regression analysis, cluster analysis and image analysis. Discuss. Math. Probab. Stat. 2004, 24, 59–75.
- Zuo, Y. Projection-based depth functions and associated medians. Ann. Stat. 2003, 31, 1460–1490.
- Donoho, D.L.; Huber, P.J. The notion of breakdown point. In A Festschrift for Erich L. Lehmann; Bickel, P.J., Doksum, K.A., Hodges, J.L., Jr., Eds.; Wadsworth: Belmont, CA, USA, 1983; pp. 157–184.
- Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
- Edgar, T.F.; Himmelblau, D.M.; Lasdon, L.S. Optimization of Chemical Processes, 2nd ed.; McGraw-Hill Chemical Engineering Series; McGraw-Hill: New York, NY, USA, 2001.
- Gilbert, J.C.; Nocedal, J. Global convergence properties of conjugate gradient methods for optimization. SIAM J. Optim. 1992, 2, 21–42.
- Harrison, D.; Rubinfeld, D.L. Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 1978, 5, 81–102.
- Buxton, L.H.D. The anthropology of Cyprus. J. R. Anthropol. Inst. Great Br. Irel. 1920, 50, 183–235.
- Hawkins, D.M.; Olive, D.J. Inconsistency of resampling algorithms for high breakdown regression estimators and a new algorithm (with discussion). J. Am. Stat. Assoc. 2002, 97, 136–159.
- Olive, D.J. Robust Multivariate Analysis; Springer: New York, NY, USA, 2017.
- Park, Y.; Kim, D.; Kim, S. Robust regression using data partitioning and M-estimation. Commun. Stat. Simul. Comput. 2012, 41, 1282–1300.
- Olive, D.J.; Hawkins, D.M. Practical High Breakdown Regression. 2011. Available online: http://www.math.siu.edu/olive/pphbreg.pdf (accessed on 19 March 2024).
Normal data sets, each with contamination rate:

| p | n | Method | EMSE | TT | RE | EMSE | TT | RE |
|---|---|--------|------|----|----|------|----|----|
| 5 | 50 | mm | 0.3356 | 9.9427 | 0.9767 | 0.3357 | 9.8483 | 2.9876 |
| 5 | 50 | wls | 0.3309 | 7.3604 | 0.9905 | 0.3324 | 9.4740 | 3.0178 |
| 5 | 50 | lts | 0.3975 | 15.883 | 0.8246 | 0.3670 | 15.957 | 2.7326 |
| 5 | 50 | ls | 0.3278 | 1.4243 | 1.0000 | 1.0030 | 1.2834 | 1.0000 |
| 5 | 50 | mm | 0.3565 | 9.8519 | 5.3673 | 8.4738 | 10.532 | 0.3311 |
| 5 | 50 | wls | 0.3546 | 12.329 | 5.3951 | 0.3711 | 15.846 | 7.5618 |
| 5 | 50 | lts | 0.6546 | 16.662 | 2.9228 | 27.223 | 17.026 | 0.1030 |
| 5 | 50 | ls | 1.9132 | 1.3549 | 1.0000 | 2.8060 | 1.3472 | 1.0000 |
| 10 | 100 | mm | 0.2378 | 21.421 | 0.8839 | 0.2372 | 20.892 | 5.5816 |
| 10 | 100 | wls | 0.2105 | 11.112 | 0.9985 | 0.2226 | 15.680 | 5.9499 |
| 10 | 100 | lts | 0.2919 | 48.648 | 0.7201 | 0.2584 | 49.615 | 5.1245 |
| 10 | 100 | ls | 0.2102 | 1.3298 | 1.0000 | 1.3242 | 1.2542 | 1.0000 |
| 10 | 100 | mm | 0.2410 | 20.669 | 10.244 | 5.1124 | 21.891 | 0.6979 |
| 10 | 100 | wls | 0.2372 | 20.535 | 10.407 | 0.2600 | 29.146 | 13.724 |
| 10 | 100 | lts | 0.2635 | 55.018 | 9.3714 | 40.403 | 64.803 | 0.0883 |
| 10 | 100 | ls | 2.4691 | 1.2462 | 1.0000 | 3.5680 | 1.2626 | 1.0000 |
| 20 | 200 | mm | 0.2429 | 84.709 | 0.6564 | 0.2183 | 83.525 | 6.6713 |
| 20 | 200 | wls | 0.1592 | 28.664 | 1.0021 | 0.1726 | 39.100 | 8.4390 |
| 20 | 200 | lts | 0.2208 | 259.21 | 0.7224 | 0.2015 | 293.40 | 7.2261 |
| 20 | 200 | ls | 0.1595 | 1.4936 | 1.0000 | 1.4564 | 1.4775 | 1.0000 |
| 20 | 200 | mm | 0.5299 | 78.387 | 5.1922 | 20.908 | 90.385 | 0.1899 |
| 20 | 200 | wls | 0.1875 | 51.280 | 14.677 | 0.2126 | 71.148 | 18.672 |
| 20 | 200 | lts | 0.1983 | 387.56 | 13.877 | 33.918 | 832.75 | 0.1170 |
| 20 | 200 | ls | 2.7512 | 1.4566 | 1.0000 | 3.9694 | 1.4300 | 1.0000 |
| Performance Measure | MM | WLS | LTS | LS |
|---|---|---|---|---|
| EMSE | 4.352446 × | 0.0000 | 4.619404 × | 0.0000 |
| TT | 120.368098 | 161.465350 | 125.707603 | 1.487204 |
| RE | 0 | NaN | 0 | NaN |
| Methods | Intercept | Head | Nasal | Bigonal | Cephalic |
|---|---|---|---|---|---|
| hbreg | 1546.3737947 | −1.1288988 | 6.1133570 | −0.5871985 | 1.1263726 |
| rmreg2 | 807.3303643 | 1.7963508 | 4.8262483 | −0.1481552 | 3.9353752 |
| wls | 1437.3761729 | −1.1107210 | 5.2669763 | 0.9199388 | 0.9766958 |
| lts | 1066.188018 | −1.104774 | 6.476802 | 2.523815 | 2.623706 |
| lms | 449.515 | −1.061 | 7.317 | 6.215 | 4.790 |
| mm | 1511.5503972 | −1.1289155 | 6.5942674 | −0.6341536 | 1.2965989 |
| ls | 1546.3737947 | −1.1288988 | 6.1133570 | −0.5871985 | 1.1263726 |