Robust Variable Selection for Single-Index Varying-Coefficient Model with Missing Data in Covariates
Abstract
1. Introduction
- For the case of missing covariates, we propose a robust variable selection approach based on the exponential squared loss and adopt the inverse probability weighting (IPW) method to correct the bias caused by missing values in the covariates.
- We consider both parametric and nonparametric methods to estimate the selection probability (propensity) model, and propose an objective function with a weighted penalty for variable selection.
- We also examine how to select the tuning parameter of the exponential squared loss function to ensure that the corresponding estimator is robust.
2. Methodology
2.1. Basis Function Expansion
2.2. Robust Estimation Based on Inverse Probability Weighting
2.3. The Penalized Robust Regression Estimator
2.4. Algorithm
2.5. The Choice of the Regularization Parameters
2.6. The Choice of the Regularization Parameter
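Regularization parameters in penalized problems of this kind are commonly chosen by minimizing an information criterion over a grid. As a hedged sketch (the paper's actual criterion is not reproduced here; the BIC-type form below is a standard stand-in), a generic grid search might look like this:

```python
import numpy as np

def select_lambda_bic(fit_fn, X, y, lambdas):
    """Pick the penalty level minimizing a BIC-type criterion (illustrative).
    fit_fn(lam) must return the fitted coefficient vector for that penalty;
    the number of nonzero coefficients serves as the model's degrees of freedom."""
    n = len(y)
    best_lam, best_bic = None, np.inf
    for lam in lambdas:
        beta = fit_fn(lam)
        df = np.count_nonzero(beta)              # effective model size
        rss = np.sum((y - X @ beta) ** 2)
        bic = n * np.log(rss / n + 1e-12) + df * np.log(n)
        if bic < best_bic:
            best_lam, best_bic = lam, bic
    return best_lam
```

The log(n) penalty on model size makes BIC-type selection favor sparser fits than cross-validation typically does, which matters when the goal is consistent variable selection rather than pure prediction.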
3. Simulation
4. Discussion
Author Contributions
Funding
Conflicts of Interest
References
- Yates, F. The analysis of replicated experiments when the field results are incomplete. Emp. J. Exp. Agric. 1933, 1, 129–142.
- Healy, M.; Westmacott, M. Missing values in experiments analysed on automatic computers. Appl. Stat. 1956, 5, 203–206.
- Horvitz, D.G.; Thompson, D.J. A generalization of sampling without replacement from a finite universe. J. Am. Stat. Assoc. 1952, 47, 663–685.
- Robins, J.M.; Rotnitzky, A.; Zhao, L.P. Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 1994, 89, 846–866.
- Wang, C.; Wang, S.; Zhao, L.P.; Ou, S.T. Weighted semiparametric estimation in regression analysis with missing covariate data. J. Am. Stat. Assoc. 1997, 92, 512–525.
- Little, R.J.; Rubin, D.B. Statistical Analysis with Missing Data; John Wiley & Sons: Hoboken, NJ, USA, 2019; Volume 793.
- Liang, H.; Wang, S.; Robins, J.M.; Carroll, R.J. Estimation in partially linear models with missing covariates. J. Am. Stat. Assoc. 2004, 99, 357–367.
- Tsiatis, A.A. Semiparametric Theory and Missing Data; Springer: Berlin/Heidelberg, Germany, 2006.
- Friedman, J.; Hastie, T.; Tibshirani, R. Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Stat. 2000, 28, 337–407.
- Wang, X.; Jiang, Y.; Huang, M.; Zhang, H. Robust variable selection with exponential squared loss. J. Am. Stat. Assoc. 2013, 108, 632–643.
- Feng, S.; Xue, L. Variable selection for single-index varying-coefficient model. Front. Math. China 2013, 8, 541–565.
- Xue, L.; Pang, Z. Statistical inference for a single-index varying-coefficient model. Stat. Comput. 2013, 23, 589–599.
- Härdle, W.; Hall, P.; Ichimura, H. Optimal smoothing in single-index models. Ann. Stat. 1993, 21, 157–178.
- Wu, T.Z.; Lin, H.; Yu, Y. Single-index coefficient models for nonlinear time series. J. Nonparametr. Stat. 2011, 23, 37–58.
- Hastie, T.; Tibshirani, R. Varying-coefficient models. J. R. Stat. Soc. Ser. B Methodol. 1993, 55, 757–779.
- Fan, J.; Zhang, W. Statistical estimation in varying coefficient models. Ann. Stat. 1999, 27, 1491–1518.
- Xia, Y.; Li, W.K. On single-index coefficient regression models. J. Am. Stat. Assoc. 1999, 94, 1275–1285.
- Xue, L.; Wang, Q. Empirical likelihood for single-index varying-coefficient models. Bernoulli 2012, 18, 836–856.
- Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 58, 267–288.
- Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2001, 96, 1348–1360.
- Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
- Peng, H.; Huang, T. Penalized least squares for single index models. J. Stat. Plan. Inference 2011, 141, 1362–1379.
- Yang, H.; Yang, J. A robust and efficient estimation and variable selection method for partially linear single-index models. J. Multivar. Anal. 2014, 129, 227–242.
- Wang, D.; Kulasekera, K. Parametric component detection and variable selection in varying-coefficient partially linear models. J. Multivar. Anal. 2012, 112, 117–129.
- Yang, J.; Yang, H. Quantile regression and variable selection for single-index varying-coefficient models. Commun. Stat. Simul. Comput. 2017, 46, 4637–4653.
- Yu, Y.; Ruppert, D. Penalized spline estimation for partially linear single-index models. J. Am. Stat. Assoc. 2002, 97, 1042–1054.
- He, X.; Zhu, Z.Y.; Fung, W.K. Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 2002, 89, 579–590.
- Zhao, P.; Xue, L. Variable selection for semiparametric varying coefficient partially linear models. Stat. Probab. Lett. 2009, 79, 2148–2157.
| n | Method | AD | SD | AD | SD | AD | SD | MAD |
|---|--------|----|----|----|----|----|----|-----|
| 50 | ESL | 0.0862 | 0.0842 | 0.0934 | 0.0940 | 0.0916 | 0.0928 | 0.0923 |
| | | 0.0924 | 0.0908 | 0.0945 | 0.0966 | 0.0928 | 0.0936 | 0.09379 |
| | EE | 0.6386 | 0.6812 | 0.6806 | 0.7013 | 0.7047 | 0.7014 | 0.6734 |
| | EL | 0.6418 | 0.6826 | 0.6814 | 0.702 | 0.7069 | 0.7022 | 0.6742 |
| 200 | ESL | 0.0521 | 0.0517 | 0.0568 | 0.0655 | 0.0656 | 0.0637 | 0.0623 |
| | | 0.0643 | 0.0635 | 0.0696 | 0.0743 | 0.0704 | 0.0725 | 0.0739 |
| | EE | 0.4993 | 0.5014 | 0.4884 | 0.5113 | 0.5025 | 0.5204 | 0.4997 |
| | EL | 0.4998 | 0.5022 | 0.4892 | 0.5121 | 0.5033 | 0.5212 | 0.4999 |
| 400 | ESL | 0.0467 | 0.0475 | 0.0482 | 0.0499 | 0.0491 | 0.0493 | 0.0478 |
| | | 0.0473 | 0.0489 | 0.0495 | 0.0504 | 0.0493 | 0.0497 | 0.0484 |
| | EE | 0.4306 | 0.4682 | 0.4673 | 0.4809 | 0.4835 | 0.4824 | 0.4531 |
| | EL | 0.4312 | 0.4694 | 0.4682 | 0.4816 | 0.4847 | 0.4830 | 0.4538 |
| 50 | ESL | 1.8642 | 1.9342 | 1.9575 | 2.0416 | 1.9464 | 1.9488 | 1.9240 |
| | | 2.0468 | 2.1726 | 2.1934 | 2.2682 | 2.2610 | 2.3208 | 2.2627 |
| | EE | 4.2203 | 4.8418 | 5.7262 | 5.9258 | 5.4436 | 6.0240 | 5.1639 |
| | EL | 4.2217 | 4.8446 | 5.7280 | 5.9264 | 5.4475 | 6.0264 | 5.1653 |
| 200 | ESL | 0.4734 | 0.4892 | 0.4957 | 0.5044 | 0.4936 | 0.4978 | 0.4846 |
| | | 0.4902 | 0.5013 | 0.5075 | 0.5184 | 0.5118 | 0.5250 | 0.5129 |
| | EE | 2.3115 | 2.7526 | 3.2385 | 3.5381 | 3.1327 | 3.5504 | 2.9215 |
| | EL | 2.3121 | 2.7532 | 3.2390 | 3.5388 | 3.1331 | 3.5508 | 2.9217 |
| 400 | ESL | 0.0643 | 0.0635 | 0.0696 | 0.0743 | 0.0704 | 0.0725 | 0.0739 |
| | | 0.0713 | 0.0762 | 0.0728 | 0.0801 | 0.0697 | 0.0755 | 0.0784 |
| | EE | 1.4832 | 1.8364 | 2.3734 | 2.4119 | 2.3658 | 2.5706 | 2.1897 |
| | EL | 1.4838 | 1.8370 | 2.3742 | 2.4126 | 2.3664 | 2.5712 | 2.1903 |
| 50 | ESL | 2.2328 | 2.3444 | 2.3727 | 2.4228 | 2.4053 | 2.4264 | 2.4306 |
| | | 2.7892 | 2.8526 | 2.8913 | 2.9726 | 2.9436 | 3.1175 | 2.8328 |
| | EE | 4.6206 | 5.2304 | 5.8304 | 7.4631 | 5.6529 | 6.4336 | 5.9205 |
| | EL | 4.6224 | 5.2120 | 5.8316 | 7.4655 | 5.6542 | 6.4368 | 5.9217 |
| 200 | ESL | 0.5036 | 0.5158 | 0.5525 | 0.5844 | 0.5534 | 0.5812 | 0.5406 |
| | | 0.5142 | 0.5276 | 0.5697 | 0.5982 | 0.5680 | 0.5903 | 0.5534 |
| | EE | 2.5612 | 3.0546 | 3.5274 | 4.9437 | 3.4935 | 4.1178 | 3.6893 |
| | EL | 2.5616 | 3.0550 | 3.5278 | 4.9443 | 3.4940 | 4.1182 | 3.6899 |
| 400 | ESL | 0.0565 | 0.0547 | 0.0585 | 0.0657 | 0.0646 | 0.0621 | 0.0633 |
| | | 0.0720 | 0.0755 | 0.0718 | 0.0762 | 0.0722 | 0.0719 | 0.0743 |
| | EE | 1.5764 | 1.8035 | 2.4832 | 2.7761 | 2.5167 | 2.6839 | 2.3106 |
| | EL | 1.5772 | 1.8043 | 2.4838 | 2.7767 | 2.5171 | 2.6842 | 2.3110 |
| n | Method | RASE | RASE | RASE |
|---|--------|------|------|------|
| 50 | ESL | 0.3647 | 0.2281 | 0.3872 |
| | | 0.3893 | 0.2463 | 0.3969 |
| | EE | 0.3854 | 0.2358 | 0.3905 |
| | EL | 0.3872 | 0.2364 | 0.3916 |
| 200 | ESL | 0.0941 | 0.0915 | 0.1030 |
| | | 0.1083 | 0.1041 | 0.1152 |
| | EE | 0.1034 | 0.0967 | 0.1096 |
| | EL | 0.1038 | 0.0969 | 0.1098 |
| 400 | ESL | 0.0323 | 0.0304 | 0.0357 |
| | | 0.0447 | 0.0428 | 0.0483 |
| | EE | 0.0333 | 0.0314 | 0.0446 |
| | EL | 0.0339 | 0.0320 | 0.0448 |
| 50 | ESL | 0.4028 | 0.3884 | 0.3916 |
| | | 0.4264 | 0.4152 | 0.4237 |
| | EE | 1.6802 | 1.4716 | 1.7231 |
| | EL | 1.6826 | 1.4738 | 1.7242 |
| 200 | ESL | 0.1052 | 0.1056 | 0.1104 |
| | | 0.1188 | 0.1170 | 0.1219 |
| | EE | 0.7308 | 0.6131 | 0.7463 |
| | EL | 0.7314 | 0.6138 | 0.7469 |
| 400 | ESL | 0.0366 | 0.0340 | 0.0485 |
| | | 0.0492 | 0.0476 | 0.0513 |
| | EE | 0.4120 | 0.3493 | 0.5042 |
| | EL | 0.4124 | 0.3495 | 0.5048 |
| 50 | ESL | 0.4356 | 0.3751 | 0.3938 |
| | | 0.4682 | 0.4065 | 0.4175 |
| | EE | 1.5145 | 1.4127 | 1.6127 |
| | EL | 1.5163 | 1.4203 | 1.6343 |
| 200 | ESL | 0.1102 | 0.1045 | 0.1146 |
| | | 0.1228 | 0.1137 | 0.1201 |
| | EE | 0.7089 | 0.6715 | 0.7141 |
| | EL | 0.7093 | 0.6719 | 0.7147 |
| 400 | ESL | 0.0379 | 0.0384 | 0.0361 |
| | | 0.0483 | 0.0496 | 0.0489 |
| | EE | 0.3747 | 0.3572 | 0.5387 |
| | EL | 0.3751 | 0.3577 | 0.5393 |
| n | Method | NC | NIC | PC | | | |
|---|--------|----|-----|----|---|---|---|
| 50 | ESL-SCAD | 4.850 | 0 | 0.948 | 0.2351 | 0.2306 | 0.2412 |
| | LAD-SCAD | 4.820 | 0 | 0.942 | 0.2364 | 0.2318 | 0.2430 |
| | LSE-SCAD | 4.865 | 0 | 0.950 | 0.2336 | 0.2140 | 0.2384 |
| | EE-SCAD | 4.855 | 0 | 0.944 | 0.2340 | 0.2153 | 0.2396 |
| 200 | ESL-SCAD | 4.945 | 0 | 0.962 | 0.1143 | 0.1132 | 0.1228 |
| | LAD-SCAD | 4.940 | 0 | 0.960 | 0.1187 | 0.1145 | 0.1256 |
| | LSE-SCAD | 4.955 | 0 | 0.970 | 0.1132 | 0.1065 | 0.1182 |
| | EE-SCAD | 4.950 | 0 | 0.965 | 0.1138 | 0.1071 | 0.1194 |
| 400 | ESL-SCAD | 5.000 | 0 | 1.000 | 0.0467 | 0.0432 | 0.0556 |
| | LAD-SCAD | 5.000 | 0 | 1.000 | 0.0545 | 0.0526 | 0.0581 |
| | LSE-SCAD | 5.000 | 0 | 1.000 | 0.0423 | 0.0404 | 0.0536 |
| | EE-SCAD | 5.000 | 0 | 1.000 | 0.0429 | 0.0408 | 0.0540 |
| 50 | ESL-SCAD | 4.924 | 0.006 | 0.948 | 0.2262 | 0.2250 | 0.2333 |
| | LAD-SCAD | 4.916 | 0.009 | 0.922 | 0.2350 | 0.2344 | 0.2475 |
| | LSE-SCAD | 3.503 | 0.178 | 0.594 | 0.9616 | 0.9826 | 0.9688 |
| | EE-SCAD | 3.524 | 0.190 | 0.589 | 0.9624 | 0.9856 | 0.9723 |
| 200 | ESL-SCAD | 4.946 | 0.003 | 0.962 | 0.1128 | 0.1117 | 0.1253 |
| | LAD-SCAD | 4.930 | 0.005 | 0.950 | 0.1290 | 0.1274 | 0.1321 |
| | LSE-SCAD | 3.765 | 0.160 | 0.690 | 0.7398 | 0.7015 | 0.7547 |
| | EE-SCAD | 3.775 | 0.175 | 0.675 | 0.7412 | 0.7035 | 0.7569 |
| 400 | ESL-SCAD | 4.998 | 0 | 0.998 | 0.0466 | 0.0432 | 0.0585 |
| | LAD-SCAD | 4.990 | 0 | 0.995 | 0.0598 | 0.0584 | 0.0619 |
| | LSE-SCAD | 4.215 | 0.105 | 0.750 | 0.4206 | 0.3573 | 0.5134 |
| | EE-SCAD | 4.190 | 0.110 | 0.735 | 0.4224 | 0.3597 | 0.5148 |
| 50 | ESL-SCAD | 4.895 | 0 | 0.930 | 0.2268 | 0.2150 | 0.2269 |
| | LAD-SCAD | 4.880 | 0 | 0.922 | 0.2440 | 0.2312 | 0.2453 |
| | LSE-SCAD | 3.425 | 0.175 | 0.546 | 0.9367 | 0.9536 | 0.9516 |
| | EE-SCAD | 3.440 | 0.190 | 0.536 | 0.9435 | 0.9557 | 0.9535 |
| 200 | ESL-SCAD | 4.940 | 0 | 0.955 | 0.1129 | 0.1063 | 0.1147 |
| | LAD-SCAD | 4.935 | 0 | 0.950 | 0.1335 | 0.1241 | 0.1305 |
| | LSE-SCAD | 3.805 | 0.155 | 0.685 | 0.7151 | 0.6803 | 0.7233 |
| | EE-SCAD | 3.815 | 0.165 | 0.660 | 0.7193 | 0.6819 | 0.7245 |
| 400 | ESL-SCAD | 4.997 | 0 | 1.000 | 0.0414 | 0.0563 | 0.0467 |
| | LAD-SCAD | 4.995 | 0 | 1.000 | 0.0587 | 0.0601 | 0.0595 |
| | LSE-SCAD | 4.355 | 0.090 | 0.840 | 0.3823 | 0.3634 | 0.5467 |
| | EE-SCAD | 4.275 | 0.105 | 0.755 | 0.3851 | 0.3676 | 0.5493 |
| n | | | |
|---|---|---|---|
| 50 | 0.5609 | 0.6491 | 0.7832 |
| 200 | 0.7326 | 0.8169 | 0.8664 |
| 400 | 0.9528 | 0.9868 | 1.0610 |
| n | | | |
|---|---|---|---|
| 50 | 0.8702 | 0.9564 | 1.1062 |
| 200 | 1.1470 | 1.2235 | 1.3598 |
| 400 | 1.3682 | 1.4373 | 1.6476 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Song, Y.; Liu, Y.; Su, H. Robust Variable Selection for Single-Index Varying-Coefficient Model with Missing Data in Covariates. Mathematics 2022, 10, 2003. https://doi.org/10.3390/math10122003