Bayesian Inference for Finite Mixture Regression Model Based on Non-Iterative Algorithm
Abstract
1. Introduction
2. Finite Mixture Normal Regression (FMNR) Model
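For orientation, a K-component FMNR model specifies the conditional density of the response as a mixture of normal linear regressions; in standard notation,

$$ f(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}) \;=\; \sum_{k=1}^{K} \pi_k\, \phi\!\left(y_i;\; \mathbf{x}_i^{\top}\boldsymbol{\beta}_k,\; \sigma_k^2\right), \qquad \pi_k > 0,\; \sum_{k=1}^{K} \pi_k = 1, $$

where $\phi(\cdot\,;\mu,\sigma^2)$ denotes the normal density, $\boldsymbol{\beta}_k$ and $\sigma_k^2$ are the component-specific regression coefficients and variance, and $\boldsymbol{\pi}=(\pi_1,\dots,\pi_K)$ collects the mixing proportions.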
3. Bayesian Inference Using IBF Algorithm
3.1. The Prior and Conditional Distributions
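The paper's exact hyperparameter settings are given in this subsection; as a working assumption (not taken from the paper), the usual conjugate specification for an FMNR model takes

$$ \boldsymbol{\pi} \sim \mathrm{Dirichlet}(\alpha_1,\dots,\alpha_K), \qquad \boldsymbol{\beta}_k \sim N(\mathbf{b}_{0k}, \mathbf{B}_{0k}), \qquad \sigma_k^2 \sim \mathrm{IG}(c_{0k}, d_{0k}), $$

under which, given the latent component labels, the complete-data conditionals are again Dirichlet, multivariate normal, and inverse gamma; this is the conditional structure that both the Gibbs sampler and the IBF sampler exploit.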
3.2. IBF Sampler
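In the scheme of Tan, Tian and Ng (2003), the IBF sampler turns EM output into i.i.d. posterior draws without iteration: run EM for the posterior mode $\hat{\theta}$, draw latent data $Z^{(j)}$ from $f(Z \mid y, \hat{\theta})$, weight each draw by $1/f(\hat{\theta} \mid y, Z^{(j)})$ (the sampling inverse Bayes formula), resample with these weights (SIR), and finally draw $\theta$ from the complete-data posterior $f(\theta \mid y, Z)$. The sketch below illustrates the scheme on a deliberately simple toy model, a two-component normal mixture with known component means and a Beta prior on the mixing weight; it is not the authors' code, and all names are illustrative.

```r
## Minimal sketch of the IBF sampler on a toy two-component normal mixture
## with known component means and a Beta(a, b) prior on the mixing weight pi.
set.seed(1)
n  <- 200
mu <- c(-2, 2)                                   # known component means
z0 <- rbinom(n, 1, 0.3)                          # true latent labels
y  <- rnorm(n, ifelse(z0 == 1, mu[1], mu[2]))
a  <- b <- 1                                     # Beta(a, b) prior on pi

## Step 1: EM for the posterior mode of pi.
pi.hat <- 0.5
for (it in 1:200) {
  p <- pi.hat * dnorm(y, mu[1]) /
       (pi.hat * dnorm(y, mu[1]) + (1 - pi.hat) * dnorm(y, mu[2]))
  pi.hat <- (a - 1 + sum(p)) / (a + b - 2 + n)   # mode of the Beta complete-data posterior
}

## Step 2: draw J latent label vectors from f(z | y, pi.hat).
J <- 5000
p <- pi.hat * dnorm(y, mu[1]) /
     (pi.hat * dnorm(y, mu[1]) + (1 - pi.hat) * dnorm(y, mu[2]))
Z <- matrix(rbinom(J * n, 1, rep(p, each = J)), nrow = J)

## Step 3: sampling-IBF weights, w_j proportional to 1 / f(pi.hat | y, z^(j)),
## where f(pi | y, z) is Beta(a + sum(z), b + n - sum(z)).
s <- rowSums(Z)
w <- 1 / dbeta(pi.hat, a + s, b + n - s)
w <- w / sum(w)

## Step 4 (SIR): resample label vectors by weight, then draw pi exactly from
## its complete-data posterior; the draws are i.i.d., so no burn-in is needed.
idx    <- sample(J, 2000, replace = TRUE, prob = w)
pi.sim <- rbeta(2000, a + s[idx], b + n - s[idx])
c(mean = mean(pi.sim), sd = sd(pi.sim))
```

Resampling without replacement is often recommended for SIR when J is large relative to the number of retained draws; resampling with replacement is used above for simplicity.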
4. Results
4.1. Simulation Studies
The error terms are generated from four distributions (a data-generation sketch in R follows the list):

1. Normal distribution;
2. Student's t distribution, t(3);
3. Laplace distribution, Laplace(0, 1);
4. Logistic distribution, Logistic(0, 1).
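Below is a sketch of the data-generating mechanism suggested by the simulation tables: a two-component mixture of linear regressions with mixing weight 0.5 and component coefficients (5, −5) and (−5, 5). The intercept/slope assignment and the covariate law are assumptions for illustration, not taken from the paper.

```r
## One simulated data set from the assumed two-component FMNR design.
gen.fmnr <- function(n, err = c("normal", "t3", "laplace", "logistic")) {
  err <- match.arg(err)
  e <- switch(err,
    normal   = rnorm(n),
    t3       = rt(n, df = 3),                        # Student's t(3)
    laplace  = sample(c(-1, 1), n, TRUE) * rexp(n),  # Laplace(0, 1)
    logistic = rlogis(n))                            # Logistic(0, 1)
  x <- runif(n, -1, 1)                               # assumed covariate law
  z <- rbinom(n, 1, 0.5)                             # latent component labels
  y <- ifelse(z == 1, 5 - 5 * x, -5 + 5 * x) + e
  data.frame(x = x, y = y)
}
dat <- gen.fmnr(300, "t3")                           # one replication, n = 300
```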
Two conclusions can be drawn from the simulation results:

1. All three algorithms recover the true parameters successfully: most of the means over the 200 replications are very close to the corresponding true parameter values, and the MSEs and MADs are small. Moreover, almost all coverage probabilities lie around 0.95, the nominal confidence level. As the sample size increases from 100 to 300, the MSEs and MADs decrease and settle within a narrow range (the sketch after this list shows how these metrics are computed from the replications);
2. In general, for a fixed sample size, the MSEs and MADs of the IBF algorithm are smaller than those of the EM algorithm and Gibbs sampling, showing that the IBF sampler produces more accurate estimates.
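For concreteness, the table entries can be computed from the R = 200 replications as follows, where est[r] is the point estimate of one parameter in replication r, (lo[r], hi[r]) its 95% interval, and theta0 the true value; a sketch, not the authors' code.

```r
## Summary metrics for one parameter across replications.
summarize <- function(est, lo, hi, theta0) {
  c(Mean = mean(est),
    MSE  = mean((est - theta0)^2),              # mean squared error
    MAD  = mean(abs(est - theta0)),             # mean absolute deviation
    CP   = mean(lo <= theta0 & theta0 <= hi))   # empirical coverage probability
}
```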
4.2. Real Data Analysis
4.3. Algorithm Selection
5. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
**Simulation results of the IBF algorithm over 200 replications (left block: n = 100; right block: n = 300).**

| Distribution | Parameter (true value) | Mean | MSE | MAD | CP | Mean | MSE | MAD | CP |
|---|---|---|---|---|---|---|---|---|---|
| Normal | 5 | 5.0105 | 0.0225 | 0.1160 | 0.93 | 4.9971 | 0.0069 | 0.0664 | 0.96 |
| | −5 | −5.0000 | 0.0748 | 0.2202 | 0.93 | −4.9983 | 0.0250 | 0.1228 | 0.94 |
| | −5 | −4.9940 | 0.0214 | 0.1171 | 0.965 | −5.0061 | 0.0071 | 0.0662 | 0.93 |
| | 5 | 4.9949 | 0.0685 | 0.2053 | 0.940 | 5.0240 | 0.0230 | 0.1209 | 0.95 |
| | 0.5 | 0.4937 | 0.0025 | 0.0411 | 0.965 | 0.4968 | 0.0008 | 0.0230 | 0.94 |
| t(3) | 5 | 5.0110 | 0.0574 | 0.1884 | 0.96 | 5.0206 | 0.0184 | 0.1100 | 0.96 |
| | −5 | −5.0290 | 0.2251 | 0.3545 | 0.91 | −5.0501 | 0.0648 | 0.1975 | 0.955 |
| | −5 | −4.9817 | 0.0549 | 0.1785 | 0.96 | −5.0056 | 0.0165 | 0.1048 | 0.985 |
| | 5 | 5.0243 | 0.2083 | 0.3496 | 0.95 | 5.0348 | 0.0852 | 0.2284 | 0.92 |
| | 0.5 | 0.4973 | 0.0032 | 0.0401 | 0.95 | 0.5005 | 0.0010 | 0.0262 | 0.93 |
| Logistic(0,1) | 5 | 4.9907 | 0.0782 | 0.2215 | 0.945 | 5.0064 | 0.0327 | 0.1443 | 0.940 |
| | −5 | −4.9812 | 0.2347 | 0.3890 | 0.905 | −4.9892 | 0.1026 | 0.2491 | 0.905 |
| | −5 | −5.0054 | 0.0752 | 0.2187 | 0.955 | −5.0053 | 0.0314 | 0.1428 | 0.915 |
| | 5 | 4.9892 | 0.2388 | 0.3838 | 0.915 | 4.9810 | 0.0890 | 0.2308 | 0.920 |
| | 0.5 | 0.4917 | 0.0035 | 0.0477 | 0.92 | 0.4972 | 0.0009 | 0.0248 | 0.955 |
| Laplace(0,1) | 5 | 5.0182 | 0.0420 | 0.1631 | 0.95 | 5.0119 | 0.0138 | 0.0950 | 0.97 |
| | −5 | −4.9800 | 0.1385 | 0.2799 | 0.93 | −5.0029 | 0.0540 | 0.1856 | 0.935 |
| | −5 | −5.0161 | 0.0475 | 0.1644 | 0.915 | −5.0133 | 0.0160 | 0.0986 | 0.925 |
| | 5 | 4.9743 | 0.1691 | 0.3163 | 0.925 | 5.0014 | 0.0428 | 0.1705 | 0.950 |
| | 0.5 | 0.4996 | 0.0021 | 0.0429 | 0.95 | 0.4992 | 0.0010 | 0.0231 | 0.94 |
**Simulation results of the Gibbs sampler over 200 replications (left block: n = 100; right block: n = 300).**

| Distribution | Parameter (true value) | Mean | MSE | MAD | CP | Mean | MSE | MAD | CP |
|---|---|---|---|---|---|---|---|---|---|
| Normal | 5 | 5.0060 | 0.0228 | 0.1213 | 0.955 | 5.0031 | 0.0076 | 0.0707 | 0.96 |
| | −5 | −5.0047 | 0.0871 | 0.2381 | 0.940 | −5.0024 | 0.0313 | 0.1413 | 0.93 |
| | −5 | −5.0000 | 0.0268 | 0.1311 | 0.945 | −4.9943 | 0.0073 | 0.0683 | 0.95 |
| | 5 | 5.0356 | 0.0810 | 0.2272 | 0.950 | 5.0079 | 0.0265 | 0.1281 | 0.94 |
| | 0.5 | 0.4989 | 0.0024 | 0.0392 | 0.965 | 0.5008 | 0.0008 | 0.0238 | 0.955 |
| t(3) | 5 | 4.9670 | 0.0732 | 0.1919 | 0.955 | 4.9724 | 0.0616 | 0.1909 | 0.975 |
| | −5 | −5.0424 | 0.2138 | 0.3486 | 0.955 | −5.0850 | 0.1967 | 0.3215 | 0.960 |
| | −5 | −4.9766 | 0.0789 | 0.2105 | 0.965 | −4.9761 | 0.0664 | 0.1928 | 0.95 |
| | 5 | 5.1234 | 0.2851 | 0.4129 | 0.955 | 5.0654 | 0.2010 | 0.3415 | 0.935 |
| | 0.5 | 0.5001 | 0.0032 | 0.0449 | 0.92 | 0.5006 | 0.0027 | 0.0408 | 0.955 |
| Logistic(0,1) | 5 | 4.9644 | 0.0861 | 0.2355 | 0.960 | 4.9937 | 0.0246 | 0.1264 | 0.960 |
| | −5 | −5.0682 | 0.3350 | 0.4700 | 0.935 | −5.0430 | 0.0872 | 0.2344 | 0.925 |
| | −5 | −4.9775 | 0.0894 | 0.2444 | 0.96 | −4.9958 | 0.0367 | 0.1483 | 0.915 |
| | 5 | 5.0036 | 0.2960 | 0.4289 | 0.955 | 4.9960 | 0.0979 | 0.2489 | 0.945 |
| | 0.5 | 0.4983 | 0.0031 | 0.0437 | 0.93 | 0.50497 | 0.0009 | 0.0244 | 0.97 |
| Laplace(0,1) | 5 | 5.0046 | 0.0447 | 0.1645 | 0.950 | 5.0094 | 0.0145 | 0.0953 | 0.965 |
| | −5 | −5.0263 | 0.1370 | 0.2893 | 0.975 | −5.0065 | 0.0598 | 0.1921 | 0.940 |
| | −5 | −4.9849 | 0.0395 | 0.1561 | 0.975 | −5.0019 | 0.0137 | 0.0940 | 0.975 |
| | 5 | 5.0549 | 0.1705 | 0.3376 | 0.950 | 4.9823 | 0.0489 | 0.1717 | 0.950 |
| | 0.5 | 0.5050 | 0.0028 | 0.0438 | 0.96 | 0.4983 | 0.0009 | 0.0237 | 0.955 |
**Simulation results of the EM algorithm over 200 replications (left block: n = 100; right block: n = 300).**

| Distribution | Parameter (true value) | Mean | MSE | MAD | CP | Mean | MSE | MAD | CP |
|---|---|---|---|---|---|---|---|---|---|
| Normal | 5 | 4.9921 | 0.0226 | 0.1199 | 0.930 | 5.0195 | 0.0278 | 0.1346 | 0.955 |
| | −5 | −4.9789 | 0.0753 | 0.2203 | 0.935 | −4.9964 | 0.0227 | 0.1214 | 0.960 |
| | −5 | −5.0097 | 0.0243 | 0.1244 | 0.93 | −4.9959 | 0.0078 | 0.0684 | 0.935 |
| | 5 | 4.9887 | 0.0690 | 0.2045 | 0.95 | 5.0009 | 0.0236 | 0.1246 | 0.960 |
| | 0.5 | 0.5020 | 0.0024 | 0.0386 | 0.945 | 0.4984 | 0.0008 | 0.0241 | 0.955 |
| t(3) | 5 | 4.9919 | 0.0569 | 0.1889 | 0.92 | 5.0245 | 0.0206 | 0.1184 | 0.945 |
| | −5 | −4.9531 | 0.2392 | 0.3826 | 0.89 | −0.5033 | 0.0837 | 0.2219 | 0.905 |
| | −5 | −4.9900 | 0.0694 | 0.2001 | 0.915 | −4.9987 | 0.0203 | 0.1086 | 0.935 |
| | 5 | 5.0178 | 0.2087 | 0.3505 | 0.920 | 5.0320 | 0.0876 | 0.2336 | 0.915 |
| | 0.5 | 0.4970 | 0.0032 | 0.0449 | 0.925 | 0.4972 | 0.0010 | 0.0261 | 0.96 |
| Logistic(0,1) | 5 | 5.0291 | 0.0786 | 0.2229 | 0.925 | 5.0195 | 0.0278 | 0.1346 | 0.955 |
| | −5 | −4.9373 | 0.2377 | 0.3913 | 0.945 | −5.0232 | 0.0992 | 0.2514 | 0.915 |
| | −5 | −5.0034 | 0.0791 | 0.2233 | 0.925 | −5.0061 | 0.0276 | 0.1310 | 0.950 |
| | 5 | 4.9162 | 0.2440 | 0.3903 | 0.940 | 4.9769 | 0.0901 | 0.2316 | 0.935 |
| | 0.5 | 0.4962 | 0.0035 | 0.0487 | 0.95 | 0.5026 | 0.0009 | 0.0247 | 0.965 |
| Laplace(0,1) | 5 | 4.9928 | 0.0425 | 0.1638 | 0.940 | 4.9982 | 0.0177 | 0.1076 | 0.935 |
| | −5 | −4.9637 | 0.1714 | 0.3282 | 0.895 | −4.9834 | 0.0459 | 0.1749 | 0.970 |
| | −5 | −5.0185 | 0.0486 | 0.1697 | 0.895 | −5.0125 | 0.0133 | 0.0926 | 0.965 |
| | 5 | 5.0064 | 0.1831 | 0.3334 | 0.925 | 4.9880 | 0.0525 | 0.1795 | 0.940 |
| | 0.5 | 0.4987 | 0.0031 | 0.0445 | 0.95 | 0.4957 | 0.0009 | 0.0255 | 0.955 |
**Estimates (Mean) and standard deviations (Sd) of the model parameters for the real data under the three algorithms.**

| Algorithm | Statistic | Estimates | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| EM | Mean | 1.9163 | 0.0425 | −0.0192 | 0.9922 | 0.0021 | 0.0176 | 0.6977 | 0.3022 |
| | Sd | 0.0226 | 0.0102 | 0.1021 | 0.0441 | 0.0003 | 0.0041 | 0.0484 | 0.0484 |
| IBF | Mean | 1.9162 | 0.0426 | −0.0211 | 0.9930 | 0.0196 | 0.0022 | 0.6951 | 0.3048 |
| | Sd | 0.0228 | 0.0103 | 0.1072 | 0.0462 | 0.0048 | 0.0003 | 0.0444 | 0.0444 |
| Gibbs | Mean | 1.9162 | 0.0427 | −0.0198 | 0.9921 | 0.0202 | 0.0022 | 0.6983 | 0.3017 |
| | Sd | 0.0233 | 0.0105 | 0.1124 | 0.0481 | 0.0055 | 0.0003 | 0.0475 | 0.0475 |
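The fitted components, one nearly flat line with intercept about 1.92 and one close to the identity line, are characteristic of Cohen's (1980) tone-perception data. Assuming that is the data set analyzed here (it ships with the R package mixtools as `tonedata`), the EM baseline can be reproduced along the following lines:

```r
## Hypothetical reproduction of the EM fit, assuming the real data are the
## tone-perception data of Cohen (1980) shipped with mixtools.
library(mixtools)
data(tonedata)
set.seed(100)
fit <- regmixEM(tonedata$tuned, tonedata$stretchratio, k = 2)
summary(fit)   # component betas, sigmas and mixing proportions lambda
```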
**DIC, AIC, and BIC for the IBF and Gibbs algorithms.**

| Algorithm | DIC | AIC | BIC |
|---|---|---|---|
| IBF | −268.9490 | −259.5399 | −235.4548 |
| Gibbs | −268.4472 | −259.2297 | −235.1446 |
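For reference, DIC (Spiegelhalter et al., 2002) can be computed directly from posterior draws, which both samplers provide. The sketch below uses a placeholder log-likelihood `loglik` and a matrix `theta.mat` of draws (one row per draw); both names are assumptions for illustration, not the authors' code. Since smaller values are better for all three criteria, the table gives a slight edge to the IBF sampler.

```r
## DIC = Dbar + pD, where D(theta) = -2 log L(theta | y), Dbar is its posterior
## mean, and pD = Dbar - D(theta.bar) penalizes effective model complexity.
dic <- function(theta.mat, loglik) {
  D    <- apply(theta.mat, 1, function(th) -2 * loglik(th))
  Dbar <- mean(D)
  Dhat <- -2 * loglik(colMeans(theta.mat))     # deviance at the posterior mean
  c(DIC = Dbar + (Dbar - Dhat), pD = Dbar - Dhat)
}
## AIC and BIC use a point estimate theta.hat with p parameters, n observations:
aic <- function(theta.hat, loglik, p)    -2 * loglik(theta.hat) + 2 * p
bic <- function(theta.hat, loglik, p, n) -2 * loglik(theta.hat) + p * log(n)
```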