Feature Selection in Machine Learning for Perovskite Materials Design and Discovery
Abstract
1. Introduction
2. Workflow of Materials Machine Learning
3. The Structure and Features of Perovskite
3.1. Inorganic Perovskites
3.2. Hybrid Organic-Inorganic Perovskites
3.3. Double Perovskites
4. The Methods of Feature Selection
4.1. Filter
4.2. Wrapper
4.3. Embedded
5. Feature Selection in Machine Learning for Perovskite Materials
5.1. Feature Selection for Inorganic Perovskites
5.2. Feature Selection for Hybrid Organic-Inorganic Perovskites
5.3. Feature Selection for Double Perovskites
6. Conclusions and Outlook
- (1) Establishment and improvement of perovskite materials databases: Data are the ‘hardware’ of ML, and the quantity and quality of the data are the keys to model performance. Compared with other fields, data in the materials field are typically small in size and drawn from multiple sources: in a large proportion of materials research articles, the sample size is below 1000 or even below 500. For perovskite materials, a dedicated database platform could be established to collect data on material properties and perovskite device parameters, and made available in a form that adheres to the FAIR (findable, accessible, interoperable, and reusable) data principles;
- (2) Descriptor construction and sharing: To maximize model accuracy and to avoid ML results that contradict domain expertise, descriptors can be constructed manually by drawing on materials domain knowledge. For researchers outside the field, new descriptors can instead be constructed automatically with methods such as SISSO and symbolic regression. In addition, to break down professional barriers between fields and further promote materials discovery and design, an online platform of descriptors corresponding to the database should be established, so that specialists can concentrate on their own areas of expertise, which improves the prospects for breakthroughs in material properties. Taking perovskite thin films as an example [62,63,64,65,66], we encourage researchers to record more detailed process parameters for preparing high-quality thin films in their manuscripts and to build a corresponding database of process parameters. Suitable feature selection methods could then be applied to this database to identify the key parameters affecting film quality, and an ML model relating process parameters to film quality could be constructed, offering the possibility of accelerating process optimization and guiding the experimental synthesis of high-quality thin films;
- (3) Evaluation and development of feature selection methods: In applications of the materials ML workflow, researchers have mostly just reported which feature selection methods were used and then built and selected models on the resulting feature subsets. The choice of method should, in essence, serve the data at hand, and different selection methods yield different feature subsets, so evaluating and comparing feature selection methods in conjunction with ML algorithms is itself an important topic. New feature selection methods tailored to materials data can also be developed. Based on practical experience, the ensemble idea can be used to develop ensemble feature selection methods for materials data, which can improve the stability of the selected feature subsets and hence their generality.
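The ensemble idea in item (3) can be illustrated with a minimal sketch. It is not the method of any work cited here. It assumes a numeric feature matrix `X` and a continuous target `y`. The function names `score_features` and `ensemble_rank`, the four selectors (two filter-type, two embedded-type), the 80% subsampling, and mean-rank aggregation are all illustrative choices built on scikit-learn and SciPy.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler


def score_features(X, y, seed=0):
    """Return one relevance vector per selector (higher means more relevant)."""
    Xs = StandardScaler().fit_transform(X)
    n_features = Xs.shape[1]
    # Filter: absolute Pearson correlation between each feature and the target.
    pearson = np.abs([np.corrcoef(Xs[:, j], y)[0, 1] for j in range(n_features)])
    # Filter: mutual information, which also captures nonlinear dependence.
    mi = mutual_info_regression(Xs, y, random_state=seed)
    # Embedded: magnitude of L1-regularized linear coefficients.
    lasso = np.abs(LassoCV(cv=5, random_state=seed).fit(Xs, y).coef_)
    # Embedded: impurity-based importances from a random forest.
    forest = RandomForestRegressor(n_estimators=50, random_state=seed)
    rf = forest.fit(Xs, y).feature_importances_
    return [pearson, mi, lasso, rf]


def ensemble_rank(X, y, n_rounds=5, frac=0.8, seed=0):
    """Average feature ranks over several selectors and data subsamples.

    Rank 1 is the most relevant feature. Averaging over subsamples is one
    simple way to make the resulting subset less sensitive to the data split.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    size = int(frac * n_samples)
    ranks = []
    for r in range(n_rounds):
        idx = rng.choice(n_samples, size=size, replace=False)
        for scores in score_features(X[idx], y[idx], seed=seed + r):
            ranks.append(rankdata(-scores, method="average"))
    return np.mean(ranks, axis=0)


if __name__ == "__main__":
    # Synthetic stand-in for a descriptor table: only the first three
    # columns influence the target.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 10))
    y = 3.0 * X[:, 0] - 2.5 * X[:, 1] + 2.0 * X[:, 2] + 0.5 * rng.normal(size=200)
    mean_rank = ensemble_rank(X, y)
    print("features ordered by mean rank:", np.argsort(mean_rank))
```

To evaluate selection methods in conjunction with ML algorithms, the same subsets could be passed to a cross-validated model and compared on a held-out metric. In that case the selection step should be refitted inside each fold, so that the test data do not influence the choice of features.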
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Jordan, M.I.; Mitchell, T.M. Machine learning: Trends, perspectives, and prospects. Science 2015, 349, 255–260.
- Shehab, M.; Abualigah, L.; Shambour, Q.; Abu-Hashem, M.A.; Shambour, M.K.Y.; Alsalibi, A.I.; Gandomi, A.H. Machine learning in medical applications: A review of state-of-the-art methods. Comput. Biol. Med. 2022, 145, 105458.
- Henrique, B.M.; Sobreiro, V.A.; Kimura, H. Literature review: Machine learning techniques applied to financial market prediction. Expert Syst. Appl. 2019, 124, 226–251.
- Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine Learning in Agriculture: A Review. Sensors 2018, 18, 2674.
- Larranaga, P.; Calvo, B.; Santana, R.; Bielza, C.; Galdiano, J.; Inza, I.; Lozano, J.A.; Armananzas, R.; Santafe, G.; Perez, A.; et al. Machine learning in bioinformatics. Brief. Bioinform. 2006, 7, 86–112.
- Butler, K.T.; Davies, D.W.; Cartwright, H.; Isayev, O.; Walsh, A. Machine learning for molecular and materials science. Nature 2018, 559, 547–555.
- Schmidt, J.; Marques, M.R.G.; Botti, S.; Marques, M.A.L. Recent advances and applications of machine learning in solid-state materials science. NPJ Comput. Mater. 2019, 5, 83.
- Tao, Q.; Xu, P.; Li, M.; Lu, W. Machine learning for perovskite materials design and discovery. NPJ Comput. Mater. 2021, 7, 23.
- Min, K.; Cho, E. Accelerated discovery of potential ferroelectric perovskite via active learning. J. Mater. Chem. C 2020, 8, 7866–7872.
- Gok, E.C.; Yildirim, M.O.; Haris, M.P.U.; Eren, E.; Pegu, M.; Hemasiri, N.H.; Huang, P.; Kazim, S.; Uygun Oksuz, A.; Ahmad, S. Predicting Perovskite Bandgap and Solar Cell Performance with Machine Learning. Sol. RRL 2021, 6, 2100927.
- Yin, W.-J.; Weng, B.; Ge, J.; Sun, Q.; Li, Z.; Yan, Y. Oxide perovskites, double perovskites and derivatives for electrocatalysis, photocatalysis, and photovoltaics. Energy Environ. Sci. 2019, 12, 442–462.
- Talapatra, A.; Uberuaga, B.P.; Stanek, C.R.; Pilania, G. A Machine Learning Approach for the Prediction of Formability and Thermodynamic Stability of Single and Double Perovskite Oxides. Chem. Mater. 2021, 33, 845–858.
- Xu, P.; Chang, D.; Lu, T.; Li, L.; Li, M.; Lu, W. Search for ABO3 Type Ferroelectric Perovskites with Targeted Multi-Properties by Machine Learning Strategies. J. Chem. Inf. Model. 2022, 62, 5038–5049.
- Yang, X.; Li, L.; Tao, Q.; Lu, W.; Li, M. Rapid discovery of narrow bandgap oxide double perovskites using machine learning. Comput. Mater. Sci. 2021, 196, 110528.
- Tao, Q.; Chang, D.; Lu, T.; Li, L.; Chen, H.; Yang, X.; Liu, X.; Li, M.; Lu, W. Multiobjective Stepwise Design Strategy-Assisted Design of High-Performance Perovskite Oxide Photocatalysts. J. Phys. Chem. C 2021, 125, 21141–21150.
- Liu, Y.; Wu, J.M.; Avdeev, M.; Shi, S.Q. Multi-Layer Feature Selection Incorporating Weighted Score-Based Expert Knowledge toward Modeling Materials with Targeted Properties. Adv. Theory Simul. 2020, 3, 1900215.
- Yao, G.; Hu, X.; Wang, G. A novel ensemble feature selection method by integrating multiple ranking information combined with an SVM ensemble model for enterprise credit risk prediction in the supply chain. Expert Syst. Appl. 2022, 200, 117002.
- Hira, Z.M.; Gillies, D.F. A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data. Adv. Bioinform. 2015, 2015, 198363.
- Zhang, X.; Yu, L.; Yin, H.; Lai, K.K. Integrating data augmentation and hybrid feature selection for small sample credit risk assessment with high dimensionality. Comput. Oper. Res. 2022, 146, 105937.
- Xu, P.; Chen, H.; Li, M.; Lu, W. New Opportunity: Machine Learning for Polymer Materials Design and Discovery. Adv. Theory Simul. 2022, 5, 2100565.
- Zhou, Q.; Lu, S.; Wu, Y.; Wang, J. Property-Oriented Material Design Based on a Data-Driven Machine Learning Technique. J. Phys. Chem. Lett. 2020, 11, 3920–3927.
- Belsky, A.; Hellenbrandt, M.; Karen, V.L.; Luksch, P. New developments in the Inorganic Crystal Structure Database (ICSD): Accessibility in support of materials research and design. Acta Crystallogr. Sect. B-Struct. Sci.Cryst. Eng. Mat. 2002, 58, 364–369.
- Saal, J.E.; Kirklin, S.; Aykol, M.; Meredig, B.; Wolverton, C. Materials Design and Discovery with High-Throughput Density Functional Theory: The Open Quantum Materials Database (OQMD). JOM 2013, 65, 1501–1509.
- Jain, A.; Ong, S.P.; Hautier, G.; Chen, W.; Richards, W.D.; Dacek, S.; Cholia, S.; Gunter, D.; Skinner, D.; Ceder, G.; et al. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. 2013, 1, 011002.
- Dong, Y.; Zhang, Y.; Ran, M.; Zhang, X.; Liu, S.; Yang, Y.; Hu, W.; Zheng, C.; Gao, X. Accelerated identification of high-performance catalysts for low-temperature NH3-SCR by machine learning. J. Mater. Chem. A 2021, 9, 23850–23859.
- Lu, T.; Li, H.; Li, M.; Wang, S.; Lu, W. Predicting Experimental Formability of Hybrid Organic-Inorganic Perovskites via Imbalanced Learning. J. Phys. Chem. Lett. 2022, 13, 3032–3038.
- Ouyang, R.; Curtarolo, S.; Ahmetcik, E.; Scheffler, M.; Ghiringhelli, L.M. SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys. Rev. Mater. 2018, 2, 083802.
- Liu, S.; Wang, J.; Duan, Z.; Wang, K.; Zhang, W.; Guo, R.; Xie, F. Simple Structural Descriptor Obtained from Symbolic Classification for Predicting the Oxygen Vacancy Defect Formation of Perovskites. ACS Appl. Mater. Interfaces 2022, 14, 11758–11767.
- Mai, J.; Lu, T.; Xu, P.; Lian, Z.; Li, M.; Lu, W. Predicting the maximum absorption wavelength of azo dyes using an interpretable machine learning strategy. Dyes Pigment. 2022, 206, 110647.
- Tao, Q.; Lu, T.; Sheng, Y.; Li, L.; Lu, W.; Li, M. Machine learning aided design of perovskite oxide materials for photocatalytic water splitting. J. Energy Chem. 2021, 60, 351–359.
- Lu, T.; Li, H.; Li, M.; Wang, S.; Lu, W. Inverse Design of Hybrid Organic–Inorganic Perovskites with Suitable Bandgaps via Proactive Searching Progress. ACS Omega 2022, 7, 21583–21594.
- Yang, C.; Ren, C.; Jia, Y.; Wang, G.; Li, M.; Lu, W. A machine learning-based alloy design system to facilitate the rational design of high entropy alloys with enhanced hardness. Acta Mater. 2022, 222, 117431.
- Shi, L.; Chang, D.; Ji, X.; Lu, W. Using Data Mining To Search for Perovskite Materials with Higher Specific Surface Area. J. Chem. Inf. Model. 2018, 58, 2420–2427.
- Wang, Y.; Lv, Z.; Zhou, L.; Chen, X.; Chen, J.; Zhou, Y.; Roy, V.A.L.; Han, S.-T. Emerging perovskite materials for high density data storage and artificial synapses. J. Mater. Chem. C 2018, 6, 1600–1617.
- Žužić, A.; Ressler, A.; Macan, J. Perovskite oxides as active materials in novel alternatives to well-known technologies: A review. Ceram. Int. 2022, 48, 27240–27261.
- Tian, W.; Zhou, H.; Li, L. Hybrid Organic-Inorganic Perovskite Photodetectors. Small 2017, 13, 170210.
- Zuo, T.; He, X.; Hu, P.; Jiang, H. Organic-Inorganic Hybrid Perovskite Single Crystals: Crystallization, Molecular Structures, and Bandgap Engineering. ChemNanoMat 2019, 5, 278–289.
- Kumar, A.; Rana, N.K.; Rani, S.; Ghosh, D.S. Toward all-inorganic perovskite solar cells: Materials, performance, and stability. Int. J. Energy Res. 2022, 46, 14659–14695.
- Liang, G.-Q.; Zhang, J. A machine learning model for screening thermodynamic stable lead-free halide double perovskites. Comput. Mater. Sci. 2022, 204, 111172.
- Wang, Z.; Han, Y.; Lin, X.; Cai, J.; Wu, S.; Li, J. An Ensemble Learning Platform for the Large-Scale Exploration of New Double Perovskites. ACS Appl. Mater. Interfaces 2022, 14, 717–725.
- Wang, H.; Zhang, Q.; Qiu, M.; Hu, B. Synthesis and application of perovskite-based photocatalysts in environmental remediation: A review. J. Mol. Liq. 2021, 334, 116029.
- Wang, W.; Tadé, M.O.; Shao, Z. Research progress of perovskite materials in photocatalysis- and photovoltaics-related energy conversion and environmental treatment. Chem. Soc. Rev. 2015, 44, 5371–5408.
- Tai, Q.; Tang, K.-C.; Yan, F. Recent progress of inorganic perovskite solar cells. Energy Environ. Sci. 2019, 12, 2375–2405.
- Liu, X.; Li, J.; Cui, X.; Wang, X.; Yang, D. Strategies for the preparation of high-performance inorganic mixed-halide perovskite solar cells. RSC Adv. 2022, 12, 32925–32948.
- Bartel, C.J.; Sutton, C.; Goldsmith, B.R.; Ouyang, R.; Musgrave, C.B.; Ghiringhelli, L.M.; Scheffler, M. New tolerance factor to predict the stability of perovskite oxides and halides. Sci. Adv. 2019, 5, eaav0693.
- Zhao, J.; Wang, X. Screening Perovskites from ABO3 Combinations Generated by Constraint Satisfaction Techniques Using Machine Learning. ACS Omega 2022, 7, 10483–10491.
- Fu, M.; Wang, L.; Ma, T.; Wu, J.; Dai, S.; Chang, Z.; Zhang, Q.; Xu, H.; Li, X. Chemical formula input relied intelligent identification of an inorganic perovskite for solar thermochemical hydrogen production. Inorg. Chem. Front. 2021, 8, 2097–2102.
- Zhai, X.; Ding, F.; Zhao, Z.; Santomauro, A.; Luo, F.; Tong, J. Predicting the formation of fractionally doped perovskite oxides by a function-confined machine learning method. Commun. Mater. 2022, 3, 42.
- Villars, P. Materials Platform for Data Science. 2019. Available online: https://mpds.io/ (accessed on 10 March 2023).
- Mentel, L.M. Mendeleev—A Python Resource for Properties of Chemical Elements, Ions and Isotopes. 2014. Available online: https://github.com/lmmentel/mendeleev (accessed on 10 March 2023).
- Landrum, G. RDKit: Open Source Cheminformatics. 2012. Available online: http://www.rdkit.org/ (accessed on 10 March 2023).
- Basavarajappa, M.G.; Nazeeruddin, M.K.; Chakraborty, S. Evolution of hybrid organic–inorganic perovskite materials under external pressure. Appl. Phys. Rev. 2021, 8, 041309.
- Lu, T.; Li, M.; Lu, W.; Zhang, T.-Y. Recent progress in the data-driven discovery of novel photovoltaic materials. J. Mater. Inform. 2022, 2, 7.
- Zhang, S.; Lu, T.; Xu, P.; Tao, Q.; Li, M.; Lu, W. Predicting the Formability of Hybrid Organic–Inorganic Perovskites via an Interpretable Machine Learning Strategy. J. Phys. Chem. Lett. 2021, 12, 7423–7430.
- Chen, J.; Xu, W.; Zhang, R. Δ-Machine learning-driven discovery of double hybrid organic–inorganic perovskites. J. Mater. Chem. A 2022, 10, 1402–1413.
- Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B.P.; Ramprasad, R.; Gubernatis, J.E.; Lookman, T. Machine learning bandgaps of double perovskites. Sci. Rep. 2016, 6, 19375.
- Halder, A.; Ghosh, A.; Dasgupta, T.S. Machine-learning-assisted prediction of magnetic double perovskites. Phys. Rev. Mater. 2019, 3, 084418.
- Nair, S.S.; Krishnia, L.; Trukhanov, A.; Thakur, P.; Thakur, A. Prospect of double perovskite over conventional perovskite in photovoltaic applications. Ceram. Int. 2022, 48, 34128–34147.
- Li, L.; Tao, Q.; Xu, P.; Yang, X.; Lu, W.; Li, M. Studies on the regularity of perovskite formation via machine learning. Comput. Mater. Sci. 2021, 199, 110712.
- Zhu, W.; Wang, S.; Zhang, X.; Wang, A.; Wu, C.; Hao, F. Ion Migration in Organic-Inorganic Hybrid Perovskite Solar Cells: Current Understanding and Perspectives. Small 2022, 18, 2105783.
- Song, T.-B.; Chen, Q.; Zhou, H.; Jiang, C.; Wang, H.-H.; Yang, Y.; Liu, Y.; You, J.; Yang, Y. Perovskite solar cells: Film formation and properties. J. Mater. Chem. A 2015, 3, 9032–9050.
- Costa, J.C.S.; Azevedo, J.; Araújo, J.P.; Santos, L.M.N.B.F.; Mendes, A. High purity and crystalline thin films of methylammonium lead iodide perovskites by a vapor deposition approach. Thin Solid Films 2018, 664, 12–18.
- Saki, Z.; Byranvand, M.M.; Taghavinia, N.; Kedia, M.; Saliba, M. Solution-processed perovskite thin-films: The journey from lab- to large-scale solar cells. Energy Environ. Sci. 2021, 14, 5690–5722.
- Xu, F.; Li, Y.; Yuan, B.; Zhang, Y.; Wei, H.; Wu, Y.; Cao, B. Large-area CsPbBr3 perovskite films grown with effective one-step RF-magnetron sputtering. J. Appl. Phys. 2021, 129, 245303.
- Alanazi, T.I. Current spray-coating approaches to manufacture perovskite solar cells. Results Phys. 2023, 44, 106144.
- Swartwout, R.; Hoerantner, M.T.; Bulović, V. Scalable Deposition Methods for Large-area Production of Perovskite Thin Films. Energy Environ. Mater. 2019, 2, 119–145.
- Chandrashekar, G.; Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. 2014, 40, 16–28.
- Remeseiro, B.; Bolon-Canedo, V. A review of feature selection methods in medical applications. Comput. Biol. Med. 2019, 112, 103375.
- Urbanowicz, R.J.; Meeker, M.; La Cava, W.; Olson, R.S.; Moore, J.H. Relief-based feature selection: Introduction and review. J. Biomed. Inform. 2018, 85, 189–203.
- Zhang, J.; Xiong, Y.; Min, S. A new hybrid filter/wrapper algorithm for feature selection in classification. Anal. Chim. Acta 2019, 1080, 43–54.
- Pudjihartono, N.; Fadason, T.; Kempa-Liehr, A.W.; O’Sullivan, J.M. A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. Front. Bioinform. 2022, 2, 927312.
- Saeys, Y.; Inza, I.; Larranaga, P. A review of feature selection techniques in bioinformatics. Bioinformatics 2007, 23, 2507–2517.
- Venkatesh, B.; Anuradha, J. A Review of Feature Selection and Its Methods. Cybern. Inf. Technol. 2019, 19, 3–26.
- Wang, Y.H.; Zhang, Y.F.; Zhang, Y.; Gu, Z.F.; Zhang, Z.Y.; Lin, H.; Deng, K.J. Identification of adaptor proteins using the ANOVA feature selection technique. Methods 2022, 208, 42–47.
- Biesiada, J.; Duch, W. Feature Selection for High-Dimensional Data—A Pearson Redundancy Based Filter. In Computer Recognition Systems 2; Kurzynski, M., Puchala, E., Wozniak, M., Zolnierek, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 242–249.
- Liu, Y.; Mu, Y.; Chen, K.; Li, Y.; Guo, J. Daily Activity Feature Selection in Smart Homes Based on Pearson Correlation Coefficient. Neural Process. Lett. 2020, 51, 1771–1787.
- Edelmann, D.; Móri, T.F.; Székely, G.J. On relationships between the Pearson and the distance correlation coefficients. Stat. Probab. Lett. 2021, 169, 108960.
- Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.
- Reshef, D.N.; Reshef, Y.A.; Finucane, H.K.; Grossman, S.R.; McVean, G.; Turnbaugh, P.J.; Lander, E.S.; Mitzenmacher, M.; Sabeti, P.C. Detecting Novel Associations in Large Data Sets. Science 2011, 334, 1518–1524.
- Bommert, A.; Welchowski, T.; Schmid, M.; Rahnenfuhrer, J. Benchmark of filter methods for feature selection in high-dimensional gene expression survival data. Brief. Bioinform. 2022, 23, bbab354.
- Almaghthawi, Y.; Ahmad, I.; Alsaadi, F.E. Performance Analysis of Feature Subset Selection Techniques for Intrusion Detection. Mathematics 2022, 10, 4745.
- Xue, B.; Zhang, M.; Browne, W.N.; Yao, X. A Survey on Evolutionary Computation Approaches to Feature Selection. IEEE Trans. Evol. Comput. 2016, 20, 606–626.
- Jablonka, K.M.; Ongari, D.; Moosavi, S.M.; Smit, B. Big-Data Science in Porous Materials: Materials Genomics and Machine Learning. Chem. Rev. 2020, 120, 8066–8129.
- Granitto, P.M.; Furlanello, C.; Biasioli, F.; Gasperi, F. Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products. Chemom. Intell. Lab. Syst. 2006, 83, 83–90.
- Tsai, C.-F.; Eberle, W.; Chu, C.-Y. Genetic algorithms in feature and instance selection. Knowl. Based Syst. 2013, 39, 240–247.
- Tan, F.; Fu, X.; Zhang, Y.; Bourgeois, A.G. A genetic algorithm-based method for feature subset selection. Soft Comput. 2008, 12, 111–120.
- Yang, J.W.; Wang, S.L.; Chen, Y.Y.; Lu, S.K.; Yang, W.Z. Feature Subset Selection Based on the Genetic Algorithm. Adv. Mater. Res. 2013, 774, 1532–1537.
- Ai, C. A Method for Cancer Genomics Feature Selection Based on LASSO-RFE. Iran. J. Sci. Technol. Trans. A Sci. 2022, 46, 731–738.
- Chen, H.; Shang, Z.; Lu, W.; Li, M.; Tan, F. A Property-Driven Stepwise Design Strategy for Multiple Low-Melting Alloys via Machine Learning. Adv. Eng. Mater. 2021, 23, 2100612.
- Jiménez-Cordero, A.; Morales, J.M.; Pineda, S. A novel embedded min-max approach for feature selection in nonlinear Support Vector Machine classification. Eur. J. Oper. Res. 2021, 293, 24–35.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Otchere, D.A.; Ganat, T.O.A.; Ojero, J.O.; Tackie-Otoo, B.N.; Taki, M.Y. Application of gradient boosting regression model for the evaluation of feature selection techniques in improving reservoir characterisation predictions. J. Pet. Sci. Eng. 2022, 208, 109244.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874.
- Priyanga, G.S.; Mattur, M.N.; Nagappan, N.; Rath, S.; Thomas, T. Prediction of nature of band gap of perovskite oxides (ABO3) using a machine learning approach. J. Mater. 2022, 8, 937–948.
- Zhang, L.; Zhuang, Z.; Fang, Q.; Wang, X. Study on the Automatic Identification of ABX3 Perovskite Crystal Structure Based on the Bond-Valence Vector Sum. Materials 2022, 16, 334.
- Lu, S.; Zhou, Q.; Ouyang, Y.; Guo, Y.; Li, Q.; Wang, J. Accelerated discovery of stable lead-free hybrid organic-inorganic perovskites via machine learning. Nat. Commun. 2018, 9, 3405.
- Wu, Y.; Lu, S.; Ju, M.G.; Zhou, Q.; Wang, J. Accelerated design of promising mixed lead-free double halide organic-inorganic perovskites for photovoltaics using machine learning. Nanoscale 2021, 13, 12250–12259.
- Cai, X.; Zhang, Y.; Shi, Z.; Chen, Y.; Xia, Y.; Yu, A.; Xu, Y.; Xie, F.; Shao, H.; Zhu, H.; et al. Discovery of Lead-Free Perovskites for High-Performance Solar Cells via Machine Learning: Ultrabroadband Absorption, Low Radiative Combination, and Enhanced Thermal Conductivities. Adv. Sci. 2022, 9, 2103648.
- Gao, Z.; Zhang, H.; Mao, G.; Ren, J.; Chen, Z.; Wu, C.; Gates, I.D.; Yang, W.; Ding, X.; Yao, J. Screening for lead-free inorganic double perovskites with suitable band gaps and high stability using combined machine learning and DFT calculation. Appl. Surf. Sci. 2021, 568, 150916.
- Liu, H.; Feng, J.; Dong, L. Quick screening stable double perovskite oxides for photovoltaic applications by machine learning. Ceram. Int. 2022, 48, 18074–18082.
- Liu, W.; Lu, Y.; Wei, D.; Huo, X.; Huang, X.; Li, Y.; Meng, J.; Zhao, S.; Qiao, B.; Liang, Z.; et al. Screening interface passivation materials intelligently through machine learning for highly efficient perovskite solar cells. J. Mater. Chem. A 2022, 10, 17782–17789.
- She, C.; Huang, Q.; Chen, C.; Jiang, Y.; Fan, Z.; Gao, J. Machine learning-guided search for high-efficiency perovskite solar cells with doped electron transport layers. J. Mater. Chem. A 2021, 9, 25168–25177.
- Zhang, Z.; Wang, S.; Liu, X.; Chen, Y.; Su, C.; Tang, Z.; Li, Y.; Xing, G. Metal Halide Perovskite/2D Material Heterostructures: Syntheses and Applications. Small Methods 2021, 5, 2000937.
- Wang, H.P.; Li, S.; Liu, X.; Shi, Z.; Fang, X.; He, J.H. Low-Dimensional Metal Halide Perovskite Photodetectors. Adv. Mater. 2021, 33, 2003309.
- Misra, R.K.; Cohen, B.-E.; Iagher, L.; Etgar, L. Low-Dimensional Organic–Inorganic Halide Perovskite: Structure, Properties, and Applications. ChemSusChem 2017, 10, 3712–3721.
- Li, S.; Zhang, Y.; Yang, W.; Liu, H.; Fang, X. 2D Perovskite Sr2Nb3O10 for High-Performance UV Photodetectors. Adv. Mater. 2020, 32, 1905443.
- Li, X.; Hoffman, J.M.; Kanatzidis, M.G. The 2D Halide Perovskite Rulebook: How the Spacer Influences Everything from the Structure to Optoelectronic Device Efficiency. Chem. Rev. 2021, 121, 2230–2291.
- Zhang, Z.-Z.; Guo, T.-M.; Li, Z.-G.; Gao, F.-F.; Li, W.; Wei, F.; Bu, X.-H. Machine learning assisted synthetic acceleration of Ruddlesden-Popper and Dion-Jacobson 2D lead halide perovskites. Acta Mater. 2023, 245, 118638.
- Lyu, R.; Moore, C.E.; Liu, T.; Yu, Y.; Wu, Y. Predictive Design Model for Low-Dimensional Organic-Inorganic Halide Perovskites Assisted by Machine Learning. J. Am. Chem. Soc. 2021, 143, 12766–12776.
- Hu, W.; Zhang, L.; Pan, Z. Designing Two-Dimensional Halide Perovskites Based on High-Throughput Calculations and Machine Learning. ACS Appl. Mater. Interfaces 2022, 14, 21596–21604.
Name | URL | Data Type |
---|---|---|
The Perovskite Database Project (PDP) | https://www.perovskitedatabase.com (accessed on 19 March 2023) | Exp. |
Open Quantum Materials Database (OQMD) | http://www.oqmd.org/ (accessed on 19 March 2023) | Comp. |
Materials Project (MP) | https://materialsproject.org/ (accessed on 19 March 2023) | Comp. |
Computational Materials Repository (CMR) | https://cmr.fysik.dtu.dk/ (accessed on 19 March 2023) | Comp. |
The Inorganic Crystal Structure Database (ICSD) | https://icsd.fiz-karlsruhe.de/index.xhtml (accessed on 19 March 2023) | Exp. |
Materials Platform for Data Science (MPDS) | https://mpds.io/#modal/menu (accessed on 19 March 2023) | Comp. and Exp. |
Automatic-FLOW for Materials Discovery (AFLOW) | http://www.aflowlib.org/ (accessed on 19 March 2023) | Comp. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, J.; Xu, P.; Ji, X.; Li, M.; Lu, W. Feature Selection in Machine Learning for Perovskite Materials Design and Discovery. Materials 2023, 16, 3134. https://doi.org/10.3390/ma16083134