Bias in Machine Learning: A Literature Review
Abstract
1. Introduction
2. Materials and Methods
- IC#1: The study must describe a specific methodology, either a theoretical framework or a technical method/algorithm, for identifying and/or mitigating bias.
- IC#2: The study takes algorithmic bias into consideration when comparing different algorithms or optimizers.
- IC#3: The study explains why a specific algorithm/optimizer/regularization method was chosen.
- EC#1: The study was not peer reviewed.
3. Data Bias
3.1. Cognitive Bias
3.2. Selection Bias
3.3. Reporting Bias
3.4. Common Mitigation Techniques for Data Bias
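Resampling the training data is among the most common mitigation techniques for data bias, particularly for class imbalance. As a minimal illustrative sketch (using only NumPy, and not reproducing any one reviewed method), random under-sampling discards majority-class rows until all classes match the minority count:

```python
import numpy as np

def random_undersample(X, y, seed=0):
    """Balance classes by randomly discarding rows of the larger classes."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = []
    for c in classes:
        idx = np.flatnonzero(y == c)
        if len(idx) > n_min:
            # keep a random subset of the over-represented class
            idx = rng.choice(idx, size=n_min, replace=False)
        keep.append(idx)
    keep = np.concatenate(keep)
    return X[keep], y[keep]

X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)      # imbalanced: 8 majority vs 2 minority
X_bal, y_bal = random_undersample(X, y)   # both classes reduced to 2 rows
```

Over-sampling methods such as SMOTE and ADASYN take the opposite approach, synthesizing new minority-class rows instead of discarding majority-class ones; the trade-off is extra data versus potential synthetic artifacts.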
4. Algorithm Bias
4.1. Estimators
4.2. Optimizers
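The choice of optimizer shapes which solution a model converges to, which is why the inclusion criteria ask studies to justify it. As a hedged sketch of the simplest case (plain stochastic gradient descent on squared error for a linear model, with synthetic data, not tied to any reviewed study):

```python
import numpy as np

def sgd_linear(X, y, lr=0.1, epochs=200, seed=0):
    """Plain stochastic gradient descent for a linear model under squared error."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            # gradient of 0.5 * (x_i @ w - y_i)^2 with respect to w
            w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([3.0, -2.0])   # noiseless targets from known weights
w = sgd_linear(X, y)            # converges close to the true weights [3, -2]
```

On noisy or imbalanced data, the same loop with momentum or adaptive step sizes (e.g., Adam) can converge to different weights, so optimizer choice is itself a potential source of algorithm bias.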
4.3. Regularization
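Regularization is a case where bias is introduced deliberately: shrinking coefficients toward zero biases the estimator but reduces its variance. A minimal sketch of ridge (L2) regression in closed form, with synthetic data chosen purely for illustration:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam * I)^(-1) X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

w_ols = ridge_fit(X, y, lam=0.0)     # ordinary least squares (unbiased)
w_ridge = ridge_fit(X, y, lam=10.0)  # penalized, deliberately biased toward 0
# The penalty shrinks the coefficient vector: ||w_ridge|| < ||w_ols||.
```

The penalty strength `lam` controls the bias–variance trade-off, which is why the inclusion criteria ask studies to explain their choice of regularization method.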
4.4. Common Mitigation Techniques for Algorithm Bias
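Among mitigation techniques for algorithm bias, instance reweighing is one of the simplest to state: each (group, label) cell is weighted by P(s)P(y)/P(s,y), so that the protected attribute and the label become statistically independent under the weighted distribution. The sketch below is an illustrative simplification on synthetic data, not code from any reviewed study:

```python
import numpy as np

def reweigh(s, y):
    """Weight each (group, label) cell by P(s) * P(y) / P(s, y), making the
    protected attribute s independent of the label y under the weights."""
    s, y = np.asarray(s), np.asarray(y)
    w = np.empty(len(y), dtype=float)
    for sv in np.unique(s):
        for yv in np.unique(y):
            cell = (s == sv) & (y == yv)
            if cell.any():
                w[cell] = (s == sv).mean() * (y == yv).mean() / cell.mean()
    return w

s = np.array([0, 0, 0, 1, 1, 1, 1, 1])   # protected group membership
y = np.array([1, 0, 0, 1, 1, 1, 1, 0])   # labels, correlated with s
w = reweigh(s, y)
# Under the weighted distribution the joint probability factorizes:
p_s1_y1 = (w * ((s == 1) & (y == 1))).sum() / w.sum()
```

Passing `w` as per-sample weights to any standard learner yields a model trained as if group and label were uncorrelated; adversarial debiasing and fairness-constrained optimization pursue the same goal inside the training objective instead.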
5. Engineer Bias
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Fazelpour, S.; Danks, D. Algorithmic bias: Senses, sources, solutions. Philos. Compass 2021, 16, e12760. [Google Scholar] [CrossRef]
- Delgado-Rodriguez, M.; Llorca, J. Bias. J. Epidemiol. Community Health 2004, 58, 635–641. [Google Scholar] [CrossRef]
- Statista—“Market Size and Revenue Comparison for Artificial Intelligence Worldwide from 2018 to 2030”. Available online: https://www.statista.com/statistics/941835/artificial-intelligence-market-size-revenue-comparisons (accessed on 15 February 2024).
- Statista—“Share of Adults in the United States Who Were Concerned about Issues Related to Artificial Intelligence (AI) as of February 2023”. Available online: https://www.statista.com/statistics/1378220/us-adults-concerns-about-artificial-intelligence-related-issues (accessed on 15 February 2024).
- Ray, P.P. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet Things Cyber-Phys. Syst. 2023, 3, 121–154. [Google Scholar] [CrossRef]
- Meyer, J.G.; Urbanowicz, R.J.; Martin, P.C.N.; O’connor, K.; Li, R.; Peng, P.-C.; Bright, T.J.; Tatonetti, N.; Won, K.J.; Gonzalez-Hernandez, G.; et al. ChatGPT and large language models in academia: Opportunities and challenges. BioData Min. 2023, 16, 20. [Google Scholar] [CrossRef] [PubMed]
- Yee, K.; Tantipongpipat, U.; Mishra, S. Image cropping on twitter: Fairness metrics, their limitations, and the importance of representation, design, and agency. In Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2), Virtual, 23 October 2021; pp. 1–24. [Google Scholar]
- Birhane, A.; Prabhu, V.U.; Whaley, J. Auditing saliency cropping algorithms. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 5 January 2022; pp. 4051–4059. [Google Scholar]
- Dressel, J.J. Accuracy and Racial Biases of Recidivism Prediction Instruments. Undergraduate Thesis, Dartmouth College, Hanover, NH, USA, 2017. [Google Scholar]
- Lin, Z.; Jung, J.; Goel, S.; Skeem, J. The limits of human predictions of recidivism. Sci. Adv. 2020, 6, eaaz0652. [Google Scholar] [CrossRef] [PubMed]
- Engel, C.; Linhardt, L.; Schubert, M. Code is law: How COMPAS affects the way the judiciary handles the risk of recidivism. Artif. Intell. Law 2024, 32, 1–22. [Google Scholar] [CrossRef]
- Roselli, D.; Matthews, J.; Talagala, N. Managing bias in AI. In Proceedings of the 2019 World Wide Web Conference, San Fransisco, CA, USA, 13 May 2019; pp. 539–544. [Google Scholar]
- Kordzadeh, N.; Ghasemaghaei, M. Algorithmic bias: Review, synthesis, and future research directions. Eur. J. Inf. Syst. 2022, 31, 388–409. [Google Scholar] [CrossRef]
- Schelter, S.; Stoyanovich, J. Taming technical bias in machine learning pipelines. Bull. Tech. Comm. Data Eng. 2020, 43, 39–50. [Google Scholar]
- Ha, T.; Kim, S. Improving Trust in AI with Mitigating Confirmation Bias: Effects of Explanation Type and Debiasing Strategy for Decision-Making with Explainable AI. Int. J. Hum.-Comput. Interact. 2023, 39, 1–12. [Google Scholar] [CrossRef]
- Kotsiantis, S.; Kanellopoulos, D.; Pintelas, P. Handling imbalanced datasets: A review. GESTS Int. Trans. Comput. Sci. Eng. 2006, 30, 25–36. [Google Scholar]
- Yen, S.; Lee, Y. Under-sampling approaches for improving prediction of the minority class in an imbalanced dataset. Lect. Notes Control Inf. Sci. 2006, 344, 731. [Google Scholar]
- Yen, S.-J.; Lee, Y.-S. Cluster-based under-sampling approaches for imbalanced data distributions. Expert Syst. Appl. 2009, 36, 5718–5727. [Google Scholar] [CrossRef]
- Tahir, M.A.; Kittler, J.; Yan, F. Inverse random under sampling for class imbalance problem and its application to multi-label classification. Pattern Recognit. 2012, 45, 3738–3750. [Google Scholar] [CrossRef]
- Elhassan, T.; Aljurf, M. Classification of imbalance data using tomek link (t-link) combined with random under-sampling (rus) as a data reduction method. Glob. J. Technol. Opt. S 2016, 1, 100011. [Google Scholar] [CrossRef]
- Fernandez, A.; Garcia, S.; Herrera, F.; Chawla, N.V. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary. J. Artif. Intell. Res. 2018, 61, 863–905. [Google Scholar] [CrossRef]
- He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE International Joint Conference on Neural Netw, (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1 June 2008; pp. 1322–1328. [Google Scholar]
- Yang, X.; Kuang, Q.; Zhang, W.; Zhang, G. AMDO: An over-sampling technique for multi-class imbalanced problems. IEEE Trans. Knowl. Data Eng. 2017, 30, 1672–1685. [Google Scholar] [CrossRef]
- Azaria, A. ChatGPT: More Human-Like Than Computer-Like, but Not Necessarily in a Good Way. In Proceedings of the 2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI), Atlanta, GA, USA, 6 November 2023; pp. 468–473. [Google Scholar]
- Atreides, K.; Kelley, D. Cognitive Biases in Natural Language: Automatically Detecting, Differentiating, and Measuring Bias in Text. Differentiating, and Measuring Bias in Text. 2023. Available online: https://www.researchgate.net/profile/Kyrtin-Atreides/publication/372078491_Cognitive_Biases_in_Natural_Language_Automatically_Detecting_Differentiating_and_Measuring_Bias_in_Text/links/64a3e11195bbbe0c6e0f149c/Cognitive-Biases-in-Natural-Language-Automatically-Detecting-Differentiating-and-Measuring-Bias-in-Text.pdf (accessed on 15 February 2024).
- Blawatt, K.R. Appendix A: List of cognitive biases. In Marconomics; Emerald Group Publishing Limited: Bingley, UK, 2016; pp. 325–336. [Google Scholar]
- Sayão, L.F.; Baião, F.A. An Ontology-based Data-driven Architecture for Analyzing Cognitive Biases in Decision-making. In Proceedings of the XVI Seminar on Ontology Research in Brazil (ONTOBRAS 2023) and VII Doctoral and Masters Consortium on Ontologies, (WTDO 2023), Brasilia, Brazil, 28 August–1 September 2023. [Google Scholar]
- Harris, G. Methods to Evaluate Temporal Cognitive Biases in Machine Learning Prediction Models. In Proceedings of the Companion Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 572–575. [Google Scholar]
- Liu, Q.; Jiang, H.; Pan, Z.; Han, Q.; Peng, Z.; Li, Q. BiasEye: A Bias-Aware Real-time Interactive Material Screening System for Impartial Candidate Assessment. In Proceedings of the IUI ‘24: 29th International Conference on Intelligent User Interfaces, Greenville, SC, USA, 18–21 March 2024; pp. 325–343. [Google Scholar]
- Harris, G. Mitigating cognitive biases in machine learning algorithms for decision making. In Proceedings of the Companion Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 775–781. [Google Scholar]
- Chen, X.; Sun, R.; Saluz, U.; Schiavon, S.; Geyer, P. Using causal inference to avoid fallouts in data-driven parametric analysis: A case study in the architecture, engineering, and construction industry. Dev. Built Environ. 2023, 17, 100296. [Google Scholar] [CrossRef]
- Kavitha, J.; Kiran, J.; Prasad, S.D.V.; Soma, K.; Babu, G.C.; Sivakumar, S. Prediction and Its Impact on Its Attributes While Biasing MachineLearning Training Data. In Proceedings of the 2022 Third International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE), Bengaluru, India, 16 December 2022; pp. 1–7. [Google Scholar]
- Schmidgall, S. Addressing cognitive bias in medical language models. arXiv 2024, arXiv:2402.08113. [Google Scholar]
- Bareinboim, E.; Tian, J.; Pearl, J. Recovering from selection bias in causal and statistical inference. In Probabilistic and Causal Inference: The Works of Judea Pearl; Association for Computing Machinery: New York, NY, USA, 2022; pp. 433–450. [Google Scholar]
- Tripepi, G.; Jager, K.J.; Dekker, F.W.; Zoccali, C. Selection bias and information bias in clinical research. Nephron Clin. Pract. 2010, 115, c94–c99. [Google Scholar] [CrossRef]
- Smith, L.H. Selection mechanisms and their consequences: Understanding and addressing selection bias. Curr. Epidemiol. Rep. 2020, 7, 179–189. [Google Scholar] [CrossRef]
- Mendez, M.; Maathuis, B.; Hein-Griggs, D.; Alvarado-Gamboa, L.-F. Performance evaluation of bias correction methods for climate change monthly precipitation projections over costa rica. Water 2020, 12, 482. [Google Scholar] [CrossRef]
- Heo, J.-H.; Ahn, H.; Shin, J.-Y.; Kjeldsen, T.R.; Jeong, C. Probability distributions for a quantile mapping technique for a bias correction of precipitation data: A case study to precipitation data under climate change. Water 2019, 11, 1475. [Google Scholar] [CrossRef]
- Soriano, E.; Mediero, L.; Garijo, C. Selection of bias correction methods to assess the impact of climate change on flood frequency curves. Water 2019, 11, 2266. [Google Scholar] [CrossRef]
- Kaltenpoth, D.; Vreeken, J. Identifying selection bias from observational data. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7 February 2023; Volume 37, pp. 8177–8185. [Google Scholar]
- Gharib, A.; Davies, E.G. A workflow to address pitfalls and challenges in applying machine learning models to hydrology. Adv. Water Resour. 2021, 152, 103920. [Google Scholar] [CrossRef]
- Shen, Z.; Cui, P.; Kuang, K.; Li, B.; Chen, P. Causally regularized learning with agnostic data selection bias. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Republic of Korea, 22 October 2018; pp. 411–419. [Google Scholar]
- Bibi, S.; Shin, J. Detection of Face Features using Adapted Triplet Loss with Biased data. In Proceedings of the 2022 IEEE International Conference on Imaging Systems and Techniques (IST), Virtual, 21 June 2022; pp. 1–6. [Google Scholar]
- Yang, Q.; Chen, Z.; Yuan, Y. Hierarchical bias mitigation for semi-supervised medical image classification. IEEE Trans. Med. Imaging 2023, 42, 2200–2210. [Google Scholar] [CrossRef] [PubMed]
- Wu, P.; Xu, T.; Wang, Y. Learning Personalized Treatment Rules from Electronic Health Records Using Topic Modeling Feature Extraction. In Proceedings of the 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Washington, DC, USA, 5 October 2019; pp. 392–402. [Google Scholar]
- Samadani, A.; Wang, T.; van Zon, K.; Celi, L.A. VAP risk index: Early prediction and hospital phenotyping of ventilator-associated pneumonia using machine learning. Artif. Intell. Med. 2023, 146, 102715. [Google Scholar] [CrossRef] [PubMed]
- Wang, H.; Kuang, K.; Lan, L.; Wang, Z.; Huang, W.; Wu, F.; Yang, W. Out-of-distribution generalization with causal feature separation. IEEE Trans. Knowl. Data Eng. 2023, 36, 1758–1772. [Google Scholar] [CrossRef]
- Yang, Z.; Liu, Y.; Ouyang, C.; Ren, L.; Wen, W. Counterfactual can be strong in medical question and answering. Inf. Process. Manag. 2023, 60, 103408. [Google Scholar] [CrossRef]
- Costello, M.J.; Li, Y.; Zhu, Y.; Walji, A.; Sousa, S.; Remers, S.; Chorny, Y.; Rush, B.; MacKillop, J. Using conventional and machine learning propensity score methods to examine the effectiveness of 12-step group involvement following inpatient addiction treatment. Drug Alcohol. Depend. 2021, 227, 108943. [Google Scholar] [CrossRef]
- Liu, X.; Ai, W.; Li, H.; Tang, J.; Huang, G.; Feng, F.; Mei, Q. Deriving user preferences of mobile apps from their management activities. ACM Trans. Inf. Syst. 2017, 35, 1–32. [Google Scholar] [CrossRef]
- Minatel, D.; Parmezan, A.R.; Cúri, M.; Lopes, A.D.A. Fairness-Aware Model Selection Using Differential Item Functioning. In Proceedings of the 2023 International Conference on Machine Learning and Applications (ICMLA), Jacksonville, FL, USA, 15 December 2023; pp. 1971–1978. [Google Scholar]
- Dost, K.; Taskova, K.; Riddle, P.; Wicker, J. Your best guess when you know nothing: Identification and mitigation of selection bias. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 17 November 2020; pp. 996–1001. [Google Scholar]
- GitHub—Imitate. Available online: https://github.com/KatDost/Imitate (accessed on 15 February 2024).
- Dost, K.; Duncanson, H.; Ziogas, I.; Riddle, P.; Wicker, J. Divide and imitate: Multi-cluster identification and mitigation of selection bias. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Chengdu, China, 11 May 2022; pp. 149–160. [Google Scholar]
- Shi, L.; Li, S.; Ding, X.; Bu, Z. Selection bias mitigation in recommender system using uninteresting items based on temporal visibility. Expert Syst. Appl. 2023, 213, 118932. [Google Scholar] [CrossRef]
- Liu, H. Rating distribution calibration for selection bias mitigation in recommendations. In Proceedings of the ACM Web Conference, Lyon, France, 25 April 2022; pp. 2048–2057. [Google Scholar]
- Liu, F.; Cole, J.; Eisenschlos, J.M.; Collier, N. Are Ever Larger Octopi Still Influenced by Reporting Biases? 2022. Available online: https://research.google/pubs/are-ever-larger-octopi-still-influenced-by-reporting-biases/ (accessed on 15 February 2024).
- Shwartz, V.; Choi, Y. Do neural language models overcome reporting bias? In Proceedings of the 28th International Conference on Computational Linguistics, Virtual, 8 December 2020; pp. 6863–6870. [Google Scholar]
- Cai, B.; Ding, X.; Chen, B.; Du, L.; Liu, T. Mitigating Reporting Bias in Semi-supervised Temporal Commonsense Inference with Probabilistic Soft Logic. Proc. AAAI Conf. Artif. Intell. 2022, 36, 10454–10462. [Google Scholar] [CrossRef]
- Wu, Q.; Zhao, M.; He, Y.; Huang, L.; Ono, J.; Wakaki, H.; Mitsufuji, Y. Towards reporting bias in visual-language datasets: Bimodal augmentation by decoupling object-attribute association. arXiv 2023, arXiv:2310.01330. [Google Scholar]
- Chiou, M.J.; Ding, H.; Yan, H.; Wang, C.; Zimmermann, R.; Feng, J. Recovering the unbiased scene graphs from the biased ones. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual, 17 October 2021; pp. 1581–1590. [Google Scholar]
- Misra, I.; Lawrence Zitnick, C.; Mitchell, M.; Girshick, R. Seeing through the human reporting bias: Visual classifiers from noisy human-centric labels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27 June 2016; pp. 2930–2939. [Google Scholar]
- Atay, M.; Gipson, H.; Gwyn, T.; Roy, K. Evaluation of gender bias in facial recognition with traditional machine learning algorithms. In Proceedings of the 2021 IEEE Symposium Series on Computational Intelligence (SSCI), Orlando, FL, USA, 5 December 2021; pp. 1–7. [Google Scholar]
- Ayoade, G.; Chandra, S.; Khan, L.; Hamlen, K.; Thuraisingham, B. Automated threat report classification over multi-source data. In Proceedings of the 2018 IEEE 4th International Conference on Collaboration and Internet Computing (CIC), Vancouver, BC, Canada, 12 November 2018; pp. 236–245. [Google Scholar]
- Vinayakumar, R.; Alazab, M.; Soman, K.P.; Poornachandran, P.; Venkatraman, S. Robust intelligent malware detection using deep learning. IEEE Access 2019, 7, 46717–46738. [Google Scholar] [CrossRef]
- Hinchliffe, C.; Rehman, R.Z.U.; Branco, D.; Jackson, D.; Ahmaniemi, T.; Guerreiro, T.; Chatterjee, M.; Manyakov, N.V.; Pandis, I.; Davies, K.; et al. Identification of Fatigue and Sleepiness in Immune and Neurodegenerative Disorders from Measures of Real-World Gait Variability. In Proceedings of the 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Sydney, Australia, 24 July 2023; pp. 1–4. [Google Scholar]
- Bughin, J.; Cincera, M.; Peters, K.; Reykowska, D.; Żyszkiewicz, M.; Ohme, R. Make it or break it: On-time vaccination intent at the time of Covid-19. Vaccine 2023, 41, 2063–2072. [Google Scholar] [CrossRef] [PubMed]
- Seo, D.-C.; Han, D.-H.; Lee, S. Predicting opioid misuse at the population level is different from identifying opioid misuse in individual patients. Prev. Med. 2020, 131, 105969. [Google Scholar] [CrossRef]
- Catania, B.; Guerrini, G.; Janpih, Z. Mitigating Representation Bias in Data Transformations: A Constraint-based Optimization Approach. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15 December 2023; pp. 4127–4136. [Google Scholar]
- Hu, Q.; Rangwala, H. Metric-free individual fairness with cooperative contextual bandits. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 17 November 2020; pp. 182–191. [Google Scholar]
- Rengasamy, D.; Mase, J.M.; Rothwell, B.; Figueredo, G.P. An intelligent toolkit for benchmarking data-driven aerospace prognostics. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27 October 2019; pp. 4210–4215. [Google Scholar]
- Bao, F.; Deng, Y.; Zhao, Y.; Suo, J.; Dai, Q. Bosco: Boosting corrections for genome-wide association studies with imbalanced samples. IEEE Trans. NanoBiosci. 2017, 16, 69–77. [Google Scholar] [CrossRef] [PubMed]
- Tiwari, V.; Verma, M. Prediction Of Groundwater Level Using Advance Machine Learning Techniques. In Proceedings of the 2023 3rd International Conference on Intelligent Technologies (CONIT), Hubali, India, 23 June 2023; pp. 1–6. [Google Scholar]
- Behfar, S.K. Decentralized intelligence and big data analytics reciprocal relationship. In Proceedings of the 2023 Fifth International Conference on Blockchain Computing and Applications (BCCA), Bristol, UK, 13 November 2023; pp. 643–651. [Google Scholar]
- Sepasi, S.; Etemadi, H.; Pasandidehfard, F. Designing a Model for Financial Reporting Bias. J. Account. Adv. 2021, 13, 161–189. [Google Scholar]
- Al-Sarraj, W.F.; Lubbad, H.M. Bias detection of Palestinian/Israeli conflict in western media: A sentiment analysis experimental study. In Proceedings of the 2018 International Conference on Promising Electronic Technologies (ICPET), Hyderabad, India, 28 December 2018; pp. 98–103. [Google Scholar]
- Shumway, R.H.; Stoffer, D.S.; Shumway, R.H.; Stoffer, D.S. ARIMA Models. Time Series Analysis and Its Applications: With R Examples. 2017, pp. 75–163. Available online: https://link.springer.com/book/9783031705830 (accessed on 24 February 2024).
- Salleh, M.N.M.; Talpur, N.; Hussain, K. Adaptive neuro-fuzzy inference system: Overview, strengths, limitations, and solutions. In Proceedings of the Data Mining and Big Data: Second International Conference, Fukuoka, Japan, 27 July 2017; Springer International Publishing: Berlin/Heidelberg, Germany, 2017; pp. 527–535. [Google Scholar]
- Teodorović, D. Bee colony optimization (BCO). In Innovations in Swarm Intelligence; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
- Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The performance of LSTM and BiLSTM in forecasting time series. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9 December 2019; pp. 3285–3292. [Google Scholar]
- Kramer, O. Cascade support vector machines with dimensionality reduction. In Applied Computational Intelligence and Soft Computing; Wiley Online Library: Hoboken, NJ, USA, 2015; p. 216132. [Google Scholar]
- Ruggieri, S. Efficient C4. 5 [classification algorithm]. IEEE Trans. Knowl. Data Eng. 2002, 14, 438–444. [Google Scholar] [CrossRef]
- Lewis, R.J. An introduction to classification and regression tree (CART) analysis. In Annual Meeting of the Society for Academic Emergency Medicine; Department of Emergency Medicine Harbor-UCLA Medical Center Torrance: San Francisco, CA, USA, 2002. [Google Scholar]
- Lu, W.; Li, J.; Wang, J.; Qin, L. A CNN-BiLSTM-AM method for stock price prediction. Neural Comput. Appl. 2021, 33, 4741–4753. [Google Scholar] [CrossRef]
- Wallach, H.M. Conditional Random Fields: An Introduction; CIS: East Greenbush, NY, USA, 2004. [Google Scholar]
- Mustaqeem, K.S. CLSTM: Deep feature-based speech emotion recognition using the hierarchical ConvLSTM network. Mathematics 2020, 8, 2133. [Google Scholar] [CrossRef]
- Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019. [Google Scholar] [CrossRef] [PubMed]
- Kumar, D.; Klefsjö, B. Proportional hazards model: A review. Reliab. Eng. Syst. Saf. 1994, 44, 177–188. [Google Scholar] [CrossRef]
- Kuhn, M.; Weston, S.; Keefer, C.; Coulter, N. Cubist Models for Regression. R Package Vignette R Package Version 0.0. 2012. 18; 480. Available online: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=fd880d2b4482fc9b383435d51f6d730c02e0be36 (accessed on 20 February 2024).
- Song, Y.Y.; Ying, L.U. Decision tree methods: Applications for classification and prediction. Shanghai Archiv. Psychiatry 2015, 27, 130. [Google Scholar]
- Canziani, A.; Paszke, A.; Culurciello, E. An analysis of deep neural network models for practical applications. arXiv 2016, arXiv:1605.07678. [Google Scholar]
- Brim, A. Deep reinforcement learning pairs trading with a double deep Q-network. In Proceedings of the 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 6 January 2020; pp. 0222–0227. [Google Scholar]
- Gardner, E.S., Jr. Exponential smoothing: The state of the art—Part II. Int. J. Forecast. 2006, 22, 637–666. [Google Scholar] [CrossRef]
- Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42. [Google Scholar] [CrossRef]
- Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13 August 2016; pp. 785–794. [Google Scholar]
- Zhong, J.; Feng, L.; Ong, Y.S. Gene expression programming: A survey. IEEE Comput. Intell. Mag. 2017, 12, 54–72. [Google Scholar] [CrossRef]
- Prettenhofer, P.; Louppe, G. Gradient boosted regression trees in scikit-learn. In Proceedings of the PyData, London, UK, 21–23 February 2014. [Google Scholar]
- Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS); IEEE: Boston, MA, USA, 2017; pp. 1597–1600. [Google Scholar]
- Guo, G.; Wang, H.; Bell, D.; Bi, Y.; Greer, K. KNN model-based approach in classification. In Proceedings of the On the Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE: OTM Confederated International Conferences, CoopIS, DOA, and ODBASE 2003, Catania, Italy, 3 November 2003; pp. 986–996. [Google Scholar]
- Fan, J.; Ma, X.; Wu, L.; Zhang, F.; Yu, X.; Zeng, W. Light Gradient Boosting Machine: An efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data. Agric. Water Manag. 2019, 225, 105758. [Google Scholar] [CrossRef]
- Santoso, N.; Wibowo, W. Financial distress prediction using linear discriminant analysis and support vector machine. J. Phys. Conf. Ser. 2019, 979, 012089. [Google Scholar] [CrossRef]
- Su, X.; Yan, X.; Tsai, C.L. Linear regression. Wiley Interdiscip. Rev. Comput. Stat. 2012, 4, 275–294. [Google Scholar] [CrossRef]
- Joachims, T. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, 20 August 2006; pp. 217–226. [Google Scholar]
- Connelly, L. Logistic regression. Medsurg Nurs. 2020, 29, 353–354. [Google Scholar]
- Sherstinsky, A. Fundamentals of recurrent neural network (rnn) and long short-term memory (lstm) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef]
- Kruse, R.; Mostaghim, S.; Borgelt, C.; Braune, C.; Steinbrecher, M. Multi-layer perceptrons. In Computational Intelligence: A Methodological Introduction; Springer: Berlin/Heidelberg, Germany, 2022; pp. 53–124. [Google Scholar]
- Abbas, M.; Memon, K.A.; Jamali, A.A.; Memon, S.; Ahmed, A. Multinomial Naive Bayes classification model for sentiment analysis. IJCSNS Int. J. Comput. Sci. Netw. Secur. 2019, 19, 62. [Google Scholar]
- Wu, Y.-C.; Feng, J.-W. Development and application of artificial neural network. Wirel. Pers. Commun. 2018, 102, 1645–1656. [Google Scholar] [CrossRef]
- Rigatti, S.J. Random forest. J. Insur. Med. 2017, 47, 31–39. [Google Scholar] [CrossRef]
- Joslin, D.E.; Clements, D.P. Squeaky wheel optimization. J. Artif. Intell. Res. 1999, 10, 353–373. [Google Scholar] [CrossRef]
- Cleveland, R.B.; Cleveland, W.S.; McRae, J.E.; Terpenning, I. STL: A seasonal-trend decomposition. J. Off. Stat. 1990, 6, 3–73. [Google Scholar]
- Wang, H.; Hu, D. Comparison of SVM and LS-SVM for regression. In Proceedings of the 2005 International Conference on Neural Netw. and Brain, Beijing, China, 13–15 October 2005; Volume 1, pp. 279–283. [Google Scholar]
- Li, X.; Lv, Z.; Wang, S.; Wei, Z.; Wu, L. A reinforcement learning model based on temporal difference algorithm. IEEE Access 2019, 7, 121922–121930. [Google Scholar] [CrossRef]
- Ramos, J. Using tf-idf to determine word relevance in document queries. In Proceedings of the First Instructional Conference on Machine Learning, Los Angeles, CA, USA, 23 June 2003; Volume 242, pp. 29–48. [Google Scholar]
- Cicirello, V.A.; Smith, S.F. Enhancing stochastic search performance by value-biased randomization of heuristics. J. Heuristics 2005, 11, 5–34. [Google Scholar] [CrossRef]
- Stock, J.H.; Watson, M.W. Vector autoregressions. J. Econ. Perspect. 2001, 15, 101–115. [Google Scholar] [CrossRef]
- Biney, J.K.M.; Vašát, R.; Bell, S.M.; Kebonye, N.M.; Klement, A.; John, K.; Borůvka, L. Prediction of topsoil organic carbon content with Sentinel-2 imagery and spectroscopic measurements under different conditions using an ensemble model approach with multiple pre-treatment combinations. Soil Tillage Res. 2022, 220, 105379. [Google Scholar] [CrossRef]
- Lihu, A.; Holban, S. Top five most promising algorithms in scheduling. In Proceedings of the 2009 5th International Symposium on Applied Computational Intelligence and Informatics, Timisoara, Romania, 28 May 2009; pp. 397–404. [Google Scholar]
- Wu, S.G.; Wang, Y.; Jiang, W.; Oyetunde, T.; Yao, R.; Zhang, X.; Shimizu, K.; Tang, Y.J.; Bao, F.S. Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming. PLoS Comput. Biol. 2016, 12, e1004838. [Google Scholar] [CrossRef] [PubMed]
- Rafay, A.; Suleman, M.; Alim, A. Robust review rating prediction model based on machine and deep learning: Yelp dataset. In Proceedings of the 2020 International Conference on Emerging Trends in Smart Technologies (ICETST), Karachi, Pakistan, 26 March 2020; pp. 8138–8143. [Google Scholar]
- Wescoat, E.; Kerner, S.; Mears, L. A comparative study of different algorithms using contrived failure data to detect robot anomalies. Procedia Comput. Sci. 2022, 200, 669–678. [Google Scholar] [CrossRef]
- Velasco-Gallego, C.; Lazakis, I. Real-time data-driven missing data imputation for short-term sensor data of marine systems. A comparative study. Ocean Eng. 2020, 218, 108261. [Google Scholar] [CrossRef]
- Merentitis, A.; Debes, C. Many hands make light work—On ensemble learning techniques for data fusion in remote sensing. IEEE Geosci. Remote Sens. Mag. 2015, 3, 86–99. [Google Scholar] [CrossRef]
- Alshboul, O.; Almasabha, G.; Shehadeh, A.; Al-Shboul, K. A comparative study of LightGBM, XGBoost, and GEP models in shear strength management of SFRC-SBWS. Structures 2024, 61, 106009. [Google Scholar] [CrossRef]
- Choubin, B.; Darabi, H.; Rahmati, O.; Sajedi-Hosseini, F.; Kløve, B. River suspended sediment modelling using the CART model: A comparative study of machine learning techniques. Sci. Total. Environ. 2018, 615, 272–281. [Google Scholar] [CrossRef] [PubMed]
- Gillfeather-Clark, T.; Horrocks, T.; Holden, E.-J.; Wedge, D. A comparative study of neural network methods for first break detection using seismic refraction data over a detrital iron ore deposit. Ore Geol. Rev. 2021, 137, 104201. [Google Scholar] [CrossRef]
- Jacob, M.; Reddy, G.S.H.; Rappai, C.; Kapoor, P.; Kolhekar, M. A Comparative Study of Supervised and Reinforcement Learning Techniques for the Application of Credit Defaulters. In Proceedings of the 2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT), Bangaluru, India, 7 October 2022; pp. 1–6. [Google Scholar]
- Mavrogiorgou, A.; Kiourtis, A.; Kleftakis, S.; Mavrogiorgos, K.; Zafeiropoulos, N.; Kyriazis, D. A Catalogue of Machine Learning Algorithms for Healthcare Risk Predictions. Sensors 2022, 22, 8615. [Google Scholar] [CrossRef]
- Padhee, S.; Swygert, K.; Micir, I. Exploring Language Patterns in a Medical Licensure Exam Item Bank. In Proceedings of the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 9 December 2021; pp. 503–508. [Google Scholar]
- Abdulaal, A.; Patel, A.; Charani, E.; Denny, S.; Alqahtani, S.A.; Davies, G.W.; Mughal, N.; Moore, L.S. Comparison of deep learning with regression analysis in creating predictive models for SARS-CoV-2 outcomes. BMC Med. Inform. Decis. Making 2020, 20, 299. [Google Scholar] [CrossRef] [PubMed]
- Zhao, L.; Wu, J. Performance comparison of supervised classifiers for detecting leukemia cells in high-dimensional mass cytometry data. In Proceedings of the 2017 Chinese Automation Congress (CAC), Jinan, China, 20 October 2017; pp. 3142–3146. [Google Scholar]
- Moreno-Ibarra, M.-A.; Villuendas-Rey, Y.; Lytras, M.D.; Yáñez-Márquez, C.; Salgado-Ramírez, J.-C. Classification of diseases using machine learning algorithms: A comparative study. Mathematics 2021, 9, 1817. [Google Scholar] [CrossRef]
- Venkata Durga Kiran, V.; Vinay Kumar, S.; Mudunuri, S.B.; Nookala, G.K.M. Comparative Study of Machine Learning Models to Classify Gene Variants of ClinVar. In Data Management, Analytics and Innovation, Proceedings of ICDMAI 2020; Springer: Singapore, 2020; Volume 2, pp. 2435–2443. [Google Scholar]
- Mishra, N.; Patil, V.N. Machine Learning based Improved Automatic Diagnosis of Soft Tissue Tumors (STS). In Proceedings of the 2022 International Conference on Futuristic Technologies (INCOFT), Belgaum, India, 25–27 November 2022; pp. 1–5. [Google Scholar]
- Reiter, W. Co-occurrence balanced time series classification for the semi-supervised recognition of surgical smoke. Int. J. Comput. Assist. Radiol. Surg. 2021, 16, 2021–2027. [Google Scholar] [CrossRef] [PubMed]
- Baker, M.R.; Utku, A. Unraveling user perceptions and biases: A comparative study of ML and DL models for exploring twitter sentiments towards ChatGPT. J. Eng. Res. 2023; in press. [Google Scholar] [CrossRef]
- Fergani, B. Evaluating C-SVM, CRF and LDA classification for daily activity recognition. In Proceedings of the 2012 International Conference on Multimedia Computing and Systems, Tangiers, Morocco, 10 May 2012; pp. 272–277. [Google Scholar]
- Zhang, B.H.; Lemoine, B.; Mitchell, M. Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, New Orleans, LA, USA, 2 February 2018. [Google Scholar]
- Hong, J.; Zhu, Z.; Yu, S.; Wang, Z.; Dodge, H.H.; Zhou, J. Federated adversarial debiasing for fair and transferable representations. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual, 14 August 2021; pp. 617–627. [Google Scholar]
- Zafar, M.B.; Valera, I.; Gomez-Rodriguez, M.; Gummadi, K.P. Fairness constraints: A flexible approach for fair classification. J. Mach. Learn. Res. 2019, 20, 1–42. [Google Scholar]
- Zafar, M.B.; Valera, I.; Rodriguez, M.G.; Gummadi, K.P. Fairness constraints: Mechanisms for fair classification. In Artificial Intelligence and Statistics; PMLR: Fort Lauderdale, FL, USA, 2017; pp. 962–970. [Google Scholar]
- Feldman, M.; Friedler, S.A.; Moeller, J.; Scheidegger, C.; Venkatasubramanian, S. Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10 August 2015; pp. 259–268. [Google Scholar]
- Goh, G.; Cotter, A.; Gupta, M.; Friedlander, M.P. Satisfying real-world goals with dataset constraints. Adv. Neural Inf. Process. Syst. 2016, 29, 2415–2423. [Google Scholar]
- Barocas, S.; Selbst, A.D. Big data’s disparate impact. Calif. L. Rev. 2016, 104, 671. [Google Scholar] [CrossRef]
- Creager, E.; Madras, D.; Jacobsen, J.H.; Weis, M.; Swersky, K.; Pitassi, T.; Zemel, R. Flexibly fair representation learning by disentanglement. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9 June 2019; pp. 1436–1445. [Google Scholar]
- Gupta, U.; Ferber, A.M.; Dilkina, B.; Steeg, G.V. Controllable guarantees for fair outcomes via contrastive information estimation. Proc. AAAI Conf. Artif. Intell. 2021, 35, 7610–7619. [Google Scholar] [CrossRef]
- Quadrianto, N.; Sharmanska, V.; Thomas, O. Discovering fair representations in the data domain. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15 June 2019; pp. 8227–8236. [Google Scholar]
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR) 2021, 54, 1–35. [Google Scholar] [CrossRef]
- Zhou, B.-C.; Han, C.-Y.; Guo, T.-D. Convergence of stochastic gradient descent in deep neural network. Acta Math. Appl. Sin. Engl. Ser. 2021, 37, 126–136. [Google Scholar] [CrossRef]
- Joo, G.; Park, C.; Im, H. Performance evaluation of machine learning optimizers. J. IKEEE 2020, 24, 766–776. [Google Scholar]
- Si, T.N.; Van Hung, T. Hybrid Recommender System Combined Sentiment Analysis with Incremental Algorithm. In Proceedings of the 2022 IEEE/ACIS 7th International Conference on Big Data, Cloud Computing, and Data Science (BCD), Danang, Vietnam, 4 August 2022; pp. 104–108. [Google Scholar]
- Qian, J.; Wu, Y.; Zhuang, B.; Wang, S.; Xiao, J. Understanding gradient clipping in incremental gradient methods. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Virtual, 13 April 2021; pp. 1504–1512. [Google Scholar]
- Mai, V.V.; Johansson, M. Stability and convergence of stochastic gradient clipping: Beyond Lipschitz continuity and smoothness. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18 July 2021; pp. 7325–7335. [Google Scholar]
- Polyak, B. Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 1964, 4, 1–17. [Google Scholar] [CrossRef]
- Wilson, A.C.; Recht, B.; Jordan, M.I. A Lyapunov analysis of momentum methods in optimization. arXiv 2016, arXiv:1611.02635. [Google Scholar]
- Liu, C.; Belkin, M. Accelerating SGD with momentum for over-parameterized learning. arXiv 2018, arXiv:1810.13395. [Google Scholar]
- Nesterov, Y.E. A method of solving a convex programming problem with convergence rate O(1/k^2). In Doklady Akademii Nauk; Russian Academy of Sciences: Moscow, Russia, 1983; Volume 269, pp. 543–547. [Google Scholar]
- Gao, S.; Pei, Z.; Zhang, Y.; Li, T. Bearing fault diagnosis based on adaptive convolutional neural network with Nesterov momentum. IEEE Sens. J. 2021, 21, 9268–9276. [Google Scholar] [CrossRef]
- Xie, X.; Zhou, P.; Li, H.; Lin, Z.; Yan, S. Adan: Adaptive Nesterov momentum algorithm for faster optimizing deep models. arXiv 2022, arXiv:2208.06677. [Google Scholar] [CrossRef]
- GitHub—Adan. Available online: https://github.com/sail-sg/Adan (accessed on 22 March 2024).
- Guan, L. AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on AdamW Basis. In Proceedings of the ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14 April 2024; pp. 5210–5214. [Google Scholar]
- GitHub—AdaPlus. Available online: https://github.com/guanleics/AdaPlus (accessed on 22 March 2024).
- Duchi, J.; Hazan, E.; Singer, Y. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 2011, 12, 2121–2159. [Google Scholar]
- Zhang, N.; Lei, D.; Zhao, J.F. An improved Adagrad gradient descent optimization algorithm. In Proceedings of the 2018 Chinese Automation Congress (CAC), Xi’an, China, 25 June 2018; pp. 2359–2362. [Google Scholar]
- Gaiceanu, T.; Pastravanu, O. On CNN Applied to Speech-to-Text–Comparative Analysis of Different Gradient Based Optimizers. In Proceedings of the 2021 IEEE 15th International Symposium on Applied Computational Intelligence and Informatics (SACI), Virtual, 19 May 2021; pp. 000085–000090. [Google Scholar]
- Zeiler, M.D. Adadelta: An adaptive learning rate method. arXiv 2012, arXiv:1212.5701. [Google Scholar]
- Sethi, B.; Goel, R. Exploring Adaptive Learning Methods for Convex Optimization. 2015. Available online: https://www.deepmusings.net/assets/AML_Project_Report.pdf (accessed on 20 February 2024).
- Guo, J.; Baharvand, A.; Tazeddinova, D.; Habibi, M.; Safarpour, H.; Roco-Videla, A.; Selmi, A. An intelligent computer method for vibration responses of the spinning multi-layer symmetric nanosystem using multi-physics modeling. Eng. Comput. 2022, 38 (Suppl. S5), 4217–4238. [Google Scholar] [CrossRef]
- Agarwal, A.K.; Kiran, V.; Jindal, R.K.; Chaudhary, D.; Tiwari, R.G. Optimized Transfer Learning for Dog Breed Classification. Int. J. Intell. Syst. Appl. Eng. 2022, 10, 18–22. [Google Scholar]
- Hinton, G.; Srivastava, N.; Swersky, K. Neural Networks for Machine Learning, Lecture 6a: Overview of Mini-Batch Gradient Descent; Lecture Notes; University of Toronto: Toronto, ON, Canada, 2012. [Google Scholar]
- Huk, M. Stochastic optimization of contextual neural networks with RMSprop. In Intelligent Information and Database Systems: 12th Asian Conference, ACIIDS 2020, Phuket, Thailand, March 23–26, 2020, Proceedings, Part II 12; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 343–352. [Google Scholar]
- Elshamy, R.; Abu-Elnasr, O.; Elhoseny, M.; Elmougy, S. Improving the efficiency of RMSProp optimizer by utilizing Nestrove in deep learning. Sci. Rep. 2023, 13, 8814. [Google Scholar] [CrossRef] [PubMed]
- Funk, S. RMSprop Loses to SMORMS3-Beware the Epsilon! Available online: http://sifter.org/simon/journal/20150420 (accessed on 22 March 2024).
- Rossbroich, J.; Gygax, J.; Zenke, F. Fluctuation-driven initialization for spiking neural network training. Neuromorphic Comput. Eng. 2022, 2, 044016. [Google Scholar] [CrossRef]
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Llugsi, R.; El Yacoubi, S.; Fontaine, A.; Lupera, P. Comparison between Adam, AdaMax and AdamW optimizers to implement a weather forecast based on neural networks for the Andean city of Quito. In Proceedings of the 2021 IEEE Fifth Ecuador Technical Chapters Meeting (ETCM), Cuenca, Ecuador, 12 October 2021; pp. 1–6. [Google Scholar]
- Bellido-Jiménez, J.A.; Estévez, J.; García-Marín, A.P. New machine learning approaches to improve reference evapotranspiration estimates using intra-daily temperature-based variables in a semi-arid region of Spain. Agric. Water Manag. 2021, 245, 106558. [Google Scholar] [CrossRef]
- Rozante, J.R.; Ramirez, E.; Ramirez, D.; Rozante, G. Improved frost forecast using machine learning methods. Artif. Intell. Geosci. 2023, 4, 164–181. [Google Scholar] [CrossRef]
- Shafie, M.R.; Khosravi, H.; Farhadpour, S.; Das, S.; Ahmed, I. A cluster-based human resources analytics for predicting employee turnover using optimized Artificial Neural Network and data augmentation. Decis. Anal. J. 2024, 11, 100461. [Google Scholar] [CrossRef]
- Ampofo, K.A.; Owusu, E.; Appati, J.K. Performance Evaluation of LSTM Optimizers for Long-Term Electricity Consumption Prediction. In Proceedings of the 2022 International Conference on Advancements in Smart, Secure and Intelligent Computing (ASSIC), Bhubaneswar, India, 19 September 2022; pp. 1–6. [Google Scholar]
- Aguilar, D.; Riofrio, D.; Benitez, D.; Perez, N.; Moyano, R.F. Text-based CAPTCHA vulnerability assessment using a deep learning-based solver. In Proceedings of the 2021 IEEE Fifth Ecuador Technical Chapters Meeting (ETCM), Cuenca, Ecuador, 12 October 2021; pp. 1–6. [Google Scholar]
- Indolia, S.; Nigam, S.; Singh, R. An optimized convolution neural network framework for facial expression recognition. In Proceedings of the 2021 Sixth International Conference on Image Information Processing (ICIIP), Shimla, India, 26 November 2021; Volume 6, pp. 93–98. [Google Scholar]
- Shuvo, M.M.H.; Hassan, O.; Parvin, D.; Chen, M.; Islam, S.K. An optimized hardware implementation of deep learning inference for diabetes prediction. In Proceedings of the 2021 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Virtual, 17 May 2021; pp. 1–6. [Google Scholar]
- Poorani, S.; Kalaiselvi, S.; Aarthi, N.; Agalya, S.; Malathy, N.R.; Abitha, M. Epileptic seizure detection based on hyperparameter optimization using EEG data. In Proceedings of the 2023 International Conference on Sustainable Computing and Data Communication Systems (ICSCDS), Erode, India, 23 March 2023; pp. 890–893. [Google Scholar]
- Acharya, T.; Annamalai, A.; Chouikha, M.F. Efficacy of CNN-bidirectional LSTM hybrid model for network-based anomaly detection. In Proceedings of the 2023 IEEE 13th Symposium on Computer Applications & Industrial Electronics (ISCAIE), Penang, Malaysia, 20 May 2023; pp. 348–353. [Google Scholar]
- Mavrogiorgos, K.; Kiourtis, A.; Mavrogiorgou, A.; Gucek, A.; Menychtas, A.; Kyriazis, D. Mitigating Bias in Time Series Forecasting for Efficient Wastewater Management. In Proceedings of the 2024 7th International Conference on Informatics and Computational Sciences (ICICoS), Semarang, Indonesia, 17 July 2024; pp. 185–190. [Google Scholar]
- Ying, X. An overview of overfitting and its solutions. In Journal of Physics: Conference Series; IOP Publishing: Bristol, UK, 2019; Volume 1168, p. 022022. [Google Scholar]
- Brinkmann, E.M.; Burger, M.; Rasch, J.; Sutour, C. Bias reduction in variational regularization. J. Math. Imaging Vis. 2017, 59, 534–566. [Google Scholar] [CrossRef]
- Domingos, P. A unified bias-variance decomposition. In Proceedings of the 17th International Conference on Machine Learning, Stanford, CA, USA, 29 June 2000; Morgan Kaufmann: San Francisco, CA, USA, 2000; pp. 231–238. [Google Scholar]
- Geman, S.; Bienenstock, E.; Doursat, R. Neural networks and the bias/variance dilemma. Neural Comput. 1992, 4, 1–58. [Google Scholar] [CrossRef]
- Neal, B.; Mittal, S.; Baratin, A.; Tantia, V.; Scicluna, M.; Lacoste-Julien, S.; Mitliagkas, I. A modern take on the bias-variance tradeoff in neural networks. arXiv 2018, arXiv:1810.08591. [Google Scholar]
- Osborne, M.R.; Presnell, B.; Turlach, B.A. On the lasso and its dual. J. Comput. Graph. Stat. 2000, 9, 319–337. [Google Scholar] [CrossRef]
- Melkumova, L.E.; Shatskikh, S.Y. Comparing Ridge and LASSO estimators for data analysis. Procedia Eng. 2017, 201, 746–755. [Google Scholar] [CrossRef]
- Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320. [Google Scholar] [CrossRef]
- Van Dyk, D.A.; Meng, X.L. The art of data augmentation. J. Comput. Graph. Stat. 2001, 10, 1–50. [Google Scholar] [CrossRef]
- Shorten, C.; Khoshgoftaar, T.M.; Furht, B. Text data augmentation for deep learning. J. Big Data 2021, 8, 101. [Google Scholar] [CrossRef] [PubMed]
- Feng, S.Y.; Gangal, V.; Wei, J.; Chandar, S.; Vosoughi, S.; Mitamura, T.; Hovy, E. A survey of data augmentation approaches for NLP. arXiv 2021, arXiv:2105.03075. [Google Scholar]
- Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
- Jaipuria, N. Deflating dataset bias using synthetic data augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14 June 2020; pp. 772–773. [Google Scholar]
- Kim, E.; Lee, J.; Choo, J. Biaswap: Removing dataset bias with bias-tailored swapping augmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10 October 2021; pp. 14992–15001. [Google Scholar]
- Iosifidis, V.; Ntoutsi, E. Dealing with bias via data augmentation in supervised learning scenarios. In Proceedings of the International Workshop on Bias in Information, Algorithms, and Systems (BIAS 2018); Bates, J., Clough, P.D., Jäschke, R., Eds.; 2018; p. 24. Available online: https://www.kbs.uni-hannover.de/~ntoutsi/papers/18.BIAS.pdf (accessed on 20 February 2024).
- McLaughlin, N.; Del Rincon, J.M.; Miller, P. Data-augmentation for reducing dataset bias in person re-identification. In Proceedings of the 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Karlsruhe, Germany, 25 August 2015; pp. 1–6. [Google Scholar]
- Prechelt, L. Early stopping-but when? In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2002; pp. 55–69. [Google Scholar]
- Li, M.; Soltanolkotabi, M.; Oymak, S. Gradient descent with early stopping is provably robust to label noise for overparameterized neural networks. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Virtual, 26 August 2020; pp. 4313–4324. [Google Scholar]
- Garbin, C.; Zhu, X.; Marques, O. Dropout vs. batch normalization: An empirical study of their impact to deep learning. Multimed. Tools Appl. 2020, 79, 12777–12815. [Google Scholar] [CrossRef]
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
- Hanson, S.; Pratt, L. Comparing biases for minimal network construction with back-propagation. Adv. Neural Inf. Process. Syst. 1988, 1, 177–185. [Google Scholar]
- Tessier, H.; Gripon, V.; Léonardon, M.; Arzel, M.; Hannagan, T.; Bertrand, D. Rethinking Weight Decay for Efficient Neural Network Pruning. J. Imaging 2022, 8, 64. [Google Scholar] [CrossRef] [PubMed]
- Nakamura, K.; Hong, B.W. Adaptive weight decay for deep neural networks. IEEE Access 2019, 7, 118857–118865. [Google Scholar] [CrossRef]
- Deshpande, S.; Shuttleworth, J.; Yang, J.; Taramonli, S.; England, M. PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets. Comput. Biol. Med. 2019, 105, 169–181. [Google Scholar] [CrossRef] [PubMed]
- Hekayati, J.; Rahimpour, M.R. Estimation of the saturation pressure of pure ionic liquids using MLP artificial neural networks and the revised isofugacity criterion. J. Mol. Liq. 2017, 230, 85–95. [Google Scholar] [CrossRef]
- Poernomo, A.; Kang, D.-K. Biased dropout and crossmap dropout: Learning towards effective dropout regularization in convolutional neural network. Neural Netw. 2018, 104, 60–67. [Google Scholar] [CrossRef] [PubMed]
- Krishnaveni, K. A novel framework using binary attention mechanism based deep convolution neural network for face emotion recognition. Meas. Sens. 2023, 30, 100881. [Google Scholar] [CrossRef]
- Li, X.; Grandvalet, Y.; Davoine, F.; Cheng, J.; Cui, Y.; Zhang, H.; Belongie, S.; Tsai, Y.-H.; Yang, M.-H. Transfer learning in computer vision tasks: Remember where you come from. Image Vis. Comput. 2020, 93, 103853. [Google Scholar] [CrossRef]
- Koeshidayatullah, A. Optimizing image-based deep learning for energy geoscience via an effortless end-to-end approach. J. Pet. Sci. Eng. 2022, 215, 110681. [Google Scholar] [CrossRef]
- Scardapane, S.; Comminiello, D.; Hussain, A.; Uncini, A. Group sparse regularization for deep neural networks. Neurocomputing 2017, 241, 81–89. [Google Scholar] [CrossRef]
- Deakin, M.; Bloomfield, H.; Greenwood, D.; Sheehy, S.; Walker, S.; Taylor, P.C. Impacts of heat decarbonization on system adequacy considering increased meteorological sensitivity. Appl. Energy 2021, 298, 117261. [Google Scholar] [CrossRef]
- Belaïd, F.; Roubaud, D.; Galariotis, E. Features of residential energy consumption: Evidence from France using an innovative multilevel modelling approach. Energy Policy 2019, 125, 277–285. [Google Scholar] [CrossRef]
- Kong A Siou, L.; Johannet, A.; Valérie, B.E.; Pistre, S. Optimization of the generalization capability for rainfall–runoff modeling by neural networks: The case of the Lez aquifer (southern France). Environ. Earth Sci. 2012, 65, 2365–2375. [Google Scholar] [CrossRef]
- Shimomura, Y.; Komukai, S.; Kitamura, T.; Sobue, T.; Yamasaki, S.; Kondo, T.; Mizuno, S.; Harada, K.; Doki, N.; Tanaka, M.; et al. Identifying the Optimal Conditioning Intensity of Hematopoietic Stem Cell Transplantation in Patients with Acute Myeloid Leukemia in Complete Remission. Blood 2023, 142, 2150. [Google Scholar] [CrossRef]
- Yoon, K.; You, H.; Wu, W.Y.; Lim, C.Y.; Choi, J.; Boss, C.; Ramadan, A.; Popovich, J.M., Jr.; Cholewicki, J.; Reeves, N.P.; et al. Regularized nonlinear regression for simultaneously selecting and estimating key model parameters: Application to head-neck position tracking. Eng. Appl. Artif. Intell. 2022, 113, 104974. [Google Scholar] [CrossRef]
- Lawrence, A.J.; Stahl, D.; Duan, S.; Fennema, D.; Jaeckle, T.; Young, A.H.; Dazzan, P.; Moll, J.; Zahn, R. Neurocognitive measures of self-blame and risk prediction models of recurrence in major depressive disorder. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 2022, 7, 256–264. [Google Scholar] [CrossRef]
- Kauttonen, J.; Hlushchuk, Y.; Tikka, P. Optimizing methods for linking cinematic features to fMRI data. NeuroImage 2015, 110, 136–148. [Google Scholar] [CrossRef] [PubMed]
- Algamal, Z.Y.; Lee, M.H. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification. Comput. Biol. Med. 2015, 67, 136–145. [Google Scholar] [CrossRef] [PubMed]
- Hussain, S.; Anwar, S.M.; Majid, M. Segmentation of glioma tumors in brain using deep convolutional neural network. Neurocomputing 2018, 282, 248–261. [Google Scholar] [CrossRef]
- Peng, H.; Gong, W.; Beckmann, C.F.; Vedaldi, A.; Smith, S.M. Accurate brain age prediction with lightweight deep neural networks. Med. Image Anal. 2021, 68, 101871. [Google Scholar] [CrossRef]
- Vidya, B.; Sasikumar, P. Parkinson’s disease diagnosis and stage prediction based on gait signal analysis using EMD and CNN–LSTM network. Eng. Appl. Artif. Intell. 2022, 114, 105099. [Google Scholar] [CrossRef]
- Zhong, R.; Xie, X.; Luo, J.; Pan, T.; Lam, W.; Sumalee, A. Modeling double time-scale travel time processes with application to assessing the resilience of transportation systems. Transp. Res. Part B Methodol. 2020, 132, 228–248. [Google Scholar] [CrossRef]
- Jenelius, E. Personalized predictive public transport crowding information with automated data sources. Transp. Res. Part C Emerg. Technol. 2020, 117, 102647. [Google Scholar] [CrossRef]
- Tang, L.; Zhou, L.; Song, P.X.-K. Distributed simultaneous inference in generalized linear models via confidence distribution. J. Multivar. Anal. 2019, 176, 104567. [Google Scholar] [CrossRef] [PubMed]
- Ma, W.; Qian, Z. Statistical inference of probabilistic origin-destination demand using day-to-day traffic data. Transp. Res. Part C Emerg. Technol. 2018, 88, 227–256. [Google Scholar] [CrossRef]
- Wu, J.; Zou, D.; Braverman, V.; Gu, Q. Direction matters: On the implicit bias of stochastic gradient descent with moderate learning rate. arXiv 2020, arXiv:2011.02538. [Google Scholar]
- Yildirim, H.; Özkale, M.R. The performance of ELM based ridge regression via the regularization parameters. Expert Syst. Appl. 2019, 134, 225–233. [Google Scholar] [CrossRef]
- Abdulhafedh, A. Comparison between common statistical modeling techniques used in research, including: Discriminant analysis vs. logistic regression, ridge regression vs. LASSO, and decision tree vs. random forest. OALib 2022, 9, 1–19. [Google Scholar] [CrossRef]
- Slatton, T.G. A Comparison of Dropout and Weight Decay for Regularizing Deep Neural Networks; University of Arkansas: Fayetteville, AR, USA, 2014. [Google Scholar]
- Holroyd, J.; Scaife, R.; Stafford, T. What is implicit bias? Philos. Compass 2017, 12, e12437. [Google Scholar] [CrossRef]
- Oswald, M.E.; Grosjean, S. Confirmation bias. In Cognitive Illusions: A Handbook on Fallacies and Biases in Thinking, Judgement and Memory; Pohl, R.F., Ed.; Psychology Press: Hove, UK, 2004; pp. 79–96. [Google Scholar]
- Winter, L.C. Mitigation and Prediction of the Confirmation Bias in Intelligence Analysis. 2017. Available online: https://www.researchgate.net/profile/Lisa-Christina-Winter/publication/321309639_Mitigation_and_Prediction_of_the_Confirmation_Bias_in_Intelligence_Analysis/links/5b92513aa6fdccfd541fe3e0/Mitigation-and-Prediction-of-the-Confirmation-Bias-in-Intelligence (accessed on 20 February 2024).
- Heuer, R.J. Psychology of Intelligence Analysis; Center for the Study of Intelligence: Washington, DC, USA, 1999. Available online: https://books.google.gr/books?hl=en&lr=&id=rRXFhKAiG8gC&oi=fnd&pg=PR7&dq=Psychology+of+Intelligence+Analysis&ots=REPkPSAYsO&sig=EghU1UDFes1BiaFHTpdYyOvWNng&redir_esc=y#v=onepage&q=Psychology%20of%20Intelligence%20Analysis&f=false (accessed on 20 February 2024).
- Lord, C.G.; Lepper, M.R.; Preston, E. Considering the opposite: A corrective strategy for social judgment. J. Personal. Soc. Psychol. 1984, 47, 1231. [Google Scholar] [CrossRef] [PubMed]
- Romano, S.; Fucci, D.; Scanniello, G.; Baldassarre, M.T.; Turhan, B.; Juristo, N. On researcher bias in Software Engineering experiments. J. Syst. Softw. 2021, 182, 111068. [Google Scholar] [CrossRef]
- Biderman, S.; Scheirer, W.J. Pitfalls in Machine Learning Research: Reexamining the Development Cycle. 2020. Available online: https://proceedings.mlr.press/v137/biderman20a (accessed on 20 February 2024).
- Pinto, N.; Doukhan, D.; DiCarlo, J.J.; Cox, D.D. A high-throughput screening approach to discovering good forms of biologically inspired visual representation. PLoS Comput. Biol. 2009, 5, e1000579. [Google Scholar] [CrossRef] [PubMed]
- Liu, X.; Faes, L.; Kale, A.U.; Wagner, S.K.; Fu, D.J.; Bruynseels, A.; Mahendiran, T.; Moraes, G.; Shamdas, M.; Kern, C.; et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: A systematic review and meta-analysis. Lancet Digit. Health 2019, 1, e271–e297. [Google Scholar] [CrossRef] [PubMed]
- Bellamy, R.K.; Dey, K.; Hind, M.; Hoffman, S.C.; Houde, S.; Kannan, K.; Lohia, P.; Martino, J.; Mehta, S.; et al. AI Fairness 360: An extensible toolkit for detecting and mitigating algorithmic bias. IBM J. Res. Dev. 2019, 63, 4:1–4:15. [Google Scholar] [CrossRef]
- Bird, S.; Dudík, M.; Edgar, R.; Horn, B.; Lutz, R.; Milan, V.; Sameki, M.; Wallach, H.; Walker, K. Fairlearn: A Toolkit for Assessing and Improving Fairness in AI. Microsoft, Tech. Rep. MSR-TR-2020-32. 2020. Available online: https://www.microsoft.com/en-us/research/uploads/prod/2020/05/Fairlearn_WhitePaper-2020-09-22.pdf (accessed on 15 May 2024).
- Johnson, B.; Brun, Y. Fairkit-learn: A fairness evaluation and comparison toolkit. In Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, Pittsburgh, PA, USA, 21 October 2022; pp. 70–74. [Google Scholar]
- Hufthammer, K.T.; Aasheim, T.H.; Ånneland, S.; Brynjulfsen, H.; Slavkovik, M. Bias mitigation with AIF360: A comparative study. In Proceedings of the NIKT: Norsk IKT-konferanse for forskning og utdanning, Virtual, 24 November 2020. [Google Scholar]
- Weerts, H.; Dudík, M.; Edgar, R.; Jalali, A.; Lutz, R.; Madaio, M. Fairlearn: Assessing and improving fairness of AI systems. arXiv 2023, arXiv:2303.16626. [Google Scholar]
- Gu, J.; Oelke, D. Understanding bias in machine learning. arXiv 2019, arXiv:1909.01866. [Google Scholar]
- Sengupta, E.; Garg, D.; Choudhury, T.; Aggarwal, A. Techniques to eliminate human bias in machine learning. In Proceedings of the 2018 International Conference on System Modeling & Advancement in Research Trends (SMART), Moradabad, India, 23 November 2018; pp. 226–230. [Google Scholar]
- Hort, M.; Chen, Z.; Zhang, J.M.; Harman, M.; Sarro, F. Bias mitigation for machine learning classifiers: A comprehensive survey. ACM J. Responsib. Comput. 2023, 1, 1–52. [Google Scholar] [CrossRef]
- Pagano, T.F.; Loureiro, R.B.; Lisboa, F.V.; Peixoto, R.M.; Guimarães, G.A.; Cruz, G.O.; Araujo, M.M.; Santos, L.L.; Cruz, M.A.; Oliveira, E.L.; et al. Bias and unfairness in machine learning models: A systematic review on datasets, tools, fairness metrics, and identification and mitigation methods. Big Data Cogn. Comput. 2023, 7, 15. [Google Scholar]
- Suri, J.S.; Bhagawati, M.; Paul, S.; Protogeron, A.; Sfikakis, P.P.; Kitas, G.D.; Khanna, N.N.; Ruzsa, Z.; Sharma, A.M.; Saxena, S.; et al. Understanding the bias in machine learning systems for cardiovascular disease risk assessment: The first of its kind review. Comput. Biol. Med. 2022, 142, 105204. [Google Scholar]
- Li, F.; Wu, P.; Ong, H.H.; Peterson, J.F.; Wei, W.Q.; Zhao, J. Evaluating and mitigating bias in machine learning models for cardiovascular disease prediction. J. Biomed. Inform. 2023, 138, 104294. [Google Scholar] [CrossRef] [PubMed]
- Zhang, K.; Khosravi, B.; Vahdati, S.; Faghani, S.; Nugen, F.; Rassoulinejad-Mousavi, S.M.; Moassefi, M.; Jagtap, J.M.M.; Singh, Y.; Rouzrokh, P.; et al. Mitigating bias in radiology machine learning: 2. Model development. Radiol. Artif. Intell. 2022, 4, e220010. [Google Scholar] [CrossRef] [PubMed]
- Mavrogiorgou, A.; Kleftakis, S.; Mavrogiorgos, K.; Zafeiropoulos, N.; Menychtas, A.; Kiourtis, A.; Maglogiannis, I.; Kyriazis, D. beHEALTHIER: A microservices platform for analyzing and exploiting healthcare data. In Proceedings of the 34th International Symposium on Computer-Based Medical Systems, Virtual, 7 June 2021; pp. 283–288. [Google Scholar]
- Biran, O.; Feder, O.; Moatti, Y.; Kiourtis, A.; Kyriazis, D.; Manias, G.; Mavrogiorgou, A.; Sgouros, N.M.; Barata, M.T.; Oldani, I.; et al. PolicyCLOUD: A prototype of a cloud serverless ecosystem for policy analytics. Data Policy 2022, 4. [Google Scholar] [CrossRef]
- Kiourtis, A.; Poulakis, Y.; Karamolegkos, P.; Karabetian, A.; Voulgaris, K.; Mavrogiorgou, A.; Kyriazis, D. Diastema: Data-driven stack for big data applications management and deployment. Int. J. Big Data Manag. 2023, 3, 1–27. [Google Scholar] [CrossRef]
- Reščič, N.; Alberts, J.; Altenburg, T.M.; Chinapaw, M.J.; De Nigro, A.; Fenoglio, D.; Gjoreski, M.; Gradišek, A.; Jurak, G.; Kiourtis, A.; et al. SmartCHANGE: AI-based long-term health risk evaluation for driving behaviour change strategies in children and youth. In Proceedings of the International Conference on Applied Mathematics & Computer Science, Lefkada Island, Greece, 8 August 2023; pp. 81–89. [Google Scholar]
- Mavrogiorgou, A.; Kiourtis, A.; Makridis, G.; Kotios, D.; Koukos, V.; Kyriazis, D.; Soldatos, J.; Fatouros, G.; Drakoulis, D.; Maló, P.; et al. FAME: Federated Decentralized Trusted Data Marketplace for Embedded Finance. In Proceedings of the International Conference on Smart Applications, Communications and Networking, Istanbul, Turkey, 25 July 2023; pp. 1–6. [Google Scholar]
- Manias, G.; Apostolopoulos, D.; Athanassopoulos, S.; Borotis, S.; Chatzimallis, C.; Chatzipantelis, T.; Compagnucci, M.C.; Draksler, T.Z.; Fournier, F.; Goralczyk, M.; et al. AI4Gov: Trusted AI for Transparent Public Governance Fostering Democratic Values. In Proceedings of the 2023 19th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), Pafos, Cyprus, 19 June 2023; pp. 548–555. [Google Scholar]
| Section | Description |
|---|---|
| Section 1 | Introduction to the scope of this manuscript and the overall work conducted |
| Section 2 | Methodology followed to carry out the comprehensive literature review |
| Section 3 | Analysis of approaches for identifying and mitigating data bias |
| Section 3.1 | Analysis of approaches for identifying cognitive bias in data |
| Section 3.2 | Analysis of approaches for identifying selection bias in data |
| Section 3.3 | Analysis of approaches for identifying reporting bias in data |
| Section 3.4 | Summary of the most common approaches for mitigating data bias, based on the findings of Section 3.1, Section 3.2 and Section 3.3 |
| Section 4 | Analysis of approaches for identifying and mitigating algorithm bias |
| Section 4.1 | Review of the different estimators that have been utilized in the literature, across diverse domains, and the ways in which they may introduce bias |
| Section 4.2 | Review of the different optimizers that have been utilized in the literature, across diverse domains, and the ways in which they may introduce bias |
| Section 4.3 | Review of the different regularization methods that have been utilized in the literature, across diverse domains, and the ways in which they may introduce bias |
| Section 4.4 | Summary of the most common approaches for mitigating algorithm bias, based on the findings of Section 4.1, Section 4.2 and Section 4.3 |
| Section 5 | Analysis of approaches for identifying and mitigating engineer bias |
| Section 6 | Discussion of the findings of this manuscript and comparison to other literature reviews about bias |
| Section 7 | Summary of the findings of this manuscript and future research directions with regard to bias in ML |
Bias Type | Publications Database | Search Query |
---|---|---|
Reporting | ACM | [[Abstract: “reporting bias”] OR [Abstract: “reporting biases”]] AND [Abstract: “machine learning”] AND [[Abstract: mitigation] OR [Abstract: mitigating] OR [Abstract: identifying] OR [Abstract: identification] OR [Abstract: addressing]] |
IEEE Xplore | ((“Abstract”:reporting bias) OR (“Abstract”:reporting biases)) AND (“Abstract”:machine learning) AND ((“Abstract”:mitigation) OR (“Abstract”:mitigating) OR (“Abstract”:identifying) OR (“Abstract”:identification) OR (“Abstract”:addressing)) | |
Scopus | TITLE-ABS-KEY ((“reporting bias” OR “reporting biases”) AND “machine learning” AND (mitigation OR mitigating OR identifying OR identification OR addressing)) | |
Science Direct | Title, abstract, keywords: ((“reporting bias” OR “reporting biases”) AND “machine learning” AND (mitigation OR mitigating OR identifying OR addressing)) | |
Selection | ACM | [[Abstract: “selection bias”] OR [Abstract: “selection biases”]] AND [Abstract: “machine learning”] AND [[Abstract: mitigation] OR [Abstract: mitigating] OR [Abstract: identifying] OR [Abstract: identification] OR [Abstract: addressing]] |
IEEE Xplore | ((“Abstract”:selection bias) OR (“Abstract”:selection biases)) AND (“Abstract”:machine learning) AND ((“Abstract”:mitigation) OR (“Abstract”:mitigating) OR (“Abstract”:identifying) OR (“Abstract”:identification) OR (“Abstract”:addressing)) | |
Scopus | TITLE-ABS-KEY ((“selection bias” OR “selection biases”) AND “machine learning” AND (mitigation OR mitigating OR identifying OR identification OR addressing)) | |
Science Direct | Title, abstract, keywords: ((“cognitive bias” OR “cognitive biases”) AND “machine learning” AND (mitigation OR mitigating OR identifying OR addressing)) | |
Cognitive | ACM | [[Abstract: “cognitive bias”] OR [Abstract: “cognitive biases”]] AND [Abstract: “machine learning”] AND [[Abstract: mitigation] OR [Abstract: mitigating] OR [Abstract: identifying] OR [Abstract: identification] OR [Abstract: addressing]] |
 | IEEE Xplore | ((“Abstract”:cognitive bias) OR (“Abstract”:cognitive biases)) AND (“Abstract”:machine learning) AND ((“Abstract”:mitigation) OR (“Abstract”:mitigating) OR (“Abstract”:identifying) OR (“Abstract”:identification) OR (“Abstract”:addressing)) |
 | Scopus | TITLE-ABS-KEY ((“cognitive bias” OR “cognitive biases”) AND “machine learning” AND (mitigation OR mitigating OR identifying OR addressing)) |
 | Science Direct | Title, abstract, keywords: ((“cognitive bias” OR “cognitive biases”) AND “machine learning” AND (mitigation OR mitigating OR identifying OR addressing)) |
Estimators | ACM | [[Abstract: “comparative study”] OR [Abstract: “review”]] AND [Abstract: “machine learning”] AND [Abstract: “algorithm”] AND [Abstract: “bias”] |
 | IEEE Xplore | ((“Abstract”:”comparative study” OR “Abstract”:”review”) AND (“Abstract”:”machine learning”) AND (“Abstract”:”algorithm”) AND (“Abstract”:”bias”)) |
 | Scopus | TITLE-ABS-KEY (“comparative study” AND “machine learning” AND “algorithm” AND “bias”) AND (LIMIT-TO (SUBJAREA, “COMP”)) |
 | Science Direct | Title, abstract, keywords: (“comparative study” AND “machine learning” AND “algorithm” AND “bias”) |
Optimizers | ACM | [Abstract: “optimizer”] AND [Abstract: “hyperparameter”] AND [Abstract: “machine learning”] AND [Abstract: “neural network”] AND [Abstract: “bias”]
 | IEEE Xplore | ((“Abstract”:optimizer) AND (“Abstract”:hyperparameter) AND (“Abstract”:machine learning) AND (“Abstract”:neural network) AND (“Abstract”:bias)) |
 | Scopus | TITLE-ABS-KEY (“optimizer” AND “hyperparameter” AND “machine learning” AND “neural network” AND “bias”) |
 | Science Direct | Title, abstract, keywords: (“optimizer” AND “hyperparameter” AND “machine learning” AND “neural network” AND “bias”) |
Regularization | ACM | [Abstract: “regularization”] AND [Abstract: “machine learning”] AND [Abstract: “bias”] AND [[Abstract: “lasso”] OR [Abstract: “ridge”] OR [Abstract: “elastic net”] OR [Abstract: “data augmentation”] OR [Abstract: “early stopping”] OR [Abstract: “dropout”] OR [Abstract: “weight decay”]] |
 | IEEE Xplore | ((“Abstract”:”regularization”) AND (“Abstract”:”machine learning”) AND ((“Abstract”:”lasso”) OR (“Abstract”:”ridge”) OR (“Abstract”:”elastic net”) OR (“Abstract”:”data augmentation”) OR (“Abstract”:”early stopping”) OR (“Abstract”:”dropout”) OR (“Abstract”:”weight decay”))) |
 | Scopus | TITLE-ABS-KEY (“regularization” AND “machine learning” AND “bias” AND (“lasso” OR “ridge” OR “elastic net” OR “data augmentation” OR “early stopping” OR “dropout” OR “weight decay”)) |
 | Science Direct | Title, abstract, keywords: (“regularization” AND “machine learning” AND “bias” AND (“lasso” OR “ridge” OR “elastic net” OR “data augmentation” OR “early stopping” OR “dropout” OR “weight decay”)) |
ID | Year | Ref. | Domain | Data Category | Type |
---|---|---|---|---|---|
CB1 | 2023 | [25] | NLP (Generic) | Text | Identification |
CB2 | 2023 | [27] | NLP (Generic) | Text | Identification |
CB3 | 2020 | [28] | Generic | Not specific | Identification |
CB4 | 2024 | [29] | NLP (Business) | Text | Identification and Mitigation |
CB5 | 2020 | [30] | NLP (Business) | Text | Identification and Mitigation |
CB6 | 2024 | [31] | Generic | Not specific | Identification and Mitigation |
CB7 | 2022 | [32] | Computer Vision | Images | Identification and Mitigation |
CB8 | 2024 | [33] | NLP (Health) | Text | Identification and Mitigation |
ID | Year | Ref. | Domain | Data Category | Type |
---|---|---|---|---|---|
SB1 | 2019 | [39] | Environment | Time series | Identification |
SB2 | 2023 | [40] | Environment | Observations (numeric) | Identification |
SB3 | 2021 | [41] | Environment | Time series | Identification and Mitigation |
SB4 | 2018 | [42] | Computer Vision | Images | Identification and Mitigation |
SB5 | 2022 | [43] | Computer Vision | Images | Identification and Mitigation |
SB6 | 2023 | [44] | Computer Vision (Health) | Images | Identification and Mitigation |
SB7 | 2019 | [45] | Health | EHR data (numeric and text) | Identification and Mitigation |
SB8 | 2023 | [46] | Health | Mixed (text and numeric) | Identification and Mitigation |
SB9 | 2023 | [47] | Health | Numeric | Identification and Mitigation |
SB10 | 2023 | [48] | Health | Text (NLP—Q&A) | Identification and Mitigation |
SB11 | 2021 | [49] | Health | Mixed (text and numeric) | Identification and Mitigation |
SB12 | 2017 | [50] | Not specific | Not specific | Identification and Mitigation |
SB13 | 2023 | [51] | Several (binary classification) | Mixed (text and numeric) | Identification |
ID | Year | Ref. | Domain | Data Category | Type |
---|---|---|---|---|---|
RB1 | 2022 | [59] | Not specific | Text | Identification and Mitigation |
RB2 | 2023 | [60] | Computer Vision | Text and Images | Identification and Mitigation |
RB3 | 2021 | [61] | Computer Vision | Images | Identification and Mitigation |
RB4 | 2016 | [62] | Computer Vision | Images | Identification and Mitigation |
RB5 | 2021 | [63] | Facial Recognition | Images | Identification and Mitigation |
RB6 | 2018 | [64] | Security | Numeric | Identification and Mitigation |
RB7 | 2019 | [65] | Security (Malware Detection) | Numeric | Identification and Mitigation |
RB8 | 2023 | [66] | Health | Numeric | Identification and Mitigation |
RB9 | 2023 | [67] | Health | Numeric | Identification and Mitigation |
RB10 | 2020 | [68] | Health | Numeric | Identification and Mitigation |
RB11 | 2023 | [69] | Education | Numeric | Identification and Mitigation |
RB12 | 2020 | [70] | Education | Numeric | Identification and Mitigation |
RB13 | 2019 | [71] | Aerospace | Prognostic Data | Identification and Mitigation |
RB14 | 2017 | [72] | Biology | Mixed | Identification and Mitigation |
RB15 | 2023 | [73] | Environment | Time series data | Identification and Mitigation |
RB16 | 2023 | [74] | Generic | Mixed | Identification and Mitigation |
RB17 | 2021 | [75] | Finance | Numeric | Identification and Mitigation |
RB18 | 2018 | [76] | Social media | Text | Identification |
Algorithm (Estimator) Name | Abbreviation | Description |
---|---|---|
Autoregressive Integrated Moving Average | ARIMA | Mostly used in time series forecasting, where the current value of the series is explained as a function of past values [77] |
Adaptive Neuro-Fuzzy Inference System | ANFIS | A type of neural network based on the Takagi-Sugeno fuzzy inference system [78] |
Bee Colony Optimization | BCO | A multi-agent system for solving complex optimization problems [79] |
Bidirectional Long Short-Term Memory Neural Network | Bi-LSTM NN | A variant of the LSTM neural network that enables additional training by traversing the input sequence in both directions [80] |
Cascade Support Vector Machine | C-SVM | An ensemble ML technique consisting of several SVMs stacked in a cascade [81] |
C4.5 Classifier | C4.5 | A decision tree classifier [82] |
Classification and Regression Tree | CART | A decision tree algorithm that can be used for both classification and regression tasks [83] |
CNN-Bi-LSTM | CNN-Bi-LSTM | A Bi-LSTM NN that also utilizes a convolutional layer [84] |
Conditional Random Fields | CRF | A statistical modeling method, mostly used in pattern recognition and ML for structured prediction, that models dependencies between predictions as a graph [85] |
Convolutional Long Short-Term Memory | CLSTM | An LSTM NN that also utilizes a convolutional layer [86] |
Convolutional Neural Network | CNN | A feed-forward NN with at least one convolutional layer, widely used in computer vision tasks [87] |
Cox proportional hazards model | Cox regression | A type of survival model; such models associate the time before an event occurs with one or more covariates [88] |
Cubist | Cubist | A rule-based model containing a tree whose final leaves use linear regression models [89] |
Decision Trees | DT | A fundamental ML algorithm that uses a tree-like model of decisions and can be used for both regression and classification tasks [90] |
Deep Neural Network | DNN | An NN with multiple hidden layers between the input and the output [91] |
Double Deep Q-Network | DDQN | A reinforcement learning algorithm that consists of two separate Q-networks [92] |
Exponential smoothing methods | ESM | Time series forecasting methods that predict a future value using an exponentially weighted average of past values [93] |
Extra Trees | ET | An ensemble ML approach that utilizes multiple randomized decision trees [94] |
Extreme Gradient Boosting | XGBoost | Iteratively combines the predictions of multiple individual models, usually DTs [95] |
Gene Expression Programming | GEP | An evolutionary algorithm consisting of complex trees that adapt and alter their structures [96] |
Gradient Boosted Regression Trees | GBRT | An additive model that makes predictions by combining the decisions of a set of other models [97] |
Gated Recurrent Unit Networks | GRU | A type of recurrent neural network, similar to the LSTM, that uses fewer parameters and thus has a lower computational cost [98] |
k-Nearest Neighbors | KNN | A fundamental ML algorithm for classification and regression based on the proximity of data points [99] |
Light Gradient Boosting Machine | LightGBM | An ensemble method based on gradient boosting [100] |
Linear Discriminant Analysis | LDA | A supervised classification algorithm that seeks a linear combination of features that separates the classes in a dataset [101] |
Linear Regression | LinearR | A fundamental algorithm that estimates the linear relationship between two variables [102] |
Linear Support Vector Classification | LinearSVC | A subtype of SVM suited to linearly separable data [103] |
Logistic Regression | LogR | Used for estimating the parameters of a logistic model [104] |
Long Short-Term Memory | LSTM | A recurrent neural network that is particularly useful for time series forecasting, since it is capable of “remembering” long-term dependencies [105] |
Multi-Layer Perceptron neural network | MLP | A subtype of DNN [106] |
Multinomial Naive Bayes | MNB | A probabilistic classifier based on Bayes’ theorem, commonly applied to text data whose features follow a multinomial distribution [107] |
Neural Network | NN | A structure of artificial neurons that receive signals from connected neurons and produce an output that can be forwarded to other neurons [108] |
Random Forest | RF | A fundamental ML algorithm that aggregates the outputs of multiple decision trees for classification, regression and other tasks [109] |
Squeaky Wheel Optimization | SWO | An optimization technique based on a greedy algorithm [110] |
Seasonal and Trend decomposition using Loess | STL | A time series method that applies locally weighted (Loess) regression to decompose a series into separate components [111] |
Support Vector Machines | SVM | A classification algorithm that finds the optimal line or hyperplane maximizing the margin between the classes [112] |
Temporal Difference (lambda) | TD | A reinforcement learning method that shares characteristics with Monte Carlo and Dynamic Programming methods [113] |
Term Frequency-Inverse Document Frequency | TF-IDF | A measure of the relevance of a word within a document or an entire corpus [114] |
Value-Biased Stochastic Search | VBSS | A randomization method that introduces value-based stochastic bias into search decisions [115] |
Vector Autoregression | VAR | A model for multiple time series that influence each other [116] |
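As an illustration of the TF-IDF measure defined above, the following minimal stdlib-Python sketch computes unsmoothed TF-IDF weights for a toy corpus (the corpus and function name are illustrative, not drawn from the reviewed studies):

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Plain TF-IDF: term frequency times log inverse document frequency."""
    n_docs = len(corpus)
    df = Counter()                      # number of documents containing each term
    for doc in corpus:
        df.update(set(doc))
    return [{term: (count / len(doc)) * math.log(n_docs / df[term])
             for term, count in Counter(doc).items()}
            for doc in corpus]

# "bias" appears in two of the three documents, so it receives a lower
# weight than the rarer term "mitigation" within the same document.
docs = [["bias", "in", "ml"], ["bias", "mitigation"], ["data", "quality"]]
weights = tf_idf(docs)
```

Library implementations typically add smoothing and normalization on top of this basic formula, so absolute values differ, but the relative ordering of common versus rare terms is the same.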
ID | Year | Ref. | Domain | Data Category | Estimators | Most Suitable Estimator |
---|---|---|---|---|---|---|
CS1 | 2022 | [117] | Agriculture | Spectra Data | RF, SVM, Cubist | Ensemble Method |
CS2 | 2009 | [118] | Automobiles | Mixed (text–numeric) | SWO, VBSS, BCO, TD | TD and SWO |
CS3 | 2016 | [119] | Biology | Numeric | SVM, KNN, DT | SVM |
CS4 | 2020 | [120] | Business | Text (reviews) | MNB, CLSTM | CLSTM |
CS5 | 2022 | [121] | Business | Numeric | RF, SVM, DNN | DNN |
CS6 | 2020 | [122] | Business | Time series | STL decomposition, ESM, ARIMA, LinearR, KNN, SVR, VAR, DT | VAR or ARIMA (based on missing values presence) |
CS7 | 2015 | [123] | Computer Vision | Remote Sensing | RF, ET, GBRT | ET |
CS8 | 2024 | [124] | Construction | Numeric | XGBoost, LightGBM, GEP | GEP |
CS9 | 2018 | [125] | Environment | Time series | CART, ANFIS, MLP NN | CART |
CS10 | 2021 | [126] | Environment | Seismic data | Dense NN, CNN, LSTM | CNN and LSTM |
CS11 | 2022 | [127] | Financial | Numeric | DDQN, SVM, KNN, RF, XGBoost | KNN |
CS12 | 2022 | [128] | Health | Numeric | LogR, NN, SGD, RF, NB, KNN, DT | LogR |
CS13 | 2021 | [129] | Health | Text | LinearSVC, LogR, MNB, RF | LogR with TF-IDF |
CS14 | 2020 | [130] | Health | Numeric | Cox regression, NN | NN |
CS15 | 2017 | [131] | Health | Numeric | SVM, NN, NB | SVM |
CS16 | 2021 | [132] | Health | Numeric | NN, C4.5 Classifier, NB, KNN, Logistic Classifier, SVM, DNN | SVM |
CS17 | 2020 | [133] | Health | Numeric | LogR, DT, RF, NN | RF |
CS18 | 2022 | [134] | Health | Numeric | SVM, DT | SVM |
CS19 | 2021 | [135] | Health | Images | Simple NN, LSTM | LSTM |
CS20 | 2023 | [136] | Sentiment Analysis | Text | CNN-Bi-LSTM, LSTM, Bi-LSTM, CNN, GRU, RF, SVM | CNN-Bi-LSTM |
CS21 | 2012 | [137] | Smart Homes | Sensor Data | C-SVM, CRF, LDA | C-SVM |
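The comparative studies in the table above evaluate several estimators under a common protocol before naming the most suitable one. A minimal, self-contained sketch of such a protocol, using leave-one-out evaluation of a 1-NN classifier against a majority-class baseline (the toy dataset and helper names are hypothetical):

```python
import math
from collections import Counter

# Toy 2-D dataset: two well-separated classes (illustrative only).
DATA = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"), ((0.1, 0.1), "A"),
        ((5.0, 5.0), "B"), ((5.1, 5.2), "B"), ((5.2, 5.1), "B"), ((5.1, 5.1), "B")]

def knn_predict(train, x, k=1):
    """Predict by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def majority_predict(train, x):
    """Baseline: always predict the most frequent training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]

def loo_accuracy(data, predict):
    """Leave-one-out accuracy: hold out each sample in turn, train on the rest."""
    hits = sum(predict(data[:i] + data[i + 1:], x) == y
               for i, (x, y) in enumerate(data))
    return hits / len(data)

knn_acc = loo_accuracy(DATA, knn_predict)
base_acc = loo_accuracy(DATA, majority_predict)
```

Evaluating all candidates with the same resampling scheme, as the studies above do, is what makes the "most suitable estimator" column comparable across rows.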
ID | Year | Ref. | Domain | Data Category | Optimizers | Most Suitable Optimizer |
---|---|---|---|---|---|---|
OCS1 | 2021 | [177] | Agriculture | Numeric | SGD, RMSprop, Adam | SGD |
OCS2 | 2023 | [178] | Agriculture | Time Series Data | Adam, SGD | SGD |
OCS3 | 2024 | [179] | Business | Numeric | Adam, SGD | Adam |
OCS4 | 2022 | [180] | Civil | Numeric | Adagrad, Adadelta, RMSProp, SGD, Nadam, Adam | SGD |
OCS5 | 2021 | [181] | Computer Vision | Images | Nadam, Adam, Adamax | Nadam |
OCS6 | 2021 | [182] | Computer Vision | Images | Adam, SGD | Adam |
OCS7 | 2021 | [183] | Health (diabetes) | Numeric | SGD, Adagrad, Adam, RMSProp | RMSProp |
OCS8 | 2023 | [184] | Health | Numeric | SGD, RMSProp | SGD |
OCS9 | 2023 | [185] | Security | Numeric | Adam, Adagrad, Adamax, Ftrl, Nadam, RMSProp, SGD | Nadam |
Optimizer | Recommended Usage |
---|---|
Adam | When training time needs to be reduced |
RMSProp | When memory requirements are not a concern |
SGD | When there is no strict time constraint |
Nadam | When Adam does not produce sufficient results |
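The update rules behind the optimizers compared above can be sketched in a few lines. The following pure-Python example minimizes a toy quadratic loss with plain gradient descent (the full-batch special case of SGD) and with Adam's bias-corrected moving averages; the loss, learning rates, and step counts are illustrative only:

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, minimized at w = 3."""
    return 2.0 * (w - 3.0)

def sgd(w, lr=0.1, steps=200):
    """Plain (full-batch) gradient descent: step against the raw gradient."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w, lr=0.1, steps=2000, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: bias-corrected moving averages of the gradient and its square."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w
```

On this convex toy problem both methods reach the minimum; the practical trade-offs in the table (speed, memory, stability) only emerge on the high-dimensional, noisy losses of real neural networks.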
ID | Year | Ref. | Domain | Data Category | Regularization Technique |
---|---|---|---|---|---|
RS1 | 2019 | [210] | Biology | Numeric | Lasso |
RS2 | 2017 | [211] | Chemistry | Numeric | Early stopping |
RS3 | 2018 | [212] | Computer Vision | Images | Dropout |
RS4 | 2023 | [213] | Computer Vision | Images | Weight Decay and dropout |
RS5 | 2020 | [214] | Computer Vision | Images | Weight Decay |
RS6 | 2022 | [215] | Computer Vision (Geology) | Images | Data augmentation |
RS7 | 2017 | [216] | Computer vision | Images | Lasso |
RS8 | 2021 | [217] | Energy | Time series Data | Lasso |
RS9 | 2019 | [218] | Energy | Numeric | Elastic Net |
RS10 | 2012 | [219] | Environment | Numeric | Early stopping and weight decay |
RS11 | 2023 | [220] | Health | Numeric | Lasso |
RS12 | 2022 | [221] | Health | Images | Lasso |
RS13 | 2022 | [222] | Health | Images | Elastic Net |
RS14 | 2015 | [223] | Health | Images | Elastic Net |
RS15 | 2015 | [224] | Health | Numeric | Elastic Net |
RS16 | 2018 | [225] | Health | Images | Dropout |
RS17 | 2021 | [226] | Health | Images | Data augmentation and dropout |
RS18 | 2022 | [227] | Health | Signals | L2 and Dropout |
RS19 | 2020 | [228] | Transport | Vehicle Data | Lasso |
RS20 | 2020 | [229] | Transport | Numeric | Lasso |
RS21 | 2020 | [230] | Transport | Numeric | Lasso and Elastic Net |
RS22 | 2018 | [231] | Networks | Numeric | Lasso |
RS23 | 2020 | [232] | Not Specific | Not Specific | Early stopping |
RS24 | 2019 | [233] | Not Specific | Numeric | L2 |
Regularization Method | Recommended Usage |
---|---|
Lasso | When most feature coefficients are expected to be zero (sparse solutions) |
Ridge | When most feature coefficients are expected to be non-zero |
Elastic Net | When a combination of Lasso and Ridge is needed |
Data Augmentation | Image and text data with an insufficient number of samples |
Dropout | Complex NN architectures and less complex data |
Weight Decay | When dropout is not suitable |
Early Stopping | Good practice, provided the threshold is set just before the model starts overfitting |
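The penalty-based methods in the table (Lasso, Ridge, Elastic Net) all trade a small amount of estimation bias for lower variance by shrinking coefficients toward zero. A minimal worked example of that shrinkage, using the closed-form one-feature Ridge slope (toy data, no intercept; everything here is illustrative):

```python
def slopes(x, y, lam):
    """One-feature least squares without intercept: the OLS slope and its
    L2-penalized (Ridge) counterpart, which shrinks toward zero as lam grows."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / sxx, sxy / (sxx + lam)

# Toy data generated by y = 2x: OLS recovers the slope 2 exactly,
# while the penalty pulls the Ridge estimate below it.
ols, ridge = slopes([1, 2, 3, 4], [2, 4, 6, 8], lam=10.0)
```

Lasso behaves analogously with an L1 penalty, except that its shrinkage can set coefficients exactly to zero, which is why the table recommends it when sparse solutions are expected.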
Mavrogiorgos, K.; Kiourtis, A.; Mavrogiorgou, A.; Menychtas, A.; Kyriazis, D. Bias in Machine Learning: A Literature Review. Appl. Sci. 2024, 14, 8860. https://doi.org/10.3390/app14198860