Machine Learning-Based Regression Framework to Predict Health Insurance Premiums
Abstract
1. Introduction
- Chatbots have become increasingly important to firms of every kind, and healthcare organisations are adopting the technology too. Because almost everyone has access to the Internet and a smartphone, chat applications make it much easier to interact with physicians, hospitals, and insurance companies. Chatbots are available 24 h a day, seven days a week, which human staff cannot match. They employ sentiment analysis and natural language processing to better understand consumers’ requests and to answer a variety of queries about insurance claims and product choices.
- Faster Claim Settlements: The time it takes to settle health insurance claims is one of the main difficulties for both policyholders and insurers. Delays may stem from lengthy manual processes or from fraudulent claims, and identifying valid claims by hand takes time and effort. AI has the potential to lower claim processing times significantly, because it can detect fraudulent claims and learn from previous data to improve efficiency.
- Personalised Health Insurance Policies: Insurers can use an individual’s past data and current health circumstances to identify and develop a health insurance plan for that person. This helps the insurer offer a suitable plan rather than a generic package that clients may not use efficiently. Customers are also encouraged to select a plan that meets their requirements rather than paying for services they may not use.
- Cost-effectiveness: Insurers are utilising AI to recommend good habits and behaviours to clients, such as exercise and diet, lowering the cost of avoidable healthcare expenditures caused by bad habits.
- Fraud Detection: Researchers are working on building machines that can evaluate health insurance claims and anticipate fraud. This also aids insurers in resolving legitimate claims more quickly.
- Faster Underwriting: The health insurance underwriting procedure is lengthy and time-consuming. Thanks to technological breakthroughs such as smart wearables, devices like fitness trackers can now collect and analyse vast amounts of data and share them with insurance companies. Insurers can use these data to find innovative ways of underwriting consumers. By adopting AI-based predictive analysis, health insurance firms may save time and money.
- Clinical Observation-Based Decisions: AI and machine learning can process vast volumes of data in real time and provide critical information that aids patient diagnosis and treatment recommendations. Evaluating patient data and delivering findings within a couple of minutes translates into improved healthcare services at a reduced cost. Diabetes or blood sugar devices, for example, may analyse data rather than merely report raw readings and alert users to patterns, allowing them to take immediate or corrective action.
- Increased Accessibility: While affluent countries can offer healthcare to the majority of their citizens, underdeveloped countries may struggle to do so. This is owing to a technological gap in healthcare, which lowers the health index of the affected country. Reaching individuals in the most remote parts of the globe is an important task, and the risk of healthcare deprivation is growing. By helping to establish an efficient healthcare system, AI can help alleviate this problem. Digital healthcare will help bridge the gap between poor and wealthy countries by allowing people to better understand their symptoms and obtain treatment as soon as possible.
- Helps Reveal Early Illness Risks: AI can evaluate enormous amounts of patient medical data and compile them in one location, which can help reveal early illness risks. Using this information, it can examine prior and current health issues. Doctors can compare the data and make an accurate diagnosis, allowing them to deliver the best therapy possible. With a large amount of data in one location, AI-powered healthcare applications can assess a wide range of symptoms, diagnose ailments, and potentially forecast future illnesses.
- Early Detection of Illness: Artificial intelligence can learn from data such as diagnoses, medical reports, and photographs. This helps detect the onset of ailments over time and supports preventative and mitigation measures.
- Artificial intelligence also saves time and money by reducing the effort required to evaluate and diagnose an ailment. Instead of the patient waiting for a doctor’s consultation to obtain a diagnosis, AI can analyse the case and offer accurate inputs to the doctor, allowing the doctor to make the best decision possible and minimising the time it takes to deliver early treatment. People may not need to visit many laboratories for diagnosis if AI can read and evaluate the condition.
- Expediting Processes: By streamlining visits, interpreting clinical notes, and recording patient notes and treatment plans, AI can assist clinicians in decreasing their administrative load. The benefits of AI in healthcare are numerous since it simplifies operations and offers reliable data in less time.
- Improve Drug Development: Drug development can take a long time, and pharmaceutical companies sometimes miss deadlines for delivering the proper formula. Thanks to AI, however, drug development has never been faster than it is now. AI allows scientists to concentrate on creating treatments that are both promising and relevant to the needs of patients. It saves time and money when creating medications that might save lives in an emergency.
- In undeveloped or neglected nations, healthcare access is limited.
- Electronic health records are less burdensome.
- Antibiotic resistance threats are being reduced.
- Insurance claims are processed faster.
- Plans for individual health insurance.
- The domain of insurance prediction is not fully explored and requires thorough research. Patients, hospitals, physicians, and insurance providers could benefit from the proposed machine learning model and accomplish their tasks faster and more efficiently.
- The authors trained an ANN-based regression model to predict health insurance premiums.
- The model was evaluated against key performance metrics, such as RMSE, MSE, MAE, r2, and adjusted r2.
- The overall accuracy of the proposed model was 92.72%, which corresponds to its r2 of 0.9272.
- A correlation matrix was plotted to visualise the relationships between the various factors and the charges.
2. Related Work
3. Research Methodology
3.1. Step 1: Performing the Data Analysis and Feature Engineering
3.2. Step 2: Data Visualisation
3.3. Step 3: Training and Evaluating a Linear Regression Model
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Literature Classification | Stage 1 | Stage 2 | Stage 3 | Stage 4 | Breakdown |
---|---|---|---|---|---|
Health insurance prediction | 54 | 38 | 23 | 10 | 22.04% |
Premium calculation | 53 | 37 | 22 | 10 | 21.63% |
Machine learning | 77 | 54 | 32 | 15 | 31.43% |
Artificial intelligence | 61 | 43 | 26 | 12 | 24.90% |
Neural networks | 245 | 172 | 103 | 46 |
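
The Breakdown column can be reproduced from the Stage 1 counts alone. The short sketch below assumes that each percentage is the class's Stage 1 count divided by 245, the Stage 1 value in the last row; the table does not state this explicitly, so treat it as an inference from the numbers. The variable names are illustrative.

```python
# Stage 1 counts per literature class, copied from the table above.
stage1 = {
    "Health insurance prediction": 54,
    "Premium calculation": 53,
    "Machine learning": 77,
    "Artificial intelligence": 61,
}

# Sum of the four Stage 1 counts; equals the Stage 1 value in the last row.
total = sum(stage1.values())

# Share of each class in the Stage 1 pool, as a percentage with two decimals.
breakdown = {name: round(100 * count / total, 2) for name, count in stage1.items()}

for name, share in breakdown.items():
    print(f"{name}: {share:.2f}%")
```

Under this reading, the last row holds the column totals for Stages 1 to 3 (245, 172, and 103). The four Stage 4 counts sum to 47, whereas the table reports 46.
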
Region | Age | BMI | Children | Charges |
---|---|---|---|---|
Northeast | 39.268519 | 29.173503 | 1.046296 | 13,406.384516 |
Northwest | 39.196923 | 29.199785 | 1.147692 | 12,417.575374 |
Southeast | 38.939560 | 33.355989 | 1.049451 | 14,735.411438 |
Southwest | 39.455385 | 30.596615 | 1.141538 | 12,346.937377 |
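
A per-region summary like the one above is a group-wise mean over the numeric columns. The following is a minimal sketch using only the standard library; the records are toy values made up for illustration (they are not rows from the paper's dataset), and the lowercase column names are assumed from the table headers.

```python
from collections import defaultdict
from statistics import mean

# Toy records for illustration only; NOT rows from the paper's dataset.
records = [
    {"region": "northeast", "age": 30, "bmi": 25.0, "children": 0, "charges": 4000.0},
    {"region": "northeast", "age": 50, "bmi": 31.0, "children": 2, "charges": 12000.0},
    {"region": "southeast", "age": 40, "bmi": 34.0, "children": 1, "charges": 15000.0},
]


def region_means(rows, columns=("age", "bmi", "children", "charges")):
    """Return {region: {column: mean value}} for the given numeric columns."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["region"]].append(row)
    return {
        region: {col: mean(member[col] for member in members) for col in columns}
        for region, members in grouped.items()
    }


means = region_means(records)
for region, stats in means.items():
    print(region, stats)
```

With a data-frame library the same summary is usually a one-line group-by followed by a mean; the explicit version here only makes the computation visible.
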
Evaluation Metrics | Value |
---|---|
RMSE | 0.499 |
MSE | 0.24908696 |
MAE | 0.3445451 |
r2 | 0.7509130368819994 |
adjusted r2 | 0.7494136420701529 |
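
For reference, the five metrics in this table can be computed from true and predicted values as sketched below. This uses the standard textbook definitions and is not the authors' code; the function name, the toy data, and the `n_features` argument are illustrative assumptions.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Compute MSE, RMSE, MAE, r2, and adjusted r2 for paired value lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n

    # r2 compares the residual sum of squares with the variance around the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    # Adjusted r2 penalises r2 for the number of predictors used.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2, "adj_r2": adj_r2}


# Toy values for illustration only; they are not taken from the paper.
metrics = regression_metrics(
    y_true=[1.0, 2.0, 3.0, 4.0, 5.0],
    y_pred=[1.1, 1.9, 3.2, 3.8, 5.0],
    n_features=2,
)
print(metrics)
```
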
Layer (Type) | Output Shape | Number of Parameters |
---|---|---|
dense (dense) | (None, 50) | 450 |
activation (activation) | (None, 50) | 0 |
dense_1 (dense) | (None, 150) | 7650 |
activation_1 (activation) | (None, 150) | 0 |
dense_2 (dense) | (None, 150) | 22,650 |
activation_2 (activation) | (None, 150) | 0 |
dense_3 (dense) | (None, 50) | 7550 |
activation_3 (activation) | (None, 50) | 0 |
dense_4 (dense) | (None, 1) | 51 |
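
The parameter counts in this summary follow from the usual rule for a fully connected layer: inputs times units for the weights, plus one bias per unit. The sketch below recomputes them from the layer widths in the table. The input width of 8 is an assumption inferred from the first layer (450 = 8 × 50 + 50); the section itself does not state the number of input features.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a dense layer: weight matrix plus biases."""
    return n_in * n_out + n_out


# Units of the five dense layers, taken from the table above.
layer_units = [50, 150, 150, 50, 1]

# Assumed input width, inferred from 450 = 8 * 50 + 50 (not stated in the text).
n_inputs = 8

counts = []
previous_width = n_inputs
for units in layer_units:
    counts.append(dense_params(previous_width, units))
    previous_width = units  # each layer's output feeds the next layer

print(counts)       # [450, 7650, 22650, 7550, 51]
print(sum(counts))  # 38351
```

Activation layers add no parameters, which is why they are listed with 0.
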
Evaluation Metrics | ANN Value | Linear Value |
---|---|---|
RMSE | 0.27 | 0.499 |
MSE | 0.07275635 | 0.24908696 |
MAE | 0.1432731 | 0.3445451 |
r2 | 0.9272436488919791 | 0.7509130368819994 |
adjusted r2 | 0.9268056874105162 | 0.7494136420701529 |
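
The reported values can be cross-checked against one another. The sketch below verifies that each RMSE is the square root of the corresponding MSE and recomputes the adjusted r2 from r2. The sample size of 1338 and the predictor count of 8 are assumptions, chosen because they reproduce the reported adjusted values; neither number appears in this section.

```python
import math

# Values copied from the comparison table above.
reported = {
    "ANN": {"rmse": 0.27, "mse": 0.07275635,
            "r2": 0.9272436488919791, "adj_r2": 0.9268056874105162},
    "Linear": {"rmse": 0.499, "mse": 0.24908696,
               "r2": 0.7509130368819994, "adj_r2": 0.7494136420701529},
}

# Assumptions, not stated in the text: chosen because they match the table.
N_SAMPLES = 1338
N_FEATURES = 8


def adjusted_r2(r2, n, k):
    """Adjusted r2 for n samples and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


for name, m in reported.items():
    rmse_from_mse = math.sqrt(m["mse"])
    adj_from_r2 = adjusted_r2(m["r2"], N_SAMPLES, N_FEATURES)
    print(f"{name}: sqrt(MSE) = {rmse_from_mse:.4f}, "
          f"recomputed adjusted r2 = {adj_from_r2:.6f}, "
          f"MSE + r2 = {m['mse'] + m['r2']:.6f}")
```

For both models, MSE and r2 sum to approximately 1. That would be consistent with the charges having been standardised to unit variance before evaluation, although this section does not confirm it.
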
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kaushik, K.; Bhardwaj, A.; Dwivedi, A.D.; Singh, R. Machine Learning-Based Regression Framework to Predict Health Insurance Premiums. Int. J. Environ. Res. Public Health 2022, 19, 7898. https://doi.org/10.3390/ijerph19137898