Using a New Data Mining Method for Automobile Insurance Fraud Detection: A Case Study by a Real Data From an Iranian Insurance Company
Subject Areas : International Journal of Mathematical Modelling & Computations
1 - Insurance Research Center, Tehran, Iran
Keywords: Fraud detection, Imbalanced data, XGBoost algorithm, Random Forest algorithm
Abstract :
Automobile insurance fraud is a major concern for insurers because it can cause substantial financial losses. Detecting suspicious claims early can therefore prevent much of this loss. Over the last decade, many studies have applied data mining techniques to this problem. In this article, we first examine the challenge of imbalanced data and, after addressing it, apply XGBoost, an algorithm only recently introduced to fraud detection, to a real data set. Finally, we compare this method with an older one, the Random Forest algorithm, and find that the new method performs well.
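The abstract does not say which technique was used to correct the class imbalance. As a purely illustrative sketch, the Python snippet below shows one common option, random oversampling of the minority (fraud) class. The function name `random_oversample`, the toy claim records, and their two features are assumptions made for this example and are not taken from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw (with replacement) the number of rows this class is missing.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy claims: (claim_amount, policy_age_years); label 1 = fraud.
claims = [(1200, 3), (800, 5), (950, 2), (15000, 1), (700, 4), (1100, 6)]
is_fraud = [0, 0, 0, 1, 0, 0]

balanced_claims, balanced_labels = random_oversample(claims, is_fraud)
print("before:", Counter(is_fraud))
print("after: ", Counter(balanced_labels))
```

A balanced set of this kind would then be passed to a classifier such as XGBoost or Random Forest. Resampling is normally applied to the training split only, so that the evaluation data keep the original fraud rate.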