Employing unsupervised learning to detect fraudulent claims in auto insurance (isolation forest)
Subject Areas:
Management Accounting
Farbod Khanizadeh 1, Farzan Khamesian 2, Maryam Esna-Ashari 3
1 - Assistant Professor, Property and Casualty Insurance Research Group, Insurance Research Group, Tehran, Iran
2 - Assistant Professor, General Insurance Research Group, Insurance Research Group, Tehran, Iran
3 - Assistant Professor, Property and Casualty Insurance Research Group, Insurance Research Group, Tehran, Iran. (Corresponding author)
Received: 2022-02-03
Accepted: 2022-07-23
Published: 2022-08-23
Keywords: Unsupervised learning, Isolation forest, Fraud detection, Auto insurance
Abstract:
Fraud detection strategies are of significant importance to insurance companies. Without such a strategy, paying claims quickly to compensate the insured for their losses improves customer satisfaction and grows companies' portfolios in the short term. In the long run, however, it has dire consequences for the insurance industry, because the cost of fraudulent claims is passed on indirectly to the insured in the form of higher premiums. The purpose of this study is to provide insurers with a mechanism for detecting fraudulent claims. This is achieved with an unsupervised algorithm, isolation forest, that detects anomalies in the data set. Because the algorithm is an ensemble learning method, it increases the accuracy of detecting suspicious cases and reduces false positives. According to the results, the damage to the culprit, the type and use of the vehicle, and the sex of the victim are among the most important indicators for detecting fraudulent cases.
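To make the idea of isolation-based anomaly detection concrete, the following is a minimal sketch of an isolation forest in plain Python. It is not the authors' implementation. The data are synthetic (200 points drawn from a standard normal distribution plus one planted outlier, standing in for hypothetical standardized claim features), and all function names and parameter values (`fit_forest`, `anomaly_score`, 100 trees, a sample size of up to 256) are assumptions chosen for illustration. Each tree splits the data on a random feature at a random threshold; points that are isolated after few splits receive scores closer to 1.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Normalising term c(n): expected path length for n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:  # identical values on this feature: stop splitting
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, adjusted for unsplit leaf size."""
    if "size" in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if x[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], x, depth + 1)


def fit_forest(points, n_trees=100, sample_size=256, seed=0):
    """Build an ensemble of isolation trees on random subsamples."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, x):
    """Score in (0, 1); higher means easier to isolate (more anomalous)."""
    trees, sample_size = forest
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Synthetic stand-in for standardized claim features (hypothetical data).
data_rng = random.Random(42)
claims = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
claims.append((10.0, 10.0))  # one planted anomalous "claim"

forest = fit_forest(claims)
scores = [anomaly_score(forest, c) for c in claims]
most_suspicious = max(range(len(claims)), key=scores.__getitem__)
print(most_suspicious, round(scores[most_suspicious], 3))
```

Averaging path lengths over many randomized trees is what makes the method an ensemble: a single tree may isolate an ordinary point early by chance, but the mean over the forest is far more stable, which is consistent with the abstract's point about reducing false positives. In practice, a maintained library implementation would normally be preferred over a hand-written one.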