Identifying Self-healing Contracts with Machine Learning: Analyzing the Accuracy and Disaggregation of 30 to 90 Day Debts

Mahboob Sadeghi ¹ ( Department of Management, NT.C., Islamic Azad University, Tehran, Iran )
Ali Saeedi ² ( Department of Management, NT.C., Islamic Azad University, Tehran, Iran )
Alireza Heidarzadeh Hanzaei ³ ( Department of Management, North Tehran Branch, Islamic Azad University, Tehran, Iran )

Submitted date: 2025-03-25; Accepted date: 2025-09-06

Keywords: Non-performing Loan Collection Forecasting, Explainable Artificial Intelligence, Machine Learning, SHAP, Feature Analysis

Abstract:

Forecasting the collection of non-current receivables is one of the key challenges in the financial management of financial and credit institutions. It affects not only the financial stability and soundness of banks, but also their ability to manage risk and set effective credit strategies. The present study uses artificial intelligence-based methods to build a forecasting model that estimates the probability of collecting non-current receivables in contracts whose debt is 30 to 90 days overdue. Machine learning algorithms, including decision trees and random forests, are used to analyze financial data and predict the status of receivables collection, and model explanation analyses, especially SHAP (SHapley Additive exPlanations), are used to interpret the predictions. The results show that machine learning models can distinguish self-healing contracts, those that return to current status, from other contracts with considerable accuracy. The SHAP tool also played a key role in identifying the features that affect the prediction. This approach can be used to improve banks' credit risk management.
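
The attribution idea behind SHAP can be illustrated with a minimal sketch that computes exact Shapley values for a single contract. Everything in it is an assumption made for illustration: the feature names, the toy scoring function (which stands in for a trained decision tree or random forest), and the baseline contract are hypothetical and are not taken from the study's data or model. Features missing from a coalition are replaced by their baseline values, which is one common convention; SHAP libraries use faster approximations of the same quantity.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names; not the variables used in the study.
FEATURES = ["days_past_due", "paid_last_installment", "balance_ratio"]


def toy_score(x):
    """Hypothetical self-healing score standing in for a trained model."""
    score = 0.2
    score += 0.5 * x["paid_last_installment"]
    score -= 0.004 * x["days_past_due"]
    # Interaction: a high outstanding balance only hurts once the debt is older than 60 days.
    score -= 0.3 * x["balance_ratio"] * (x["days_past_due"] > 60)
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    values = {}
    for feature in FEATURES:
        others = [g for g in FEATURES if g != feature]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Features outside the coalition take their baseline value.
                without_f = {
                    g: instance[g] if g in subset else baseline[g] for g in FEATURES
                }
                with_f = dict(without_f)
                with_f[feature] = instance[feature]
                total += weight * (model(with_f) - model(without_f))
        values[feature] = total
    return values


# Hypothetical contract to explain and hypothetical reference contract.
instance = {"days_past_due": 75, "paid_last_installment": 1, "balance_ratio": 0.5}
baseline = {"days_past_due": 30, "paid_last_installment": 0, "balance_ratio": 0.0}

phi = shapley_values(toy_score, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the score of the explained contract and the score of the baseline contract, which is the additivity property that gives SHAP its name. Exact enumeration grows exponentially with the number of features, so it is only practical for small illustrations like this one.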
