Finding All Reducts in Financial Information Systems Based on Neighborhood Rough Set Theory for Finance Data from the Decision Makers' Point of View
Subject Areas : Financial Mathematics
Seyed Majid Alavi 1,*, Sodabeh Amin 2, Parvaneh Mansouri 3, Abolfazl Saidofar 4
1, 2, 3, 4 - Department of Mathematics and Computer Science, Ar.C., Islamic Azad University, Arak, Iran
Keywords: Neighborhood rough set theory, Feature selection, Financial Information Systems
Abstract:
Neighborhood Rough Set Theory (NRST) provides a valuable approach for selecting a subset of features from a complete dataset while preserving the essential information carried by the full feature set. Financial datasets often contain high-dimensional input features, so effective feature selection is crucial for identifying the features with the greatest predictive value. In this work, we use neighborhood concepts to discover data dependencies and reduce the number of features in a financial dataset based solely on the data itself, without relying on additional information; redundant features are removed in the process. Using the properties of neighborhood rough sets, we formulate the reduct-finding problem as a Binary Integer Linear Programming (BILP) model and obtain optimal solutions with genetic algorithms. The approach yields feature reductions ranging from minimum to maximum cardinality. We demonstrate the efficiency of the proposed method against other techniques on several benchmark datasets with unbalanced class distributions, reporting the results in tables. The financial dataset used in the present study is taken from the UCI Machine Learning Repository.
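The abstract outlines three computational ingredients: a neighborhood-based dependency measure, a binary encoding of candidate reducts, and a genetic algorithm that searches for small subsets preserving that dependency. The sketch below is an illustration of those ingredients, not the authors' implementation; the radius delta, the size-penalty weight alpha, and the helper names neighborhood_dependency and ga_reduct_search are assumptions introduced here, and the BILP objective described in the abstract is replaced by a simple penalized fitness for brevity.

```python
# Illustrative sketch (not the paper's code): neighborhood dependency and a
# simple GA search for a small feature subset that preserves that dependency.
import numpy as np

def neighborhood_dependency(X, y, feats, delta=0.15):
    """Fraction of samples whose delta-neighborhood, measured on the selected
    features `feats`, is pure with respect to the decision labels y."""
    if not feats:
        return 0.0
    Xb = X[:, feats]
    # pairwise Euclidean distances on the selected features (O(n^2) memory)
    d = np.sqrt(((Xb[:, None, :] - Xb[None, :, :]) ** 2).sum(-1))
    pos = 0
    for i in range(len(X)):
        nbr = np.where(d[i] <= delta)[0]      # delta-neighborhood of sample i
        if np.all(y[nbr] == y[i]):            # consistent neighborhood
            pos += 1
    return pos / len(X)

def ga_reduct_search(X, y, pop=30, gens=50, alpha=0.1, rng=None):
    """Binary-encoded GA: each chromosome marks a candidate feature subset.
    Fitness rewards dependency and penalizes subset size (a stand-in for the
    BILP objective described in the abstract)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = X.shape[1]
    P = rng.integers(0, 2, size=(pop, n))

    def fitness(c):
        feats = list(np.flatnonzero(c))
        return neighborhood_dependency(X, y, feats) - alpha * len(feats) / n

    for _ in range(gens):
        scores = np.array([fitness(c) for c in P])
        # binary tournament selection
        idx = rng.integers(0, pop, size=(pop, 2))
        parents = P[np.where(scores[idx[:, 0]] >= scores[idx[:, 1]],
                             idx[:, 0], idx[:, 1])]
        # one-point crossover followed by bit-flip mutation
        children = parents.copy()
        for i in range(0, pop - 1, 2):
            cut = rng.integers(1, n)
            children[i, cut:], children[i + 1, cut:] = (
                parents[i + 1, cut:].copy(), parents[i, cut:].copy())
        flip = rng.random(children.shape) < 1.0 / n
        children[flip] ^= 1
        P = children
    scores = np.array([fitness(c) for c in P])
    return np.flatnonzero(P[np.argmax(scores)])
```

On features normalized to [0, 1], a radius around 0.1 to 0.2 is a common starting point in the neighborhood rough set literature; the returned index set is one candidate reduct found by the search, not necessarily a minimum-cardinality reduct.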
[1] Abdel-Basset, M., Abdel-Fatah, L., Sangaiah, A.K., An improved Lévy based whale optimization algorithm for bandwidth-efficient virtual machine placement in cloud computing environment, Cluster Comput, 2019; 22(4): 8319–8334. Doi: 10.1007/s10586-018-1769-z.
[2] Ahmed, S., Ghosh, K.K., Singh, P.K., Geem, Z.W., Sarkar, R., Hybrid of Harmony Search Algorithm and Ring Theory-Based Evolutionary Algorithm for Feature Selection, IEEE Access, 2020; 8:102629–45. Doi: 10.1109/ACCESS.2020.2997005.
[3] Alazzam, H., Sharieh, A., Sabri, K.E., A feature selection algorithm for intrusion detection system based on Pigeon Inspired Optimizer, Expert Syst Appl, 2020; 148: 113249. Doi: 10.1016/j.eswa.2020.113249.
[4] Alavi, S.M., Khazravi, N., Evaluation and ranking of fuzzy sets under equivalence fuzzy relations as α−certainty and β−possibility, Expert Systems with Applications, 2024; 248: 123175. Doi: 10.1016/j.eswa.2024.123175.
[5] Aljarah, I., Faris, H., Mirjalili, S., Optimizing connection weights in neural networks using the whale optimization algorithm, Soft Comput, 2018; 22: 1433–7479. Doi: 10.1007/s00500-016-2442-1.
[6] Behrooz, S., Ghomi, R., Mehrazin, A., Shoorvarzi, M., Developing Financial Distress Prediction Models Based on Imbalanced Dataset, Random Undersampling and Clustering Based Undersampling Approaches, Advances in Mathematical Finance & Applications, 2024; 9(3): 737–762. Doi: 10.22034/amfa.2024.2189537.1689.
[7] Chen, H., Li, T., Fan, X., Luo, C., Feature selection for imbalanced data based on neighborhood rough sets, Inf Sci (Ny), 2019; 483. Doi: 10.1016/j.ins.2019.01.073.
[8] Coello, C.A., Theoretical and numerical constraint-handling techniques used with evolutionary algorithms: a survey of the state of the art, Comput Methods Appl Mech Eng, 2002; 191(11): 1245–1287. Doi: 10.1016/S0045-7825(01)00323-4.
[9] Dai, J., Hu, H., Wu, W.Z., Qian, Y., Huang, D., Maximal-Discernibility-Pair-Based Approach to Attribute Reduction in Fuzzy Rough Sets, IEEE Trans Fuzzy Syst, 2018; 26(4): 2174–2187. Doi: 10.1109/TFUZZ.2017.2768044.
[10] Dash, M., Liu, H., Feature selection for classification, Intell Data Anal, 1997; 1(1): 131–156. Doi: 10.1016/S1088-467X(97)00008-5.
[11] Dhiman, G., Oliva, D., Kaur, A., Singh, K.K., Vimal, S., Sharma, A., et al., BEPO: A novel binary emperor penguin optimizer for automatic feature selection, Knowledge-Based Syst, 2021; 211: 106560. Doi: 10.1016/j.knosys.2020.106560.
[12] Dhiman, G., Kumar, V., Seagull optimization algorithm: Theory and its applications for large-scale industrial engineering problems, Knowledge-Based Syst, 2019; 165: 169–196. Doi: 10.1016/j.knosys.2018.11.024.
[13] Dua, D., Graff, C., UCI Machine Learning Repository, 2017. Available from: http://archive.ics.uci.edu/ml. Doi: 10.24432/C56C7D.
[14] Elango, S., Chandran, S., Mahalakshmi, S., Rough set-based feature selection for credit risk prediction using weight-adjusted boosting ensemble method, Soft Comput, 2020; 24. Doi: 10.1007/s00500-020-05125-1.
[15] Eskandar, H., Sadollah, A., Bahreininejad, A., Hamdi, M., Water cycle algorithm – A novel metaheuristic optimization method for solving constrained engineering optimization problems, Comput Struct, 2012; 110–111: 151–166. Doi: 10.1016/j.compstruc.2012.07.010.
[16] Ghaemi, M., Feizi-Derakhshi, M.R., Forest optimization algorithm, Expert Systems with Applications, 2014; 41(15): 6676–6687. Doi: 10.1016/j.eswa.2014.03.034.
[17] Wang, G., Ma, X., Yu, H., Monotonic uncertainty measures for attribute reduction in probabilistic rough set model, International Journal of Approximate Reasoning, 2015; 59: 41–67. Doi: 10.1016/j.ijar.2015.01.002.
[18] Han, J., Kamber, M., Pei, J., Data Mining: Concepts and Techniques, 3rd ed. Waltham, Mass: Morgan Kaufmann Publishers; 2012.
[19] Holland, J.H., Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press; 1992.
[20] Hussien, A.G., Abualigah, L., Abu Zitar, R., Hashim, F.A., Amin, M., Saber, A., et al., Recent Advances in Harris Hawks Optimization: A Comparative Study and Applications, Electronics, 2022; 11(12). Doi: 10.3390/electronics11121888.
[21] Hu, M., Tsang, E.C.C., Guo, Y., Chen, D., Xu, W., A novel approach to attribute reduction based on weighted neighborhood rough sets, Knowledge-Based Systems, 2021; 220: 106908. Doi: 10.1016/j.knosys.2021.106908.
[22] Jensen, R., Tuson, A., Shen, Q., Finding rough and fuzzy-rough set reducts with SAT, Inf Sci (Ny), 2014; 255: 100–120. Doi: 10.1016/j.ins.2013.08.050.
[23] Kennedy, J., Eberhart, R., Particle swarm optimization, In: Proceedings of ICNN'95 - International Conference on Neural Networks, 1995; 4: 1942–1948. Doi: 10.1109/ICNN.1995.488968.
[24] Khushaba, R.N., Al-Ani, A., AlSukker, A., Al-Jumaily, A., A Combined Ant Colony and Differential Evolution Feature Selection Algorithm, Berlin, Heidelberg: Springer Berlin Heidelberg, 2008; 1–12. Doi: 10.1007/978-3-540-87527-7_1.
[25] Ranjini, S.K.S., Murugan, S., Memory Based Hybrid Dragonfly Algorithm for Numerical Optimization Problems, Expert Systems with Applications, 2017; 83(C): 63–78. Doi: 10.1016/j.eswa.2017.04.031.
[26] Liang, B., Zhang, H., Lu, Z., Zhang, Z., Indistinguishable Element-Pair Attribute Reduction and Its Incremental Approach, Mathematical Problems in Engineering, 2022; 1–16. Doi: 10.1155/2022/4823216.
[27] Liu, J., Hu, Q., Yu, D., A weighted rough set based method developed for class imbalance learning, Inf Sci (Ny), 2008; 178(4): 1235–1256. Doi: 10.1016/j.ins.2007.09.036.
[28] Liu, S., Zhang, J., Xiang, Y., Zhou, W., Xiang, D., A Study of Data Pre-processing Techniques for Imbalanced Biomedical Data Classification, Int J Bioinformatics Res Appl, 2020; 16(3): 290–318. Doi: 10.1504/ijbra.2020.109103.
[29] Mafarja, M., Mirjalili, S., Whale optimization approaches for wrapper feature selection, Appl Soft Comput, 2018; 62: 441–453. Doi: 10.1016/j.asoc.2017.11.001.
[30] Maksood, F., Achuthan, G., Analysis of Data Mining Techniques and its Applications, Int J Comput Appl, 2016; 140: 6–14. Doi: 10.5120/ijca2016909411.
[31] Murali, V., Fuzzy equivalence relations, Fuzzy Sets Syst, 1989; 30(2): 155–163. Doi: 10.1016/0165-0114(89)90017-1.
[32] Nahari, J., Qala, A., Rezaei, N., Aghdam, Y., Abdi, R., Comparing the Performance of Machine Learning Techniques in Detecting Financial Frauds, Advances in Mathematical Finance & Applications, 2024; 9(3): 1006–1023. Doi: 10.71716/amfa.2024.22101813.
[33] Pandey, A.C., Kulhari, A., Mittal, H., Tripathi, A.K., Pal, R., Improved exponential cuckoo search method for sentiment analysis, Multimedia Tools and Applications, 2022; 82(16): 23979–24029. Doi: 10.1007/s11042-022-14229-5.
[34] Pawlak, Z., Rough set approach to knowledge-based decision support, Eur J Oper Res, 1997; 99(1): 48–57. Doi: 10.1016/S0377-2217(96)00381-7.
[35] Pawlak, Z., Rough classification, Int J Man Mach Stud, 1984; 20(5): 469–483. Doi:10.1016/S0020-7373(84)80022-X.
[36] Pawlak, Z., Rough sets and intelligent data analysis, Inf Sci (Ny), 2002; 147(1): 1–12. Doi: 10.1016/S0020-0255(02)00197-4.
[37] Pieta, P., Szmuc, T., Kluza, K., Comparative Overview of Rough Set Toolkit Systems for Data Analysis, MATEC Web Conf, 2019; 252: 3019. Doi: 10.1051/matecconf/201925203019.
[38] Hu, Q., Yu, D., Liu, J., Wu, N., Neighborhood rough set based heterogeneous feature subset selection, Inf Sci (Ny), 2008; 178(18): 3577–3594. Doi: 10.1016/j.ins.2008.05.024.
[39] Rashedi, E., Nezamabadi-pour, H., Saryazdi, S., GSA: A Gravitational Search Algorithm, Inf Sci (Ny), 2009; 179(13): 2232–2248. Doi: 10.1016/j.ins.2009.03.004.
[40] Sağlam, F., Sözen, M., Cengiz, M.A., Optimization Based Undersampling for Imbalanced Classes, Adıyaman University Journal of Science, 2021; 11(2): 385–409. Doi: 10.37094/adyujsci.884120.
[41] Sepehri, A., Ghodrati, H., Jabbari, H., Panahian, H., Making Decision on Selection of Optimal Stock Portfolio Employing Meta Heuristic Algorithms for Multi-Objective Functions Subject to Real-Life Constraints, Advances in Mathematical Finance & Applications, 2023; 8(2): 645–666. Doi: 10.22034/AMFA.2021.1915292.1525.
[42] Shekhawat, S.S., Sharma, H., Kumar, S., Nayyar, A., Qureshi, B., bSSA: Binary Salp Swarm Algorithm With Hybrid Data Transformation for Feature Selection, IEEE Access, 2021; 9: 14867–14882. Doi: 10.1109/ACCESS.2020.3047773.
[43] Shareef, H., Ibrahim, A.A., Mutlag, A.H., Lightning Search Algorithm, Applied Soft Computing, 2015; 36(C): 315–333. Doi: 10.1016/j.asoc.2015.07.028.
[44] Wang, G.G., Deb, S., Cui, Z., Monarch Butterfly Optimization, Neural Computing and Applications, 2019; 31(7): 1995–2014. Doi: 10.1007/s00521-017-3210-z.
[45] Wang, X., Yang, J., Teng, X., Xia, W., Jensen, R., Feature Selection based on Rough Sets and Particle Swarm Optimization, Pattern Recognition Letters, 2007; 28: 459–471. Doi: 10.1016/j.patrec.2006.09.003.
[46] Zhang, Y., Wang, Y., Research on Classification Model based on Neighborhood Rough Set and Evidence Theory, Journal of Physics: Conference Series, 2021; 1746: 012018. Doi: 10.1088/1742-6596/1746/1/012018.