Finding All Reducts in Financial Information Systems Based on Neighborhood Rough Set Theory for Finance Data from the Decision Maker's Point of View
Subject Areas: Financial Mathematics

Case Study. Received: 2025-05-30; Accepted: 2025-08-14

Soodabeh Amin, Seyed Majid Alavi*, Parvaneh Mansouri, Abolfazl Saeidiafar
Department of Mathematics and Computer Science, Ar.C., Islamic Azad University, Arak, Iran
* Corresponding author
Keywords: Neighborhood Rough Set Theory (NRST), Feature Selection, Financial Information Systems (FIS)

Abstract:
The Neighborhood Rough Set (NRST) method is a valuable approach for selecting a subset of features from a complete dataset while preserving the essential information carried by the full feature set. Financial datasets often contain high-dimensional input features, so effective feature selection techniques are crucial for identifying the features with the greatest predictive value. In this work, we use neighborhood concepts to discover data dependencies and reduce the number of features in a financial dataset based solely on the data itself, without relying on additional information; redundant features are removed in the process. Using the properties of neighborhood rough sets, we formulate the reduction task as a Binary Integer Linear Programming (BILP) model, and optimal solutions are obtained with a genetic algorithm. Our approach finds reducts of every cardinality, from minimum to maximum. We demonstrate the efficiency of the proposed method against other techniques through results on several benchmark datasets with unbalanced class distributions. The financial dataset used in the present study is taken from the UCI Machine Learning Repository.

References
[1] Abdel-Basset, M., Abdel-Fatah, L., Sangaiah, A.K., An improved Lévy based whale optimization algorithm for bandwidth-efficient virtual machine placement in cloud computing environment, Cluster Comput, 2019; 22(4): 8319–8334. Doi: 10.1007/s10586-018-1769-z.
[2] Ahmed, S., Ghosh, K.K., Singh, P.K., Geem, Z.W., Sarkar, R., Hybrid of Harmony Search Algorithm and Ring Theory-Based Evolutionary Algorithm for Feature Selection, IEEE Access, 2020; 8:102629–45. Doi: 10.1109/ACCESS.2020.2997005.
[3] Alazzam, H., Sharieh, A., Sabri, K.E., A feature selection algorithm for intrusion detection system based on Pigeon Inspired Optimizer, Expert Syst Appl, 2020; 148: 113249. Doi: 10.1016/j.eswa.2020.113249.
[4] Alavi, S.M., Khazravi, N., Evaluation and ranking of fuzzy sets under equivalence fuzzy relations as α−certainty and β−possibility, Expert Systems with Applications, 2024; 248: 123175. Doi: 10.1016/j.eswa.2024.123175.
[5] Aljarah, I., Faris, H., Mirjalili, S., Optimizing connection weights in neural networks using the whale optimization algorithm, Soft Comput, 2018; 22. Doi: 10.1007/s00500-016-2442-1.
[6] Behrooz, S., Ghomi, R., Mehrazin, A., Shoorvarzi, M., Developing Financial Distress Prediction Models Based on Imbalanced Dataset: Random Undersampling and Clustering Based Undersampling Approaches, Advances in Mathematical Finance & Applications, 2024; 9(3): 737–762. Doi: 10.22034/amfa.2024.2189537.1689.
[7] Chen, H., Li, T., Fan, X., Luo, C., Feature selection for imbalanced data based on neighborhood rough sets, Inf Sci (Ny), 2019; 483. Doi: 10.1016/j.ins.2019.01.073.
[8] Coello, C.A., Theoretical and numerical constraint-handling techniques used with evolutionary algorithms: a survey of the state of the art, Comput Methods Appl Mech Eng, 2002; 191(11): 1245–1287. Doi: 10.1016/S0045-7825(01)00323-4.
[9] Dai, J., Hu, H., Wu, W.Z., Qian, Y., Huang, D., Maximal-Discernibility-Pair-Based Approach to Attribute Reduction in Fuzzy Rough Sets, IEEE Trans Fuzzy Syst, 2018; 26(4): 2174–2187. Doi: 10.1109/TFUZZ.2017.2768044.
[10] Dash, M., Liu, H., Feature selection for classification, Intell Data Anal, 1997; 1(1): 131–156. Doi: 10.1016/S1088-467X(97)00008-5.
[11] Dhiman, G., Oliva, D., Kaur, A., Singh, K.K., Vimal, S., Sharma, A., et al., BEPO: A novel binary emperor penguin optimizer for automatic feature selection, Knowledge-Based Syst, 2021; 211: 106560. Doi: 10.1016/j.knosys.2020.106560.
[12] Dhiman, G., Kumar, V., Seagull optimization algorithm: Theory and its applications for large-scale industrial engineering problems, Knowledge-Based Syst, 2019; 165: 169–196. Doi: 10.1016/j.knosys.2018.11.024.
[13] Dua, D., Graff, C., UCI Machine Learning Repository, 2017. Available from: http://archive.ics.uci.edu/ml. Doi: 10.24432/C56C7D.
[14] Elango, S., Chandran, S., Mahalakshmi, S., Rough set-based feature selection for credit risk prediction using weight-adjusted boosting ensemble method, Soft Comput, 2020; 24. Doi: 10.1007/s00500-020-05125-1.
[15] Eskandar, H., Sadollah, A., Bahreininejad, A., Hamdi, M., Water cycle algorithm – A novel metaheuristic optimization method for solving constrained engineering optimization problems, Comput Struct, 2012; 110–111: 151–166. Doi: 10.1016/j.compstruc.2012.07.010.
[16] Ghaemi, M., Feizi-Derakhshi, M.R., Forest optimization algorithm, Expert Systems with Applications, 2014; 41(15): 6676–6687. Doi: 10.1016/j.eswa.2014.03.034.
[17] Wang, G., Ma, X., Yu, H., Monotonic uncertainty measures for attribute reduction in probabilistic rough set model, International Journal of Approximate Reasoning, 2015; 59: 41–67. Doi: 10.1016/j.ijar.2015.01.002.
[18] Han, J., Kamber, M., Pei, J., Data Mining: Concepts and Techniques, 3rd ed., Waltham, Mass: Morgan Kaufmann Publishers; 2012.
[19] Holland, J.H., Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press; 1992.
[20] Hussien, A.G., Abualigah, L., Abu Zitar, R., Hashim, F.A., Amin, M., Saber, A., et al., Recent Advances in Harris Hawks Optimization: A Comparative Study and Applications, Electronics, 2022; 11(12): 1888. Doi: 10.3390/electronics11121888.
[21] Hu, M., Tsang, E.C.C., Guo, Y., Chen, D., Xu, W., A novel approach to attribute reduction based on weighted neighborhood rough sets, Knowledge-Based Systems, 2021; 220: 106908. Doi: 10.1016/j.knosys.2021.106908.
[22] Jensen, R., Tuson, A., Shen, Q., Finding rough and fuzzy-rough set reducts with SAT, Inf Sci (Ny), 2014; 255: 100–120. Doi: 10.1016/j.ins.2013.08.050.
[23] Kennedy, J., Eberhart, R., Particle swarm optimization, In: Proceedings of ICNN'95 - International Conference on Neural Networks, 1995; 4: 1942–1948. Doi: 10.1109/ICNN.1995.488968.
[24] Khushaba, R.N., Al-Ani, A., AlSukker, A., Al-Jumaily, A., A Combined Ant Colony and Differential Evolution Feature Selection Algorithm, Berlin, Heidelberg: Springer Berlin Heidelberg, 2008; 1–12. Doi: 10.1007/978-3-540-87527-7_1.
[25] Ranjini, S.K.S., Murugan, S., Memory Based Hybrid Dragonfly Algorithm for Numerical Optimization Problems, Expert Systems with Applications, 2017; 83(C): 63–78. Doi: 10.1016/j.eswa.2017.04.031.
[26] Liang, B., Zhang, H., Lu, Z., Zhang, Z., Indistinguishable Element-Pair Attribute Reduction and Its Incremental Approach, Mathematical Problems in Engineering, 2022; 1–16. Doi: 10.1155/2022/4823216.
[27] Liu, J., Hu, Q., Yu, D., A weighted rough set based method developed for class imbalance learning, Inf Sci (Ny), 2008; 178(4): 1235–1256. Doi: 10.1016/j.ins.2007.09.036.
[28] Liu, S., Zhang, J., Xiang, Y., Zhou, W., Xiang, D., A Study of Data Pre-processing Techniques for Imbalanced Biomedical Data Classification, Int J Bioinformatics Res Appl, 2020; 16(3): 290–318. Doi: 10.1504/ijbra.2020.109103.
[29] Mafarja, M., Mirjalili, S., Whale optimization approaches for wrapper feature selection, Appl Soft Comput, 2018; 62: 441–453. Doi: 10.1016/j.asoc.2017.11.001.
[30] Maksood, F., Achuthan, G., Analysis of Data Mining Techniques and its Applications, Int J Comput Appl, 2016; 140: 6–14. Doi: 10.5120/ijca2016909411.
[31] Murali, V., Fuzzy equivalence relations, Fuzzy Sets Syst, 1989; 30(2): 155–163. Doi: 10.1016/0165-0114(89)90017-1.
[32] Nahari, J., Qala, A., Rezaei, N., Aghdam, Y., Abdi, R., Comparing the Performance of Machine Learning Techniques in Detecting Financial Frauds, Advances in Mathematical Finance & Applications, 2024; 9(3): 1006–1023. Doi: 10.71716/amfa.2024.22101813.
[33] Pandey, A. C., Kulhari, A., Mittal, H., Tripathi, A. K., Pal, R., Improved exponential cuckoo search method for sentiment analysis, Multimedia Tools and Applications, 2022; 82(16), 23979–24029. Doi:10.1007/s11042-022-14229-5.
[34] Pawlak, Z., Rough set approach to knowledge-based decision support, Eur J Oper Res, 1997; 99(1): 48–57. Doi: 10.1016/S0377-2217(96)00381-7.
[35] Pawlak, Z., Rough classification, Int J Man Mach Stud, 1984; 20(5): 469–483. Doi:10.1016/S0020-7373(84)80022-X.
[36] Pawlak, Z., Rough sets and intelligent data analysis, Inf Sci (Ny), 2002; 147(1): 1–12. Doi: 10.1016/S0020-0255(02)00197-4.
[37] Pieta, P., Szmuc, T., Kluza, K., Comparative Overview of Rough Set Toolkit Systems for Data Analysis, MATEC Web Conf, 2019; 252: 3019. Doi: 10.1051/matecconf/201925203019.
[38] Hu, Q., Yu, D., Liu, J., Wu, N., Neighborhood rough set based heterogeneous feature subset selection, Inf Sci (Ny), 2008; 178(18): 3577–3594. Doi: 10.1016/j.ins.2008.05.024.
[39] Rashedi, E., Nezamabadi-pour, H., Saryazdi, S., GSA: A Gravitational Search Algorithm, Inf Sci (Ny), 2009; 179(13): 2232–2248. Doi: 10.1016/j.ins.2009.03.004.
[40] Sağlam, F., Sözen, M., Cengiz, M.A., Optimization Based Undersampling for Imbalanced Classes, Adıyaman University Journal of Science, 2021; 11(2): 385–409. Doi: 10.37094/adyujsci.884120.
[41] Sepehri, A., Ghodrati, H., Jabbari, H., Panahian, H., Making Decision on Selection of Optimal Stock Portfolio Employing Meta Heuristic Algorithms for Multi-Objective Functions Subject to Real-Life Constraints, Advances in Mathematical Finance & Applications, 2023; 8(2): 645–666. Doi: 10.22034/AMFA.2021.1915292.1525.
[42] Shekhawat, S.S., Sharma, H., Kumar, S., Nayyar, A., Qureshi, B., bSSA: Binary Salp Swarm Algorithm With Hybrid Data Transformation for Feature Selection, IEEE Access, 2021; 9: 14867–14882. Doi: 10.1109/ACCESS.2020.3047773.
[43] Shareef, H., Ibrahim, A.A., Mutlag, A.H., Lightning Search Algorithm, Applied Soft Computing, 2015; 36(C): 315–333. Doi: 10.1016/j.asoc.2015.07.028.
[44] Wang, G.G., Deb, S., Cui, Z., Monarch Butterfly Optimization, Neural Computing and Applications, 2019; 31(7): 1995–2014. Doi: 10.1007/s00521-017-3210-z.
[45] Wang, X., Yang, J., Teng, X., Xia, W., Jensen, R., Feature Selection based on Rough Sets and Particle Swarm Optimization, Pattern Recognition Letters, 2007; 28: 459–471. Doi: 10.1016/j.patrec.2006.09.003.
[46] Zhang, Y., Wang, Y., Research on Classification Model based on Neighborhood Rough Set and Evidence Theory, Journal of Physics: Conference Series, 2021; 1746: 012018. Doi: 10.1088/1742-6596/1746/1/012018.
1 Introduction
Nowadays, a vast amount of data is created every day, and analyzing it is a serious challenge. By data mining, we refer to techniques drawn from different fields such as information technology, mathematical science, and statistical analysis [30,40]. Data mining techniques are useful for analyzing, comprehending, and visualizing the large volumes of data kept in data warehouses, databases, and other types of data repositories [31]. Such techniques must handle extensive datasets containing a large number of features, some of which may be irrelevant to the data mining task. Irrelevant features can degrade the quality of the results obtained through data mining, which in turn reduces the chance of extracting useful knowledge from the dataset. Feature selection is one of the approaches to handling such features [18].

Large datasets are used for accurate classification throughout science and engineering, in fields such as astronomy and medicine. Still, such datasets often contain irrelevant, redundant, and noisy features that may decrease classifier efficiency, so selecting the proper features is important; in many fields of study, feature selection plays a key role [12]. Feature selection reduces redundant and irrelevant features and yields higher clustering accuracy and performance [11].

In real-world applications, the class distribution of data collected from road traffic, medicine, credit cards, banking transactions, and the stock market may be uneven. Clearly, the hidden knowledge in such real-world datasets cannot be discovered without novel, effective techniques for feature selection in imbalanced data, since class imbalance degrades the performance of predictive models [6]. So far, various algorithms have been introduced for feature selection in imbalanced data, and many studies exploit the search power of meta-heuristic algorithms [28,41]. Particle Swarm Optimization (PSO) [23], Differential Evolution (DE) [24], the Gravitational Search Algorithm (GSA) [39], the Water Cycle Algorithm (WCA) [15], the Forest Optimization Algorithm (FOA) [16], Cuckoo Search (CS) [33], the Lightning Search Algorithm (LSA) [43], and Monarch Butterfly Optimization (MBO) [44], among others, have been employed to solve the feature selection problem [41,42]. Meta-heuristic algorithms have also been applied successfully to specific engineering and medical problems, such as optimizing connection weights in neural networks [5], numerical optimization [25], cloud computing [1], and stock market index prediction [20]. Due to the advancement of soft computing, researchers in finance, computer science, and mathematics have paid great attention to optimization research [6]. In addition, several feature selection algorithms have been proposed for diagnosis, classification, categorization, and pattern detection [2,3].

Following the introduction of rough set theory by Pawlak in the early 1980s [35], it has been used as a successful technique for selecting features in classified data. It is applied in artificial intelligence and the cognitive sciences across several fields, including knowledge discovery, expert systems, inductive reasoning, decision-making, intelligent systems, data mining, information systems, pattern recognition, machine learning, and process control [46].
Using rough set theory, we can handle uncertainty, vagueness, and imprecision; the technique is widely used for dimension reduction [45]. A key use of rough set theory is attribute reduction, which means eliminating redundant attributes without losing information [29]. In recent years, various researchers have combined rough set theory with other methods to reduce and select features [26], and many recent studies have combined rough set theory and fuzzy neighborhood theory with other methods for feature selection [4,9,14,21,22].

The paper is structured as follows: Section 2 covers the fundamental concepts of rough sets and neighborhood rough set theory. Section 3 details the proposed method. Sections 4 and 5 present the simulation results and analyze the experimental findings, respectively. Finally, the concluding section summarizes the key insights and implications of the study.
2 Preliminaries
Feature selection preserves essential information while reducing the dimensionality of financial data. Rough set theory is a soft computing tool with various applications in data science, data mining among them. Several studies show that rough set theory is a popular and practical tool for feature selection, and most traditional RST-based feature selection methods depend on reduct computation. In the following, we review the basic concepts underlying feature selection with rough set and neighborhood rough set theory.
2.1. Rough Set Theory
Rough Set Theory (RST) is a mathematical framework for dealing with uncertainty and vagueness in data analysis. It was introduced by Zdzisław Pawlak [36] in the early 1980s and has since become a significant approach in various fields, including data mining, machine learning, and decision-making. RST provides tools for handling imprecise or incomplete information without requiring prior probability distributions.
2.2. Information and Decision Systems
An information system is a pair \( IS = (U, A) \), where:
- \( U \) is a non-empty finite set of objects (the universe);
- \( A \) is a set of attributes (features), where each attribute \( a \in A \) is associated with a function \( f_a : U \to V_a \), mapping objects to their attribute values;
- \( V_a \) is the value set of attribute \( a \).
A decision system is an information system \( (U, A \cup \{d\}) \) whose distinguished decision attribute \( d \notin A \) assigns each object its class label.
2.3. Indiscernibility
For any subset of attributes \( B \subseteq A \), the indiscernibility relation \( IND(B) \) groups objects that have the same values for all attributes in \( B \) [36]:

\[ IND(B) = \{ (x, y) \in U \times U \mid \forall a \in B, \; f_a(x) = f_a(y) \} \tag{1} \]

Two objects are indiscernible with respect to \( B \) when no attribute in \( B \) distinguishes them; \( IND(B) \) is an equivalence relation and partitions \( U \) into equivalence classes \( [x]_B \).
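To make the relation concrete, here is a minimal Python sketch that partitions objects into \( IND(B) \) classes; the toy records and attribute names below are illustrative and are not taken from the paper's dataset.

```python
from collections import defaultdict

def indiscernibility_classes(records, attributes):
    """Partition object indices into IND(B) equivalence classes:
    two objects share a class iff they agree on every attribute in B."""
    classes = defaultdict(list)
    for i, rec in enumerate(records):
        key = tuple(rec[a] for a in attributes)
        classes[key].append(i)
    return list(classes.values())

# Toy decision table (illustrative values only).
records = [
    {"housing": "yes", "loan": "no", "class": 1},
    {"housing": "yes", "loan": "no", "class": 0},
    {"housing": "no", "loan": "yes", "class": 1},
]
print(indiscernibility_classes(records, ["housing", "loan"]))
# -> [[0, 1], [2]]: objects 0 and 1 are indiscernible w.r.t. {housing, loan}
```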
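Equivalence classes are too rigid for the numeric attributes that dominate financial data, so the neighborhood rough set model replaces them with δ-neighborhoods. The definitions below follow the standard formulation of Hu et al. [38]; the notation is a reconstruction and may differ from the paper's own equations. Given a distance metric \( \Delta_B \) restricted to the attributes in \( B \) and a radius \( \delta \ge 0 \), the δ-neighborhood of an object \( x \) is

\[ \delta_B(x) = \{\, y \in U \mid \Delta_B(x, y) \le \delta \,\}. \]

For a decision class \( X \subseteq U \), the neighborhood lower and upper approximations are

\[ \underline{N}_B X = \{\, x \in U \mid \delta_B(x) \subseteq X \,\}, \qquad \overline{N}_B X = \{\, x \in U \mid \delta_B(x) \cap X \neq \emptyset \,\}, \]

and the positive region and dependency degree of the decision \( D \) on \( B \) are

\[ POS_B(D) = \bigcup_{X \in U/IND(D)} \underline{N}_B X, \qquad \gamma_B(D) = \frac{|POS_B(D)|}{|U|}. \]

A subset \( B \subseteq A \) is a reduct when \( \gamma_B(D) = \gamma_A(D) \) and no proper subset of \( B \) preserves this dependency.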
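As a concrete illustration, the following NumPy sketch computes \( \gamma_B(D) \); the Euclidean metric, the default radius, and the assumption of min-max normalized features are choices made for this example, not prescriptions from the paper.

```python
import numpy as np

def dependency(X, y, feature_idx, delta=0.15):
    """Dependency degree gamma_B(D) under delta-neighborhoods, using the
    Euclidean metric on the selected (min-max normalized) features."""
    Xb = X[:, feature_idx]
    # Pairwise distances over the attribute subset B.
    dist = np.linalg.norm(Xb[:, None, :] - Xb[None, :, :], axis=2)
    pos = 0
    for i in range(len(X)):
        neigh = dist[i] <= delta        # delta-neighborhood of object i
        if np.all(y[neigh] == y[i]):    # consistent neighborhood => positive region
            pos += 1
    return pos / len(X)

# Toy demo: labels depend on the first two features only (illustrative data).
rng = np.random.default_rng(0)
X = rng.random((60, 4))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
print(dependency(X, y, [0, 1]), dependency(X, y, [2, 3]))  # first value is typically larger
```

A feature selection routine can then score candidate attribute subsets by \( \gamma_B(D) \), either greedily or with a meta-heuristic search.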
Table 1: Sample objects from the Portuguese bank dataset (nominal attributes)

| U | Job | Marital | Education | Default | Housing | Loan | Contact | Month | Day | Poutcome | Class |
|---|-----|---------|-----------|---------|---------|------|---------|-------|-----|----------|-------|
| 1 | blue-collar | married | basic.4y | unknown | yes | no | telephone | aug | tue | nonexistent | 1 |
| 2 | housemaid | divorced | university.degree | no | yes | yes | cellular | nov | thu | nonexistent | 1 |
| 3 | admin. | married | high.school | no | no | yes | cellular | aug | mon | success | 1 |
| 4 | housemaid | divorced | professional.course | no | yes | no | cellular | nov | mon | success | 1 |
| 5 | technician | married | university.degree | no | yes | yes | cellular | may | fri | nonexistent | 0 |
| 6 | retired | married | university.degree | no | yes | no | cellular | mar | fri | nonexistent | 1 |
| 7 | management | single | basic.4y | no | no | no | telephone | may | mon | nonexistent | 0 |
| 8 | services | married | high.school | unknown | yes | no | telephone | may | mon | nonexistent | 1 |
| 9 | self-employed | divorced | high.school | no | no | no | cellular | sep | tue | success | 1 |
| 10 | admin. | divorced | high.school | no | no | no | telephone | jul | mon | nonexistent | 0 |
Table 2: Sample objects from the Portuguese bank dataset (numeric attributes)

| U | Age | Duration | Campaign | Pdays | Previous | Emp.var.rate | Cons.price.idx | Cons.conf.idx | Euribor3m | Nr.employed | Class |
|---|-----|----------|----------|-------|----------|--------------|----------------|---------------|-----------|-------------|-------|
| 1 | 53 | 1186 | 4 | 999 | 0 | 1.4 | 93.444 | -36.1 | 4.968 | 5228.1 | 1 |
| 2 | 54 | 653 | 1 | 999 | 0 | -0.1 | 93.2 | -42 | 4.076 | 5195.8 | 1 |
| 3 | 31 | 155 | 2 | 4 | 1 | -2.9 | 92.201 | -31.4 | 0.884 | 5076.2 | 1 |
| 4 | 67 | 655 | 2 | 5 | 5 | -1.1 | 94.767 | -50.8 | 1.039 | 4963.6 | 1 |
| 5 | 41 | 170 | 4 | 999 | 0 | -1.8 | 92.893 | -46.2 | 1.313 | 5099.1 | 0 |
| 6 | 73 | 179 | 1 | 999 | 0 | -1.8 | 92.843 | -50 | 1.531 | 5099.1 | 1 |
| 7 | 32 | 73 | 7 | 999 | 0 | 1.1 | 93.994 | -36.4 | 4.858 | 5191 | 0 |
| 8 | 41 | 679 | 2 | 999 | 0 | 1.1 | 93.994 | -36.4 | 4.857 | 5191 | 1 |
| 9 | 39 | 261 | 1 | 3 | 1 | -3.4 | 92.379 | -29.8 | 0.788 | 5017.5 | 1 |
| 10 | 48 | 352 | 2 | 999 | 0 | 1.4 | 93.918 | -42.7 | 4.96 | 5228.1 | 0 |
Table 3: Nominal attributes of the Portuguese bank dataset and their possible values

| No | Attribute | Possible Values |
|----|-----------|-----------------|
| 1 | Job | admin., blue-collar, entrepreneur, housemaid, management, retired, self-employed, services, student, technician, unemployed, unknown |
| 2 | Marital | divorced, married, single, unknown |
| 3 | Education | basic.4y, basic.6y, basic.9y, high.school, illiterate, professional.course, university.degree, unknown |
| 4 | Default | no, yes, unknown |
| 5 | Housing | no, yes, unknown |
| 6 | Loan | no, yes, unknown |
| 7 | Contact | cellular, telephone |
| 8 | Month | jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec |
| 9 | Day | mon, tue, wed, thu, fri |
| 10 | Poutcome | failure, nonexistent, success |
Table 4: Details of input variables of the Portuguese bank dataset

| No | Attribute | Attribute Description | Data Type |
|----|-----------|-----------------------|-----------|
| 1 | Age | What is the customer's age? | Numeric |
| 2 | Job | What is the customer's occupational status? | Nominal |
| 3 | Marital | What is the customer's marital status? | Nominal |
| 4 | Education | What is the customer's educational status? | Nominal |
| 5 | Default | Does the customer have credit in default? | Nominal |
| 6 | Housing | Does the customer have a housing loan? | Nominal |
| 7 | Loan | Does the customer have a personal loan? | Nominal |
| 8 | Contact | By what channel is the customer contacted? | Nominal |
| 9 | Month | In which month did the last contact occur? | Nominal |
| 10 | Day | On which day of the week did the last contact occur? | Nominal |
| 11 | Duration | How long (in seconds) did the last contact last? | Numeric |
| 12 | Campaign | How many times was the customer contacted during this campaign? | Numeric |
| 13 | Pdays | How many days have passed since the customer was last contacted in a previous campaign (999 = never contacted)? | Numeric |
| 14 | Previous | How many times was the customer contacted before this campaign? | Numeric |
| 15 | Poutcome | What was the outcome of the previous marketing campaign? | Nominal |
| 16 | Emp.var.rate | Employment variation rate - quarterly indicator | Numeric |
| 17 | Cons.price.idx | Consumer price index - monthly indicator | Numeric |
| 18 | Cons.conf.idx | Consumer confidence index - monthly indicator | Numeric |
| 19 | Euribor3m | European Interbank Offered Rate, 3-month - daily indicator | Numeric |
| 20 | Nr.employed | Number of employees - quarterly indicator | Numeric |
| 21 | Class | Has the customer subscribed to a term deposit? | Binary |
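The tables above correspond to the Bank Marketing data in the UCI repository [13]. A minimal loading-and-normalization sketch follows; the file name and semicolon separator are assumptions based on the repository's distribution of this dataset.

```python
import pandas as pd

# Assumed local copy of the UCI Bank Marketing data; the file name and the
# semicolon separator follow the repository's bank-additional archive.
df = pd.read_csv("bank-additional-full.csv", sep=";")

nominal = df.select_dtypes(include="object").columns
numeric = df.select_dtypes(exclude="object").columns
print(len(df), "records;", len(nominal), "nominal and", len(numeric), "numeric attributes")

# Min-max normalize numeric attributes so a single neighborhood radius
# delta is comparable across features.
df[numeric] = (df[numeric] - df[numeric].min()) / (df[numeric].max() - df[numeric].min())
```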
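The reduct search described in the abstract can be cast as a set-covering Binary Integer Linear Program. The formulation below is a standard one of this kind (cf. the discernibility-based reduct encodings in [22]) and is offered as a reconstruction of the model class rather than the paper's exact equations. Let \( x_j \in \{0,1\} \) indicate whether attribute \( a_j \) is retained, let \( P \) be the set of object pairs with different decision values, and let \( C(u,v) \subseteq A \) be the attributes that discern \( u \) from \( v \) (in the neighborhood setting, the attributes whose value difference exceeds the radius \( \delta \)). A minimal reduct then solves

\[ \min \sum_{j=1}^{m} x_j \quad \text{s.t.} \quad \sum_{j \,:\, a_j \in C(u,v)} x_j \ge 1 \;\; \forall (u,v) \in P, \qquad x_j \in \{0,1\}. \]

Re-solving with cardinality constraints \( \sum_j x_j = k \) for increasing \( k \), while excluding previously found solutions, enumerates reducts from minimum to maximum cardinality.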
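A genetic algorithm in the spirit of Holland [19] can search this 0-1 space. The sketch below is a minimal illustration; the penalty-based fitness, truncation selection, one-point crossover, and mutation rate are assumptions for the example, not the paper's tuned configuration.

```python
import random

def ga_reduct(cover_sets, m, pop=40, gens=200, pmut=0.05, penalty=10.0):
    """Search for a small attribute subset (a bit-string over m attributes)
    that intersects every discernibility set in cover_sets, i.e. satisfies
    the BILP covering constraints."""
    def fitness(bits):                        # smaller is better
        chosen = {j for j, b in enumerate(bits) if b}
        uncovered = sum(1 for c in cover_sets if not (c & chosen))
        return sum(bits) + penalty * uncovered
    popn = [[random.randint(0, 1) for _ in range(m)] for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=fitness)
        next_gen = popn[:2]                   # elitism: keep the two best
        while len(next_gen) < pop:
            p1, p2 = random.sample(popn[:pop // 2], 2)      # truncation selection
            cut = random.randrange(1, m)
            child = p1[:cut] + p2[cut:]                     # one-point crossover
            child = [b ^ (random.random() < pmut) for b in child]  # bit-flip mutation
            next_gen.append(child)
        popn = next_gen
    best = min(popn, key=fitness)
    return [j for j, b in enumerate(best) if b]

# Toy instance: 5 attributes, three discernibility sets (illustrative only).
print(ga_reduct([{0, 2}, {1, 2}, {3, 4}], m=5))  # e.g. [2, 3] or [2, 4]
```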