Finding All Reducts in Financial Information Systems Based on Neighborhood Rough Set Theory for Finance Data from the Decision Maker's Point of View
Subject Areas: Financial Mathematics

Case Study. Received: 2025-05-30; Accepted: 2025-08-14

Soodabeh Amin, Seyed Majid Alavi*, Parvaneh Mansouri, Abolfazl Saeidiafar
Department of Mathematics and Computer Science, Ar.C., Islamic Azad University, Arak, Iran
* Corresponding author
Keywords: Neighborhood Rough Set Theory (NRST), Feature Selection, Financial Information Systems (FIS)

Abstract:
The Neighborhood Rough Set (NRST) method is a valuable approach for selecting a subset of features from a complete dataset while preserving the essential information carried by the full feature set. Financial datasets often contain high-dimensional input features, so effective feature selection techniques are crucial for identifying the features with the greatest predictive value. In this work, we use neighborhood concepts to discover data dependencies and reduce the number of features in a financial dataset based solely on the data itself, without relying on additional information; redundant features are removed in the process. Using the properties of neighborhood rough sets, we formulate the reduction task as a Binary Integer Linear Programming (BILP) model, and optimal solutions are obtained with a genetic algorithm. Our approach finds reducts of every cardinality, from minimum to maximum. We demonstrate the efficiency of the proposed method against other techniques through results on several benchmark datasets with unbalanced class distributions. The financial dataset used in the present study is taken from the UCI Machine Learning Repository.

References
[1] Abdel-Basset, M., Abdel-Fatah, L., Sangaiah, A.K., An improved Lévy based whale optimization algorithm for bandwidth-efficient virtual machine placement in cloud computing environment, Cluster Comput, 2019; 22(4): 8319–8334. Doi: 10.1007/s10586-018-1769-z.
[2] Ahmed, S., Ghosh, K.K., Singh, P.K., Geem, Z.W., Sarkar, R., Hybrid of Harmony Search Algorithm and Ring Theory-Based Evolutionary Algorithm for Feature Selection, IEEE Access, 2020; 8:102629–45. Doi: 10.1109/ACCESS.2020.2997005.
[3] Alazzam, H., Sharieh, A., Sabri, K.E., A feature selection algorithm for intrusion detection system based on Pigeon Inspired Optimizer, Expert Syst Appl, 2020; 148: 113249. Doi: 10.1016/j.eswa.2020.113249.
[4] Alavi, S.M., Khazravi, N., Evaluation and ranking of fuzzy sets under equivalence fuzzy relations as α−certainty and β−possibility, Expert Systems with Applications, 2024; 248: 123175. Doi: 10.1016/j.eswa.2024.123175.
[5] Aljarah, I., Faris, H., Mirjalili, S., Optimizing connection weights in neural networks using the whale optimization algorithm, Soft Comput, 2018; 22. Doi: 10.1007/s00500-016-2442-1.
[6] Behrooz, S., Ghomi, R., Mehrazin, A., Shoorvarzi, M., Developing Financial Distress Prediction Models Based on Imbalanced Dataset: Random Undersampling and Clustering Based Undersampling Approaches, Advances in Mathematical Finance & Applications, 2024; 9(3): 737–762. Doi: 10.22034/amfa.2024.2189537.1689.
[7] Chen, H., Li, T., Fan, X., Luo, C., Feature selection for imbalanced data based on neighborhood rough sets, Inf Sci (Ny), 2019; 483. Doi: 10.1016/j.ins.2019.01.073.
[8] Coello, C.A., Theoretical and numerical constraint-handling techniques used with evolutionary algorithms: a survey of the state of the art, Comput Methods Appl Mech Eng, 2002; 191(11): 1245–1287. Doi: 10.1016/S0045-7825(01)00323-4.
[9] Dai, J., Hu, H., Wu, W.Z., Qian, Y., Huang, D., Maximal-Discernibility-Pair-Based Approach to Attribute Reduction in Fuzzy Rough Sets, IEEE Trans Fuzzy Syst, 2018; 26(4): 2174–2187. Doi: 10.1109/TFUZZ.2017.2768044.
[10] Dash, M., Liu, H., Feature selection for classification, Intell Data Anal, 1997; 1(1): 131–156. Doi: 10.1016/S1088-467X(97)00008-5.
[11] Dhiman, G., Oliva, D., Kaur, A., Singh, K.K., Vimal, S., Sharma, A., et al., BEPO: A novel binary emperor penguin optimizer for automatic feature selection, Knowledge-Based Syst, 2021; 211: 106560. Doi: 10.1016/j.knosys.2020.106560.
[12] Dhiman, G., Kumar, V., Seagull optimization algorithm: Theory and its applications for large-scale industrial engineering problems, Knowledge-Based Syst, 2019; 165: 169–196. Doi: 10.1016/j.knosys.2018.11.024.
[13] Dua, D., Graff, C., UCI Machine Learning Repository, 2017. Available from: http://archive.ics.uci.edu/ml. Doi: 10.24432/C56C7D.
[14] Elango, S., Chandran, S., Mahalakshmi, S., Rough set-based feature selection for credit risk prediction using weight-adjusted boosting ensemble method, Soft Comput, 2020; 24. Doi: 10.1007/s00500-020-05125-1.
[15] Eskandar, H., Sadollah, A., Bahreininejad, A., Hamdi, M., Water cycle algorithm – A novel metaheuristic optimization method for solving constrained engineering optimization problems, Comput Struct, 2012; 110–111: 151–166. Doi: 10.1016/j.compstruc.2012.07.010.
[16] Ghaemi, M., Feizi-Derakhshi, M.R., Forest optimization algorithm, Expert Systems with Applications, 2014; 41(15): 6676–6687. Doi: 10.1016/j.eswa.2014.03.034.
[17] Wang, G., Ma, X., Yu, H., Monotonic uncertainty measures for attribute reduction in probabilistic rough set model, International Journal of Approximate Reasoning, 2015; 59: 41–67. Doi: 10.1016/j.ijar.2015.01.002.
[18] Han, J., Kamber, M., Pei, J., Data Mining: Concepts and Techniques, 3rd ed., Waltham, Mass: Morgan Kaufmann Publishers; 2012.
[19] Holland, J.H., Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press; 1992.
[20] Hussien, A.G., Abualigah, L., Abu Zitar, R., Hashim, F.A., Amin, M., Saber, A., et al., Recent Advances in Harris Hawks Optimization: A Comparative Study and Applications, Electronics, 2022; 11(12): 1888. Doi: 10.3390/electronics11121888.
[21] Hu, M., Tsang, E.C.C., Guo, Y., Chen, D., Xu, W., A novel approach to attribute reduction based on weighted neighborhood rough sets, Knowledge-Based Systems, 2021; 220: 106908. Doi: 10.1016/j.knosys.2021.106908.
[22] Jensen, R., Tuson, A., Shen, Q., Finding rough and fuzzy-rough set reducts with SAT, Inf Sci (Ny), 2014; 255: 100–120. Doi: 10.1016/j.ins.2013.08.050.
[23] Kennedy, J., Eberhart, R., Particle swarm optimization, In: Proceedings of ICNN'95 - International Conference on Neural Networks, 1995; 4: 1942–1948. Doi: 10.1109/ICNN.1995.488968.
[24] Khushaba, R.N., Al-Ani, A., AlSukker, A., Al-Jumaily, A., A Combined Ant Colony and Differential Evolution Feature Selection Algorithm, Berlin, Heidelberg: Springer Berlin Heidelberg, 2008; 1–12. Doi: 10.1007/978-3-540-87527-7_1.
[25] Ranjini, S.K.S., Murugan, S., Memory Based Hybrid Dragonfly Algorithm for Numerical Optimization Problems, Expert Systems with Applications, 2017; 83(C): 63–78. Doi: 10.1016/j.eswa.2017.04.031.
[26] Liang, B., Zhang, H., Lu, Z., Zhang, Z., Indistinguishable Element-Pair Attribute Reduction and Its Incremental Approach, Mathematical Problems in Engineering, 2022; 1–16. Doi: 10.1155/2022/4823216.
[27] Liu, J., Hu, Q., Yu, D., A weighted rough set based method developed for class imbalance learning, Inf Sci (Ny), 2008; 178(4): 1235–1256. Doi: 10.1016/j.ins.2007.09.036.
[28] Liu, S., Zhang, J., Xiang, Y., Zhou, W., Xiang, D., A Study of Data Pre-processing Techniques for Imbalanced Biomedical Data Classification, Int J Bioinformatics Res Appl, 2020; 16(3): 290–318. Doi: 10.1504/ijbra.2020.109103.
[29] Mafarja, M., Mirjalili, S., Whale optimization approaches for wrapper feature selection, Appl Soft Comput, 2018; 62: 441–453. Doi: 10.1016/j.asoc.2017.11.001.
[30] Maksood, F., Achuthan, G., Analysis of Data Mining Techniques and its Applications, Int J Comput Appl, 2016; 140: 6–14. Doi: 10.5120/ijca2016909411.
[31] Murali, V., Fuzzy equivalence relations, Fuzzy Sets Syst, 1989; 30(2): 155–163. Doi: 10.1016/0165-0114(89)90017-1.
[32] Nahari, J., Qala, A., Rezaei, N., Aghdam, Y., Abdi, R., Comparing the Performance of Machine Learning Techniques in Detecting Financial Frauds, Advances in Mathematical Finance & Applications, 2024; 9(3): 1006–1023. Doi: 10.71716/amfa.2024.22101813.
[33] Pandey, A. C., Kulhari, A., Mittal, H., Tripathi, A. K., Pal, R., Improved exponential cuckoo search method for sentiment analysis, Multimedia Tools and Applications, 2022; 82(16), 23979–24029. Doi:10.1007/s11042-022-14229-5.
[34] Pawlak, Z., Rough set approach to knowledge-based decision support, Eur J Oper Res, 1997; 99(1): 48–57. Doi: 10.1016/S0377-2217(96)00381-7.
[35] Pawlak, Z., Rough classification, Int J Man Mach Stud, 1984; 20(5): 469–483. Doi:10.1016/S0020-7373(84)80022-X.
[36] Pawlak, Z., Rough sets and intelligent data analysis, Inf Sci (Ny), 2002; 147(1): 1–12. Doi: 10.1016/S0020-0255(02)00197-4.
[37] Pieta, P., Szmuc, T., Kluza, K., Comparative Overview of Rough Set Toolkit Systems for Data Analysis, MATEC Web Conf, 2019; 252: 3019. Doi: 10.1051/matecconf/201925203019.
[38] Hu, Q., Yu, D., Liu, J., Wu, N., Neighborhood rough set based heterogeneous feature subset selection, Inf Sci (Ny), 2008; 178(18): 3577–3594. Doi: 10.1016/j.ins.2008.05.024.
[39] Rashedi, E., Nezamabadi-pour, H., Saryazdi, S., GSA: A Gravitational Search Algorithm, Inf Sci (Ny), 2009; 179(13): 2232–2248. Doi: 10.1016/j.ins.2009.03.004.
[40] Sağlam, F., Sözen, M., Cengiz, M.A., Optimization Based Undersampling for Imbalanced Classes, Adıyaman University Journal of Science, 2021; 11(2): 385–409. Doi: 10.37094/adyujsci.884120.
[41] Sepehri, A., Ghodrati, H., Jabbari, H., Panahian, H., Making Decision on Selection of Optimal Stock Portfolio Employing Meta Heuristic Algorithms for Multi-Objective Functions Subject to Real-Life Constraints, Advances in Mathematical Finance & Applications, 2023; 8(2): 645–666. Doi: 10.22034/AMFA.2021.1915292.1525.
[42] Shekhawat, S.S., Sharma, H., Kumar, S., Nayyar, A., Qureshi, B., bSSA: Binary Salp Swarm Algorithm With Hybrid Data Transformation for Feature Selection, IEEE Access, 2021; 9: 14867–14882. Doi: 10.1109/ACCESS.2020.3047773.
[43] Shareef, H., Ibrahim, A.A., Mutlag, A.H., Lightning Search Algorithm, Applied Soft Computing, 2015; 36(C): 315–333. Doi: 10.1016/j.asoc.2015.07.028.
[44] Wang, G.G., Deb, S., Cui, Z., Monarch Butterfly Optimization, Neural Computing and Applications, 2019; 31(7): 1995–2014. Doi: 10.1007/s00521-017-3210-z.
[45] Wang, X., Yang, J., Teng, X., Xia, W., Jensen, R., Feature Selection based on Rough Sets and Particle Swarm Optimization, Pattern Recognition Letters, 2007; 28: 459–471. Doi: 10.1016/j.patrec.2006.09.003.
[46] Zhang, Y., Wang, Y., Research on Classification Model based on Neighborhood Rough Set and Evidence Theory, Journal of Physics: Conference Series, 2021; 1746: 012018. Doi: 10.1088/1742-6596/1746/1/012018.
1 Introduction
Nowadays, a vast amount of data is created every day, and analyzing it is a serious challenge. By data mining, we refer to techniques drawn from different fields such as information technology, mathematical science, and statistical analysis [30,40]. Data mining techniques are useful for analyzing, comprehending, and visualizing the large volumes of data kept in data warehouses, databases, and other types of data repositories [31]. Such techniques must handle extensive datasets containing a large number of features, some of which may be irrelevant to the data mining task. Irrelevant features can degrade the quality of the results obtained through data mining, which in turn reduces the chance of extracting useful knowledge from the dataset. Feature selection is one of the approaches to handling such features [18].

Large datasets are used for accurate classification throughout science and engineering, in fields such as astronomy and medicine. Still, such datasets often contain irrelevant, redundant, and noisy features that may decrease classifier efficiency, so selecting the proper features is important; in many fields of study, feature selection plays a key role [12]. Feature selection reduces redundant and irrelevant features and yields higher clustering accuracy and performance [11].

In real-world applications, the class distribution of data collected from road traffic, medicine, credit cards, banking transactions, and the stock market may be uneven. Clearly, the hidden knowledge in such real-world datasets cannot be discovered without novel, effective techniques for feature selection in imbalanced data, since class imbalance degrades the performance of predictive models [6]. So far, various algorithms have been introduced for feature selection in imbalanced data, and many studies exploit the search power of meta-heuristic algorithms [28,41]. Particle Swarm Optimization (PSO) [23], Differential Evolution (DE) [24], the Gravitational Search Algorithm (GSA) [39], the Water Cycle Algorithm (WCA) [15], the Forest Optimization Algorithm (FOA) [16], Cuckoo Search (CS) [33], the Lightning Search Algorithm (LSA) [43], and Monarch Butterfly Optimization (MBO) [44], among others, have been employed to solve the feature selection problem [41,42]. Meta-heuristic algorithms have also been applied successfully to specific engineering and medical problems, such as optimizing connection weights in neural networks [5], numerical optimization [25], cloud computing [1], and stock market index prediction [20]. Due to the advancement of soft computing, researchers in finance, computer science, and mathematics have paid great attention to optimization research [6]. In addition, several feature selection algorithms have been proposed for diagnosis, classification, categorization, and pattern detection [2,3].

Following the introduction of rough set theory by Pawlak in the early 1980s [35], it has been used as a successful technique for selecting features in classified data. It is applied in artificial intelligence and the cognitive sciences across several fields, including knowledge discovery, expert systems, inductive reasoning, decision-making, intelligent systems, data mining, information systems, pattern recognition, machine learning, and process control [46].
Using rough set theory, we can handle uncertainty, vagueness, and imprecision; the technique is widely used for dimension reduction [45]. A key use of rough set theory is attribute reduction, which means eliminating redundant attributes without losing information [29]. In recent years, various researchers have combined rough set theory with other methods to reduce and select features [26], and many recent studies have combined rough set theory and fuzzy neighborhood theory with other methods for feature selection [4,9,14,21,22].

The paper is structured as follows: Section 2 covers the fundamental concepts of rough sets and neighborhood rough set theory. Section 3 details the proposed method. Sections 4 and 5 present the simulation results and analyze the experimental findings, respectively. Finally, the concluding section summarizes the key insights and implications of the study.
2 Preliminaries
Feature selection preserves essential information while reducing the dimensionality of financial data. Rough set theory is a soft computing tool with various applications in data science, data mining among them. Several studies show that rough set theory is a popular and practical tool for feature selection, and most traditional RST-based feature selection methods depend on reduct computation. In the following, we review the basic concepts underlying feature selection with rough set and neighborhood rough set theory.
2.1. Rough Set Theory
Rough Set Theory (RST) is a mathematical framework for dealing with uncertainty and vagueness in data analysis. It was introduced by Zdzisław Pawlak [36] in the early 1980s and has since become a significant approach in various fields, including data mining, machine learning, and decision-making. RST provides tools for handling imprecise or incomplete information without requiring prior probability distributions.
2.2. Information and Decision Systems
An information system is a pair \( IS = (U, A) \), where:
- \( U \) is a non-empty finite set of objects (the universe);
- \( A \) is a set of attributes (features), where each attribute \( a \in A \) is associated with a function \( f_a : U \to V_a \), mapping objects to their attribute values;
- \( V_a \) is the value set of attribute \( a \).
A decision system is an information system \( (U, A \cup \{d\}) \) whose distinguished decision attribute \( d \notin A \) assigns each object its class label.
2.3. Indiscernibility
For any subset of attributes \( B \subseteq A \), the indiscernibility relation \( IND(B) \) groups objects that have the same values for all attributes in \( B \) [36]:

\[ IND(B) = \{ (x, y) \in U \times U \mid \forall a \in B, \; f_a(x) = f_a(y) \} \tag{1} \]

Two objects are indiscernible with respect to \( B \) when no attribute in \( B \) distinguishes them; \( IND(B) \) is an equivalence relation and partitions \( U \) into equivalence classes \( [x]_B \).
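To make the relation concrete, here is a minimal Python sketch that partitions objects into \( IND(B) \) classes; the toy records and attribute names below are illustrative and are not taken from the paper's dataset.

```python
from collections import defaultdict

def indiscernibility_classes(records, attributes):
    """Partition object indices into IND(B) equivalence classes:
    two objects share a class iff they agree on every attribute in B."""
    classes = defaultdict(list)
    for i, rec in enumerate(records):
        key = tuple(rec[a] for a in attributes)
        classes[key].append(i)
    return list(classes.values())

# Toy decision table (illustrative values only).
records = [
    {"housing": "yes", "loan": "no", "class": 1},
    {"housing": "yes", "loan": "no", "class": 0},
    {"housing": "no", "loan": "yes", "class": 1},
]
print(indiscernibility_classes(records, ["housing", "loan"]))
# -> [[0, 1], [2]]: objects 0 and 1 are indiscernible w.r.t. {housing, loan}
```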
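Equivalence classes are too rigid for the numeric attributes that dominate financial data, so the neighborhood rough set model replaces them with δ-neighborhoods. The definitions below follow the standard formulation of Hu et al. [38]; the notation is a reconstruction and may differ from the paper's own equations. Given a distance metric \( \Delta_B \) restricted to the attributes in \( B \) and a radius \( \delta \ge 0 \), the δ-neighborhood of an object \( x \) is

\[ \delta_B(x) = \{\, y \in U \mid \Delta_B(x, y) \le \delta \,\}. \]

For a decision class \( X \subseteq U \), the neighborhood lower and upper approximations are

\[ \underline{N}_B X = \{\, x \in U \mid \delta_B(x) \subseteq X \,\}, \qquad \overline{N}_B X = \{\, x \in U \mid \delta_B(x) \cap X \neq \emptyset \,\}, \]

and the positive region and dependency degree of the decision \( D \) on \( B \) are

\[ POS_B(D) = \bigcup_{X \in U/IND(D)} \underline{N}_B X, \qquad \gamma_B(D) = \frac{|POS_B(D)|}{|U|}. \]

A subset \( B \subseteq A \) is a reduct when \( \gamma_B(D) = \gamma_A(D) \) and no proper subset of \( B \) preserves this dependency.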
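As a concrete illustration, the following NumPy sketch computes \( \gamma_B(D) \); the Euclidean metric, the default radius, and the assumption of min-max normalized features are choices made for this example, not prescriptions from the paper.

```python
import numpy as np

def dependency(X, y, feature_idx, delta=0.15):
    """Dependency degree gamma_B(D) under delta-neighborhoods, using the
    Euclidean metric on the selected (min-max normalized) features."""
    Xb = X[:, feature_idx]
    # Pairwise distances over the attribute subset B.
    dist = np.linalg.norm(Xb[:, None, :] - Xb[None, :, :], axis=2)
    pos = 0
    for i in range(len(X)):
        neigh = dist[i] <= delta        # delta-neighborhood of object i
        if np.all(y[neigh] == y[i]):    # consistent neighborhood => positive region
            pos += 1
    return pos / len(X)

# Toy demo: labels depend on the first two features only (illustrative data).
rng = np.random.default_rng(0)
X = rng.random((60, 4))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
print(dependency(X, y, [0, 1]), dependency(X, y, [2, 3]))  # first value is typically larger
```

A feature selection routine can then score candidate attribute subsets by \( \gamma_B(D) \), either greedily or with a meta-heuristic search.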
Table 1: Sample objects from the Portuguese bank dataset (nominal attributes)

| U | Job | Marital | Education | Default | Housing | Loan | Contact | Month | Day | Poutcome | Class |
|---|-----|---------|-----------|---------|---------|------|---------|-------|-----|----------|-------|
| 1 | blue-collar | married | basic.4y | unknown | yes | no | telephone | aug | tue | nonexistent | 1 |
| 2 | housemaid | divorced | university.degree | no | yes | yes | cellular | nov | thu | nonexistent | 1 |
| 3 | admin. | married | high.school | no | no | yes | cellular | aug | mon | success | 1 |
| 4 | housemaid | divorced | professional.course | no | yes | no | cellular | nov | mon | success | 1 |
| 5 | technician | married | university.degree | no | yes | yes | cellular | may | fri | nonexistent | 0 |
| 6 | retired | married | university.degree | no | yes | no | cellular | mar | fri | nonexistent | 1 |
| 7 | management | single | basic.4y | no | no | no | telephone | may | mon | nonexistent | 0 |
| 8 | services | married | high.school | unknown | yes | no | telephone | may | mon | nonexistent | 1 |
| 9 | self-employed | divorced | high.school | no | no | no | cellular | sep | tue | success | 1 |
| 10 | admin. | divorced | high.school | no | no | no | telephone | jul | mon | nonexistent | 0 |
Table 2: Sample objects from the Portuguese bank dataset (numeric attributes)

| U | Age | Duration | Campaign | Pdays | Previous | Emp.var.rate | Cons.price.idx | Cons.conf.idx | Euribor3m | Nr.employed | Class |
|---|-----|----------|----------|-------|----------|--------------|----------------|---------------|-----------|-------------|-------|
| 1 | 53 | 1186 | 4 | 999 | 0 | 1.4 | 93.444 | -36.1 | 4.968 | 5228.1 | 1 |
| 2 | 54 | 653 | 1 | 999 | 0 | -0.1 | 93.2 | -42 | 4.076 | 5195.8 | 1 |
| 3 | 31 | 155 | 2 | 4 | 1 | -2.9 | 92.201 | -31.4 | 0.884 | 5076.2 | 1 |
| 4 | 67 | 655 | 2 | 5 | 5 | -1.1 | 94.767 | -50.8 | 1.039 | 4963.6 | 1 |
| 5 | 41 | 170 | 4 | 999 | 0 | -1.8 | 92.893 | -46.2 | 1.313 | 5099.1 | 0 |
| 6 | 73 | 179 | 1 | 999 | 0 | -1.8 | 92.843 | -50 | 1.531 | 5099.1 | 1 |
| 7 | 32 | 73 | 7 | 999 | 0 | 1.1 | 93.994 | -36.4 | 4.858 | 5191 | 0 |
| 8 | 41 | 679 | 2 | 999 | 0 | 1.1 | 93.994 | -36.4 | 4.857 | 5191 | 1 |
| 9 | 39 | 261 | 1 | 3 | 1 | -3.4 | 92.379 | -29.8 | 0.788 | 5017.5 | 1 |
| 10 | 48 | 352 | 2 | 999 | 0 | 1.4 | 93.918 | -42.7 | 4.96 | 5228.1 | 0 |
Table 3: Nominal attributes of the Portuguese bank dataset and their possible values

| No | Attribute | Possible Values |
|----|-----------|-----------------|
| 1 | Job | admin., blue-collar, entrepreneur, housemaid, management, retired, self-employed, services, student, technician, unemployed, unknown |
| 2 | Marital | divorced, married, single, unknown |
| 3 | Education | basic.4y, basic.6y, basic.9y, high.school, illiterate, professional.course, university.degree, unknown |
| 4 | Default | no, yes, unknown |
| 5 | Housing | no, yes, unknown |
| 6 | Loan | no, yes, unknown |
| 7 | Contact | cellular, telephone |
| 8 | Month | jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec |
| 9 | Day | mon, tue, wed, thu, fri |
| 10 | Poutcome | failure, nonexistent, success |
Table 4: Details of input variables of the Portuguese bank dataset

| No | Attribute | Attribute Description | Data Type |
|----|-----------|-----------------------|-----------|
| 1 | Age | What is the customer's age? | Numeric |
| 2 | Job | What is the customer's occupational status? | Nominal |
| 3 | Marital | What is the customer's marital status? | Nominal |
| 4 | Education | What is the customer's educational status? | Nominal |
| 5 | Default | Does the customer have credit in default? | Nominal |
| 6 | Housing | Does the customer have a housing loan? | Nominal |
| 7 | Loan | Does the customer have a personal loan? | Nominal |
| 8 | Contact | By what channel is the customer contacted? | Nominal |
| 9 | Month | In which month did the last contact occur? | Nominal |
| 10 | Day | On which day of the week did the last contact occur? | Nominal |
| 11 | Duration | How long (in seconds) did the last contact last? | Numeric |
| 12 | Campaign | How many times was the customer contacted during this campaign? | Numeric |
| 13 | Pdays | How many days have passed since the customer was last contacted in a previous campaign (999 = never contacted)? | Numeric |
| 14 | Previous | How many times was the customer contacted before this campaign? | Numeric |
| 15 | Poutcome | What was the outcome of the previous marketing campaign? | Nominal |
| 16 | Emp.var.rate | Employment variation rate - quarterly indicator | Numeric |
| 17 | Cons.price.idx | Consumer price index - monthly indicator | Numeric |
| 18 | Cons.conf.idx | Consumer confidence index - monthly indicator | Numeric |
| 19 | Euribor3m | European Interbank Offered Rate, 3-month - daily indicator | Numeric |
| 20 | Nr.employed | Number of employees - quarterly indicator | Numeric |
| 21 | Class | Has the customer subscribed to a term deposit? | Binary |
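The tables above correspond to the Bank Marketing data in the UCI repository [13]. A minimal loading-and-normalization sketch follows; the file name and semicolon separator are assumptions based on the repository's distribution of this dataset.

```python
import pandas as pd

# Assumed local copy of the UCI Bank Marketing data; the file name and the
# semicolon separator follow the repository's bank-additional archive.
df = pd.read_csv("bank-additional-full.csv", sep=";")

nominal = df.select_dtypes(include="object").columns
numeric = df.select_dtypes(exclude="object").columns
print(len(df), "records;", len(nominal), "nominal and", len(numeric), "numeric attributes")

# Min-max normalize numeric attributes so a single neighborhood radius
# delta is comparable across features.
df[numeric] = (df[numeric] - df[numeric].min()) / (df[numeric].max() - df[numeric].min())
```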
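The reduct search described in the abstract can be cast as a set-covering Binary Integer Linear Program. The formulation below is a standard one of this kind (cf. the discernibility-based reduct encodings in [22]) and is offered as a reconstruction of the model class rather than the paper's exact equations. Let \( x_j \in \{0,1\} \) indicate whether attribute \( a_j \) is retained, let \( P \) be the set of object pairs with different decision values, and let \( C(u,v) \subseteq A \) be the attributes that discern \( u \) from \( v \) (in the neighborhood setting, the attributes whose value difference exceeds the radius \( \delta \)). A minimal reduct then solves

\[ \min \sum_{j=1}^{m} x_j \quad \text{s.t.} \quad \sum_{j \,:\, a_j \in C(u,v)} x_j \ge 1 \;\; \forall (u,v) \in P, \qquad x_j \in \{0,1\}. \]

Re-solving with cardinality constraints \( \sum_j x_j = k \) for increasing \( k \), while excluding previously found solutions, enumerates reducts from minimum to maximum cardinality.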
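A genetic algorithm in the spirit of Holland [19] can search this 0-1 space. The sketch below is a minimal illustration; the penalty-based fitness, truncation selection, one-point crossover, and mutation rate are assumptions for the example, not the paper's tuned configuration.

```python
import random

def ga_reduct(cover_sets, m, pop=40, gens=200, pmut=0.05, penalty=10.0):
    """Search for a small attribute subset (a bit-string over m attributes)
    that intersects every discernibility set in cover_sets, i.e. satisfies
    the BILP covering constraints."""
    def fitness(bits):                        # smaller is better
        chosen = {j for j, b in enumerate(bits) if b}
        uncovered = sum(1 for c in cover_sets if not (c & chosen))
        return sum(bits) + penalty * uncovered
    popn = [[random.randint(0, 1) for _ in range(m)] for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=fitness)
        next_gen = popn[:2]                   # elitism: keep the two best
        while len(next_gen) < pop:
            p1, p2 = random.sample(popn[:pop // 2], 2)      # truncation selection
            cut = random.randrange(1, m)
            child = p1[:cut] + p2[cut:]                     # one-point crossover
            child = [b ^ (random.random() < pmut) for b in child]  # bit-flip mutation
            next_gen.append(child)
        popn = next_gen
    best = min(popn, key=fitness)
    return [j for j, b in enumerate(best) if b]

# Toy instance: 5 attributes, three discernibility sets (illustrative only).
print(ga_reduct([{0, 2}, {1, 2}, {3, 4}], m=5))  # e.g. [2, 3] or [2, 4]
```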