    • List of Articles: Decision tree model

      • Open Access Article

        1 - Comparison of Data Mining Models Performance in Rainfall Prediction Using Classification Approach (Case Study: Hamedan Airport Synoptic Weather Station)
        Morteza Salehi Sarbijan, Hamidreza Dezfoulian
        Background and Aim: Rainfall is a complex natural phenomenon and one of the most crucial components of the water cycle, playing a significant role in assessing the climatic characteristics of each region. Understanding the amount and trends of rainfall changes is essential for effective management and more precise planning in the agricultural, economic, and social sectors, as well as for studies related to runoff, droughts, groundwater status, and floods. Additionally, rainfall prediction in urban areas has a significant impact on traffic control, sewage flow, and construction activities.
        Method: The objective of this study is to compare the accuracy of classification models, including Chi-squared Automatic Interaction Detector (CHAID), C5 decision tree, Naive Bayes (NB), QUEST tree, Random Forest, k-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Artificial Neural Network (ANN), in predicting rainfall occurrence using 50 years of data from the synoptic station at Hamedan Airport. 80% of the data is used for training the models and 20% for model validation, and the results of the model runs are compared using the confusion matrix, the Receiver Operating Characteristic (ROC) curve, and the Area Under the Curve (AUC) index. To create the classification variable, the days of the year are categorized into two classes based on the rainfall data: days with rainfall (y) and days without rainfall (n). Data preprocessing is performed using Automatic Data Preprocessing (ADP). Then, Principal Component Analysis (PCA) is employed to reduce the dimensions of the variables.
        Results: PCA reduces the variables to 5 dimensions. Approximately 80% of the available data corresponds to rainless days, while 20% corresponds to rainy days. The KNN model, with an accuracy of 91.9% on the training data, and the SVM model, with 89.13% on the test data, exhibit the best performance among the data mining models. The AUC index for the KNN model is 0.967 for training data and 0.935 for test data, while for the SVM algorithm, it is 0.967 for training data and 0.935 for test data. According to the ROC curve for the Hamedan rainfall data, the KNN model outperforms the other models. Considering the sensitivity index in the confusion matrix, the KNN and SVM models perform better in predicting non-rainfall occurrence for training data. In terms of predicting precipitation occurrence, the RT and KNN models show better results according to the specificity index.
        Conclusion: For the RT, C5, ANN, SVM, BN, KNN, CHAID, and QUEST models, the accuracy on the training data was 86.82%, 89.78%, 89.55%, 89.96%, 88.06%, 91.9%, 88.29%, 87.46%, 91.9%, respectively. For the test data, the accuracy of these models was 83.82%, 87.9%, 88.12%, 89.13%, 87.12%, 89.13%, 87.12%, 88.19%, 86.93%, 86.76%, respectively. The AUC index on the training data for the RT, C5, ANN, SVM, BN, KNN, CHAID, and QUEST models was 0.94, 0.99, 0.94, 0.94, 0.93, 0.97, 0.93, 0.89, respectively. For the test data, this metric was 0.89, 0.89, 0.93, 0.94, 0.92, 0.90, 0.92, 0.88, respectively. Considering both the accuracy metric and the AUC index, the KNN model performed best on the training data and the SVM model on the test data in rainfall prediction.
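The evaluation metrics named in this abstract (accuracy, sensitivity, specificity, AUC) can be sketched in a few lines of plain Python. This is a minimal illustration only, not the authors' procedure: the toy labels, the scores, and the function names `confusion_counts`, `summary`, and `auc` are all assumptions made for the example. The positive class is a parameter because the abstract does not state explicitly which class (y or n) was treated as positive.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def summary(y_true, y_pred, positive):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate).

    Assumes both classes occur in y_true; otherwise a denominator is zero.
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores, positive):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data with the 80/20 class balance mentioned in the abstract:
# 8 days without rainfall ("n") and 2 days with rainfall ("y").
y_true = ["n"] * 8 + ["y"] * 2

# A trivial model that always predicts "no rainfall".
always_dry = ["n"] * 10
baseline = summary(y_true, always_dry, positive="y")
print(baseline)  # accuracy 0.8, sensitivity 0.0, specificity 1.0

# Made-up rain scores from some classifier, one per day.
rain_scores = [0.1, 0.2, 0.1, 0.3, 0.6, 0.2, 0.1, 0.4, 0.7, 0.5]
print(auc(y_true, rain_scores, positive="y"))  # 0.9375
```

With this class balance, a model that never predicts rainfall already reaches 80% accuracy while detecting no rainy day at all, which is why sensitivity, specificity, and AUC are worth reading alongside the accuracy figures.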
      • Open Access Article

        2 - Applying Rough Developed theoretical Models (ERST), Interpretation-Structural Analysis (ISM) and Decision Tree (CART) for Help Auditors to Identify Fraud in the Financial Statements of Companies Listed on the Stock Exchange of Iran
        Davood Hasanpoor, Hasan Valiyan, Mehdi Safari Griyly, Reza Tahmasbizadeh
        The purpose of this research is to use the developed Rough Set Theory model (ERST) to assist auditors in identifying fraud in the financial statements of Iranian companies listed on the stock exchange. The method of this combined study adapts theoretical foundations through critical evaluation to identify the characteristics and criteria of fraud in financial statements (x) and the characteristics of committing fraud through them (y). It then uses the decision tree (CART) and the developed Rough Set Theory model (ERST) to determine the most effective criteria for fraud and how fraud can be carried out in financial statements. The statistical population of the study consisted of 12 expert auditors selected through targeted and homogeneous sampling. In this study, 18 indicators were identified as criteria for fraud and 5 attributes as ways of committing fraud. The results showed that, based on the result of the management decision tree (CART) for the most important indicator of fraud and according to the developed Rough Set Theory model (ERST), accounts receivable are the most important feature of fraudulent behavior. Accordingly, the study concludes that fraud in financial statements can be identified using two indicators, low inventory sales (X12) and high management ownership (X17), based on changes in accounts receivable.
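As general background on how a CART tree ranks candidate criteria, the sketch below computes the Gini impurity decrease obtained by splitting on a binary indicator. It is a generic illustration under stated assumptions, not a reconstruction of the study: the data, the indicator names `indicator_a` and `indicator_b`, and the function names are hypothetical, and the abstract does not say which splitting criterion or software was used.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def gini_gain(feature, labels):
    """Impurity decrease from splitting on a binary (0/1) indicator."""
    n = len(labels)
    left = [lab for f, lab in zip(feature, labels) if f == 0]
    right = [lab for f, lab in zip(feature, labels) if f == 1]
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical toy data: 1 = fraud present, 0 = no fraud.
fraud = [1, 1, 1, 0, 0, 0]
indicator_a = [1, 1, 1, 0, 0, 0]  # separates the two classes perfectly
indicator_b = [1, 0, 1, 0, 1, 0]  # only weakly related to the label

gain_a = gini_gain(indicator_a, fraud)
gain_b = gini_gain(indicator_b, fraud)
print(gain_a, gain_b)  # the indicator with the larger gain is split on first
```

In a CART-style tree, the indicator with the largest impurity decrease is chosen for the first split, which is one common basis for calling it the most important criterion.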
      • Open Access Article

        3 - Corporates Manner and Comparing its Prediction Accuracy with Decision Tree and Bayes Models
        Zohre Arefmanesh, Vahid Zare Mehrjardi, Alireza Mohammadi Nodooshan
        The main objective of this study is to design corporate financial distress prediction models for three industries (basic metals, non-metallic minerals, and machinery and equipment) using the bagging model. Moreover, the prediction accuracies of the designed models are compared with those of the Bayes and decision tree models. The statistical population of this research includes all the corporations in each of these industries. The financial distress criterion employed is that of Article 141 of the commercial code, and the study period is from 2001 to 2016. The results show that, compared with the base models (i.e., decision tree and Bayes), the bagging model has a better average prediction accuracy. Moreover, based on the obtained results, it can be concluded that the bagging, decision tree, and Bayes models are all qualified models for corporate bankruptcy prediction.
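Bagging, as referred to in this abstract, trains many base models on bootstrap resamples of the training data and combines them by majority vote. The sketch below shows the idea with one-split decision trees (stumps) as base learners. Everything in it is an assumption for illustration: the abstract does not state the base learner, the features, or the ensemble size, and the two ratios and their values are made up.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split decision tree: choose the feature and threshold
    that give the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= thr]
            right = [lab for row, lab in zip(X, y) if row[j] > thr]
            left_lab = Counter(left).most_common(1)[0][0]
            right_lab = Counter(right).most_common(1)[0][0] if right else left_lab
            errors = sum(lab != left_lab for lab in left)
            errors += sum(lab != right_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, j, thr, left_lab, right_lab)
    _, j, thr, left_lab, right_lab = best
    return lambda row: left_lab if row[j] <= thr else right_lab


def fit_bagging(X, y, n_models=25, seed=0):
    """Train n_models stumps on bootstrap samples; predict by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: Counter(m(row) for m in models).most_common(1)[0][0]


# Hypothetical firms described by [debt ratio, current ratio].
X = [[0.90, 0.6], [0.80, 0.7], [0.85, 0.5], [0.95, 0.8],
     [0.20, 1.8], [0.30, 2.0], [0.25, 1.5], [0.10, 2.2]]
y = ["distressed"] * 4 + ["healthy"] * 4

predict = fit_bagging(X, y)
print(predict([0.99, 0.4]))
print(predict([0.05, 2.5]))
```

Averaging over resamples reduces the variance of an unstable base learner such as a decision tree, which is the usual explanation for a bagged model beating its base model on average accuracy.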