• List of Articles: Data Mining

      • Open Access Article

        1 - Predicting stock prices using data mining methods.
        Mojtaba Hajigholami
        This article discusses data mining methods for predicting financial markets and analyzing sustainable development in financial matters, and examines the impact and effectiveness of data mining methods in the stock market. The research introduces a machine learning approach that generates information from publicly available data and uses it for accurate prediction. It also explores various data mining methods relevant to financial market analysis, focusing on predicting stock market movements and trends. The study demonstrates that, given the dynamic and variable nature of financial markets influenced by economic, political, and social factors, machine learning and data mining methods can yield more accurate predictions of stock price movements. Given the extent and complexity of financial market data, data mining methods have the potential to discover hidden patterns and determine relationships between variables. Machine learning algorithms such as artificial neural networks, support vector machines, and random forests, alongside statistical analyses, improve the analytical capabilities of analysts and investors in making economic decisions. Furthermore, big data and complex analyses have contributed to the development of intelligent trading strategies that can help optimize investment returns; for example, analysts can enhance the accuracy of their predictions by incorporating sentiment data from social networks into their models. The study emphasizes that sustainable development in financial markets requires a deeper understanding and more precise analysis of data, ultimately leading to stronger data-driven decision-making and trading processes.
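The kind of pipeline this abstract describes, turning public price data into features and classifying the next move, can be illustrated with a minimal sketch. The price series, the three-day feature window, and the nearest-neighbour classifier (a lightweight stand-in for the neural networks, SVMs, and random forests the study names) are all illustrative assumptions, not the paper's method.

```python
# Minimal sketch: predict next-day direction from lagged returns.
# Prices and window length are invented for illustration.

def returns(prices):
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def make_dataset(prices, window=3):
    r = returns(prices)
    X, y = [], []
    for i in range(window, len(r)):
        X.append(r[i - window:i])          # lagged returns as features
        y.append(1 if r[i] > 0 else 0)     # next move up (1) or down (0)
    return X, y

def knn_predict(X, y, query, k=3):
    # vote among the k training points closest to the query
    order = sorted(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)))
    votes = [y[i] for i in order[:k]]
    return 1 if sum(votes) * 2 >= len(votes) else 0

prices = [100, 101, 103, 102, 104, 107, 106, 108, 111, 110, 112]
X, y = make_dataset(prices)
pred = knn_predict(X[:-1], y[:-1], X[-1])  # predict the last, held-out move
```

In practice one would swap the classifier for a trained model and add sentiment features, as the abstract suggests.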
      • Open Access Article

        2 - Presenting a new model for ATM demand scenarios
        Alireza Agha Gholizadeh Sayyar Mohamadreza Motadel Alireza Pour ebrahimi
        In today's competitive world, the ability to recognize and predict customer demand is essential to the success of organizations. Since ATMs are one of the most important channels for cash distribution and one of the most fundamental criteria for assessing a bank's level of service, this paper examines the number of customers visiting ATM devices based on the timing and location of the devices. It seeks a dynamic, functional model for predicting the number of visitors to each ATM as a function of the time and location of the device. To this end, data from 378 ATMs throughout the city of Tehran over a one-month period, comprising 69,418 records, were used. By clustering the statistical data in spatial and temporal dimensions, the model succeeds in learning the pattern in the large-scale data, and a decision-tree predictor built on the clusters can predict the number of visitors to each device. To improve the quality of banking services and the performance of the ATM network, combining the model with optimal ATM placement in spatial and temporal dimensions is proposed.
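The clustering step this abstract describes, grouping ATM records by time and location before fitting a predictor, can be sketched with a tiny k-means in pure Python. The six records, the two features (hour of day, distance from centre), and the fixed initial centres are illustrative assumptions, not the Tehran data set.

```python
# Toy k-means over (hour of day, distance in km) ATM records.

def kmeans(points, centers, iters=10):
    groups = [[] for _ in centers]
    for _ in range(iters):
        # assign each point to its nearest centre
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            groups[j].append(p)
        # move each centre to the mean of its group
        centers = [tuple(sum(v) / len(g) for v in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups

# morning / evening usage pattern at two notional locations
records = [(8, 1.0), (9, 1.2), (8, 0.8), (18, 5.0), (19, 5.5), (18, 4.8)]
centers, groups = kmeans(records, centers=[(8, 1.0), (18, 5.0)])
```

A decision tree would then be trained per cluster, as the paper outlines.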
      • Open Access Article

        3 - A Survey of Internal and International Experiences in the Future Studies System of Web-Based Environmental Scanning
        Morteza Tizchang Mina Ebrahimi
        Abstract: Horizon scanning has been used increasingly throughout the world in recent years. Although there is no precise definition of the term, it is in general a systematic process that uses the best available information to support the best possible decisions: a systematic search for new trends, opportunities, and constraints that affect the pursuit of a set of objectives. The explicit goals of horizon scanning are critical for innovation and development, knowledge management, data storage, and decision making. Key technical solutions used in environmental scanning processes include web crawling, indexing, knowledge mapping, citation, lexical, and co-occurrence analysis, the PageRank algorithm, HTML parsing, data mining, classification, and clustering. In this research, efforts have been made to identify institutions and companies that use or provide services in the field of web-based environmental scanning, to identify their goals and solutions, and to identify and study the techniques and tools they use. The results should help domestic organizations use this new method to reach their goals.
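One of the techniques the survey lists, the PageRank algorithm, is easy to show in miniature: rank flows along links and is redistributed by power iteration. The four-page link graph below is an illustrative assumption, not data from the survey.

```python
# Toy PageRank by power iteration on a small link graph.

def pagerank(links, d=0.85, iters=50):
    pages = sorted(links)
    rank = {p: 1 / len(pages) for p in pages}
    for _ in range(iters):
        new = {}
        for p in pages:
            # rank flowing into p from every page that links to it
            inflow = sum(rank[q] / len(links[q])
                         for q in pages if p in links[q])
            new[p] = (1 - d) / len(pages) + d * inflow
        rank = new
    return rank

links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
rank = pagerank(links)
```

Page "c", with the most inbound links, ends up with the highest score, which is the intuition scanning tools exploit to surface influential web sources.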
      • Open Access Article

        4 - A Model for predicting the need for orthopedic surgery by using data mining techniques
        Seyed Sina Fatemi Razavi Seyed Abdollah Amin Mousavi
        With the expanding use of computers in various aspects of people's lives, a huge amount of data is generated, much of which contains valuable information. Data mining enables us to extract the required information and benefit from it by identifying hidden patterns in data sets and using them for prediction. One area faced with massive data production is healthcare; this study focuses in particular on orthopedics. The research applies data mining techniques to existing data in a hospital database to extract valuable information and predict the probability of fractures that require orthopedic surgery, which may help doctors make decisions more easily, quickly, and accurately when serving patients. The research was conducted using the CRISP-DM methodology. The results show that the combination of the CHAID algorithm and a boosted neural network can provide the desired accuracy in predicting the need for orthopedic surgery.
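The boosting idea the abstract pairs with CHAID can be shown in a toy form: repeatedly fit a weak learner and up-weight its mistakes. Here decision stumps stand in for the neural networks the study actually boosts, and the one-feature data (a score vs. surgery yes/no) is invented for illustration.

```python
import math

def stump_error(xs, ys, w, thr, sign):
    # weighted error of: predict `sign` when x > thr, else -sign
    return sum(wi for xi, yi, wi in zip(xs, ys, w)
               if (sign if xi > thr else -sign) != yi)

def adaboost(xs, ys, rounds=5):
    w = [1 / len(xs)] * len(xs)
    model = []  # list of (alpha, threshold, sign)
    for _ in range(rounds):
        thr, sign, err = min(((t, s, stump_error(xs, ys, w, t, s))
                              for t in xs for s in (1, -1)),
                             key=lambda c: c[2])
        err = max(err, 1e-10)
        if err >= 0.5:
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, thr, sign))
        # up-weight the points this stump got wrong, renormalize
        w = [wi * math.exp(-alpha * yi * (sign if xi > thr else -sign))
             for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def predict(model, x):
    score = sum(a * (s if x > thr else -s) for a, thr, s in model)
    return 1 if score > 0 else -1

xs = [1, 2, 3, 4, 5, 6, 7, 8]          # illustrative clinical score
ys = [-1, -1, -1, 1, 1, 1, 1, 1]       # -1: no surgery, 1: surgery
model = adaboost(xs, ys)
```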
      • Open Access Article

        5 - Presenting a Model for Predicting and Improving Production Quality Using Decision Tree Algorithms and Linear Programming (Case Study: Tiba Surge Arrester Manufacturers in Iran)
        Nadereh Sadat Rastghalam Roya M. Ahari Ahmad Reza Shekarchizadeh Atefeh Amindost
        Today, most industries and factories in the country use statistical quality control tools to improve product quality, but given the high volume of data, a more powerful tool is needed to manage statistical quality control processes. Given the reach of data mining algorithms and their ability to discover rules, this research uses data mining tools to improve and extend the quality control process. First, a failure database is formed; after collecting quality control data, the accuracy of predicting part quality is determined using different decision tree algorithms. In the next step, each of the extracted rules is evaluated using a data envelopment analysis (DEA) model, and finally the workstations are evaluated using the rules that apply to each of them. The statistical population of this study consists of all Tiba surge arresters produced in the Iranian calendar year 1398, with attributes drawn from 9 workstations. Based on the results, the best algorithm for predicting failure is C5, and the most important attributes it selects are cooling quality, hole quality, and cutting quality. The evaluation of the rules was carried out with the DEA model and the most important rules were extracted. Finally, based on solving the model, the stations given corrective priority for the current year are rolling, soldering, and cutting.
      • Open Access Article

        6 - Credit Facilities Applicants Classification by SVM
        A. Toloei Ashlaghi H. Nikoomaram F. Maghdoori Sharabian
        In the banking industry, one issue that must always be considered by credit policy makers is risk management. Among the various risks banks deal with, credit risk is the most important; it arises from losses due to borrowers' inability or unwillingness to meet their credit obligations. To manage and control this risk, classification systems are an undeniable requirement. Such systems determine the class of customers based on existing documents and information. Evidently, these systems help banks select customers well and, through the control and reduction of credit risk, improve the efficiency of providing bank facilities. In this research, an artificial-intelligence-based classification model built on a support vector machine is used to predict the financial performance of a bank's legal (corporate) customers. Specifically, the SVM is combined with other mechanisms such as F-score feature selection and grid search to increase the accuracy of the model in classifying legal customers. The results confirm the improvements in classification accuracy and demonstrate that the SVM can provide better accuracy than other models.
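The F-score feature-ranking step the abstract pairs with the SVM rewards features whose class-conditional means are far apart relative to their within-class spread. A minimal sketch follows; the customer feature values for the two credit classes are illustrative assumptions, not the bank's data.

```python
# F-score of one feature given its values in the two classes.

def f_score(pos, neg):
    mp, mn = sum(pos) / len(pos), sum(neg) / len(neg)
    m = (sum(pos) + sum(neg)) / (len(pos) + len(neg))
    # between-class separation of the means
    between = (mp - m) ** 2 + (mn - m) ** 2
    # within-class sample variances
    within = (sum((x - mp) ** 2 for x in pos) / (len(pos) - 1)
              + sum((x - mn) ** 2 for x in neg) / (len(neg) - 1))
    return between / within

# two candidate features for "good" vs "bad" credit customers
discriminative = f_score(pos=[5.0, 5.2, 4.9], neg=[1.0, 1.1, 0.9])
noisy = f_score(pos=[3.0, 1.0, 5.0], neg=[2.9, 1.1, 5.1])
```

Features scoring above a threshold are kept, and grid search then tunes the SVM hyperparameters on the reduced feature set.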
      • Open Access Article

        7 - Diagnosis of hyperlipidemia in patients based on an artificial neural network with pso algorithm
        Asma Naeimi Minoo Soltanshahi Amir Rajabi
      • Open Access Article

        8 - Developing a Fault Diagnosis Approach Based on Artificial Neural Network and Self-Organizing Map for Occurred ADSL Faults
        Vahid Golmah Mina Tashakori
      • Open Access Article

        9 - Document Analysis And Classification Based On Passing Window
        Zaher Bamasood
      • Open Access Article

        10 - Prediction of Student Learning Styles using Data Mining Techniques
        Esther Khakata Vincent Omwenga Simon Msanjila
      • Open Access Article

        11 - Optimization of weighting-based approach to predict and deal with cold start of web recommender systems using cuckoo algorithm
        Reza Molaee Fard
      • Open Access Article

        12 - Data mining on the possibility study of urban fabric's physical change modeling
        Naji Pejman Ziaei Mohammad Naghizadeh Seyed Mostafa Mokhtabad
        Urban expansion has been a very important topic not only in the management of sustainable development, but also in the fields of remote sensing and geographic information science. Urban planners face the considerable challenge of understanding the complex urban growth process, which involves various factors with different patterns of behavior. Modeling an urban development pattern is a prerequisite to understanding the process and can be the first step in making urban planning decisions. The main issues of importance in land use modeling include spatial dynamics, temporal dynamics, incorporation of the human drivers of land use change, and scale dynamics. Both dynamic simulation models and empirical estimation models have been used to model land use changes. Rule-based simulation models are well suited to incorporating spatial interaction effects and handling temporal dynamics; however, Cellular Automata (CA) models do not focus on interpreting or understanding the spatio-temporal processes of urban growth, and most dynamic simulation models cannot incorporate enough socioeconomic variables. Empirical estimation models use statistical techniques to model the relationships between land use changes and their drivers based on historic data. As an empirical estimation method, regression models have been used in deforestation analysis, agriculture, and urban growth modeling. This paper applies regression to model urban changes in the old part of the city of Kermanshah (the Faizabad neighborhood) from 1956 to 2011, using multi-temporal airborne images as the data source. According to common assumptions, urban physical forms are the result of a complex deliberation process involving many factors. Monitoring the transformation of the urban fabric through airborne images and translating the obtained data provides a systematic database that can be used in empirical analyses.
A grid network was first applied to the images to quantify the results in every cell of the network. In the second step, the value of each cell was recorded for each available period, and Minitab 16 software was used to obtain regression equations from these values. The closest relation between cell values over the observed period was provided by the software as a quadratic equation; evaluating an equation at the desired time gives an estimated value for its cell in the selected period. The approach was calibrated for 2016 by cross-comparing actual and simulated cell values: the same process was applied to 2016 images, cell data were extracted, and after cross-comparison the simulation results agreed with more than 70% of the actual 2016 data, which was satisfactory to approve the calibration. Urban development programs and non-professional interventions in the case study area caused greater disparities and undermined the logic of the model. The simplicity and ease of the proposed model are its main advantages over previous ones. In summary, this model can be used as a quick, responsive way to predict urban changes in specific areas, giving acceptable schematic responses.
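The per-cell trend step the abstract describes, fitting a quadratic to a cell's observed values and extrapolating it forward, can be sketched by passing a quadratic exactly through three observations. The built-up fractions and survey years below are illustrative assumptions, not data from the Faizabad study.

```python
# Fit v = a*t^2 + b*t + c through three (time, value) observations
# by solving the 3x3 Vandermonde system with Cramer's rule.

def fit_quadratic(p1, p2, p3):
    (t1, v1), (t2, v2), (t3, v3) = p1, p2, p3
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    A = [[t1 * t1, t1, 1], [t2 * t2, t2, 1], [t3 * t3, t3, 1]]
    d = det(A)
    vs = (v1, v2, v3)
    coef = []
    for j in range(3):                 # replace column j with the values
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = vs[i]
        coef.append(det(Aj) / d)
    return coef                        # [a, b, c]

# built-up fraction of one grid cell, years measured from 1956
a, b, c = fit_quadratic((0, 0.10), (30, 0.40), (55, 0.75))
predict = lambda t: a * t * t + b * t + c
```

Evaluating `predict` at a later time (e.g. t = 60 for 2016) gives the extrapolated cell value that is then cross-compared with observation, mirroring the paper's calibration step.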
      • Open Access Article

        13 - A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
        Hossein Azizi
        Data envelopment analysis (DEA) is a relatively new data-oriented approach for evaluating the performance of a set of peer entities, called decision-making units (DMUs), that convert multiple inputs into multiple outputs. Within a relatively short period, DEA has developed into a powerful quantitative and analytical tool for measuring and evaluating performance. Toloo et al. (2009) proposed a new DEA model to find the most efficient association rule in data mining and, considering several criteria, created an algorithm for ranking association rules using this model. In the present article, we show that their model only selects an optimistic efficient association rule at random, entirely dependent on the solver or software used for solving the problems, and that their proposed algorithm can only rank optimistic efficient rules at random and is unable to rank optimistic non-efficient DMUs. We note other disadvantages and propose a new approach, "DEA with double frontiers," to create a complete ranking of association rules. A numerical example illustrates the contents of the paper.
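For readers unfamiliar with DEA, the standard input-oriented CCR ratio model that this line of work builds on can be written as follows (a textbook formulation, not reproduced from the paper). Each rule, treated as a DMU, maximizes the ratio of its weighted outputs (e.g. rule quality measures such as support and confidence) to its weighted inputs, subject to no rule's ratio exceeding one:

```latex
\max_{u,v}\; \theta_o \;=\; \frac{\sum_{r} u_r\, y_{ro}}{\sum_{i} v_i\, x_{io}}
\qquad \text{s.t.}\qquad
\frac{\sum_{r} u_r\, y_{rj}}{\sum_{i} v_i\, x_{ij}} \;\le\; 1
\quad (j = 1,\dots,n), \qquad u_r,\; v_i \;\ge\; \varepsilon
```

The "double frontiers" idea the paper proposes evaluates each rule from both the optimistic frontier (this model) and a pessimistic counterpart, combining the two scores into a complete ranking.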
      • Open Access Article

        14 - Designing a smart algorithm for determining stock exchange signals by data mining
        Pantea Maleki-Moghadam Akbar Alem-Tabriz Esmael Najafi
        One of the most important problems in modern finance is finding efficient ways to summarize and visualize the stock exchange market. This research proposes a smart algorithm that uses the valuable big data generated by the stock exchange market, together with several methodologies, to present a smart model. We investigate relationships in the data and access their latent information from an enormous amount of data that has a significant impact on investors' decisions. First, technical indicators are extracted from different points of the charts for two groups of stock exchange companies, petrochemical and automotive, from 1387 to 1396 in the Iranian calendar; the data are then clustered and analyzed using the k-means algorithm and data mining methodology. The contributions of this paper are: (1) creating a model with twenty technical indicators across different stock exchange companies and industries; and (2) evaluating the proposed model and predicting sell signals at the maximum points, which performs well and achieves acceptable accuracy.
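The indicator-extraction step comes first in the pipeline the abstract outlines. Two representative examples of the kind of technical indicators involved, a simple moving average and a momentum ratio, are sketched below; the closing-price series is an illustrative assumption, and the paper's actual set of twenty indicators is not reproduced here.

```python
# Two simple technical indicators computed from closing prices.

def sma(prices, n):
    # simple moving average over the last n closes
    return [sum(prices[i - n + 1:i + 1]) / n
            for i in range(n - 1, len(prices))]

def momentum(prices, n):
    # price relative to its level n periods ago
    return [p / q for p, q in zip(prices[n:], prices)]

closes = [10, 11, 12, 11, 13, 14, 13, 15, 16, 15]
s = sma(closes, 3)
m = momentum(closes, 5)
```

Each stock-day becomes a vector of such indicator values, and those vectors are what k-means clusters.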
      • Open Access Article

        15 - Using Data Mining and Three Decision Tree Algorithms to Optimize the Repair and Maintenance Process
        M. Izadikhah D. Garshasbi
        The purpose of this research is to predict device failures using data mining tools. An appropriate database was first constructed, consisting of 392 records of failures in a pharmaceutical company in the Iranian calendar year 1394; 9 characteristics were then analyzed, with the type of failure as the class attribute. Three decision tree algorithms were used to determine the most important attributes and the effective rules for failure. Based on the feature-selection results of all three algorithms, the machine's lifetime, the machine's name, and the time since the last failure are recognized as the most important attributes; of these, the life of the device is especially important. Since depreciation in the pharmaceutical industry is high, the lifetime of the devices has a particular effect on maintenance and repair. Machines with a lifespan of more than 20 years are subject to high depreciation and failure and, in addition to the usual repairs, need special repairs.
      • Open Access Article

        16 - Presentation of a Two Stages Model Based on Data Mining for Evaluation of Common Customers of Bank and Insurance Companies
        Hamidreza Amir Hasankhani Abbass Toloie Alireza Poorebrahimi Reza Radfar
        Knowledge discovery from databases and data mining are among the most important tools for customer relationship management and can help an organization find useful information and interesting knowledge. Today, banks and insurance companies have numerous, extensive databases containing information about transactions and other details related to their customers, and valuable business information can be retrieved from these data warehouses. However, such analyses and decision support are not possible using traditional reporting languages. Therefore, given the importance of information about the customers a bank and an insurer share, this information should be analyzed as carefully as possible. In this research, by collecting and analyzing the information of joint customers of a bank and an insurance company, a data mining methodology is presented to evaluate customers according to their functional indicators in banking and insurance. We also predict the behavior of new customers by analyzing historical customer behavior, using a two-step approach based on unsupervised and supervised learning.
      • Open Access Article

        17 - A hybrid Model on the Basis of Data Envelopment Analysis and Data Mining Techniques to Analyze the Investment Behavior in Stock Exchange: A Real Case Study in Tehran Stock Exchange
        Saiedeh Molla Hosseinagha Kaveh Khalili-Damghani
      • Open Access Article

        18 - Data Mining as an Intangible Model of Information Therapy and Seeking Behaviors in Immune Deficiency Disease Specialists
        Sedigheh Mohammadesmaeil Shiba Kianmehr
        Introduction: This study analyzed the information therapy behavior of immunologists in the country, based on the Kohonen self-organizing neural network model. Method: This applied research was conducted by a descriptive survey method using a neural network technique. The tool was a researcher-made questionnaire distributed among 149 people. Using MATLAB software, specialists were clustered based on the main components of the research, and then, by removing each of the main sub-components, the most effective and least effective options were determined. Results: The analysis showed that for information retrieval skills, 63.75% of the population are in the first cluster with an average of 29.88 and 36.24% are in the second cluster with an average score of 30.22; the most important component is the use of keywords and terms related to the required information. Regarding ways to obtain information, 22.14% of the population with an average score of 54.36 were in the first cluster, 18.12% with an average of 48.11 in the second, 14.09% with an average of 43.28 in the third, 16.1% with an average of 49.04 in the fourth, and 29.53% with an average score of 53.72 in the fifth; the most important way to find information was using electronic information sources. Based on the use of various information services, 46% of people with an average score of 54.85 were in the first cluster, 20.66% with an average of 49.38 in the second, and 32.66% with an average of 43.08 in the third; the most important component of information therapy services was familiarity with various sources and information services in the specialized field.
Conclusion: Neural clustering of the information therapy behaviors of the study population, and of the resulting information transactions, raises awareness of the needs and information resources required by users. As an accessible, low-cost method, it improves the quality of information available to immunodeficiency specialists and thus leads to more effective medical services for patients; it provides the basis for anticipating information-oriented arrangements and decisions to meet the needs of users of medical databases, and supports managers and staff in this field. As an effective strategy meeting the highest possible standards, it leads to the discovery of the intangible pattern of health users' information seeking behaviors and teaches the audience to use information media intelligently.
      • Open Access Article

        19 - Customer Retention Based on the Number of Purchase: A Data Mining Approach
        Sahar Mehregan Reza Samizadeh
      • Open Access Article

        20 - Retaining Customers Using Clustering and Association Rules in Insurance Industry: A Case Study
        R. Samizadeh S. Mehregan
      • Open Access Article

        21 - A Study to Improve the Response in Email Campaigning by Comparing Data Mining Segmentation Approaches in Aditi Technologies
        P. Theerthaana S. Sharad
      • Open Access Article

        22 - Identifying the Factors Affecting Marketing Success at One of the Branches of Tejarat Bank Using Data Mining Techniques
        Mehdi Ghazanfari Aghdas Badiee Fatemeh Moslehi
        Given the competitive market in the banking industry, the importance of customer relationship management is increasing day by day, and one of its most important elements is active, effective marketing. This research therefore tries to identify and investigate the factors affecting the success of marketing activities using data mining tools: the factors that make bank customers more willing to open long-term deposits. The results will help increase the rate of return on direct marketing in the banking industry, which has received little attention in previous studies. The data set relates to a telemarketing campaign conducted at one of the branches of Tejarat Bank from May 2016 to September 2018. In terms of objective, this is applied research; in terms of methodology, it is mixed, i.e., both qualitative and quantitative. The decision variable is the success or failure of the telephone marketing contact. First, customers are divided into six clusters using the K-means clustering algorithm. Next, the C5 and CART decision tree algorithms are used to identify the factors affecting the success of the marketing campaign. As a general conclusion from the three algorithms, the duration of the conversation with the client has the greatest effect, compared with the other variables, on the person's decision to open a deposit. It should be noted that since the time period and place of data gathering were limited, the results are not likely to generalize to other banks.
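The split criterion behind the CART step the abstract describes picks, for each feature, the threshold that minimizes the weighted Gini impurity of the success label. A minimal sketch follows; the call-duration records and labels are illustrative assumptions, not the Tejarat Bank data.

```python
# Find the best Gini split on one numeric feature.

def gini(labels):
    # impurity of a binary label set: 2p(1-p)
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(xs, ys):
    best = None
    for thr in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        score = (len(left) * gini(left)
                 + len(right) * gini(right)) / len(ys)
        if best is None or score < best[1]:
            best = (thr, score)
    return best

durations = [30, 45, 60, 300, 420, 600]   # call length in seconds
opened    = [0,  0,  0,  1,   1,   1]     # deposit opened?
thr, impurity = best_split(durations, opened)
```

On this toy data the split lands exactly where long calls separate successes from failures, mirroring the paper's finding that conversation duration dominates.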
      • Open Access Article

        23 - Providing an optimization model of inventory control costs in Tehran ATMs
        Alireza Agha Gholizade Sayar Hossein Shirazi Mahdi Izadyar Mohamad Mahdi Fattah Damavandi
        Since cost management is one of the most important tasks of organizations, managing the costs of the ATM inventory control system is likewise one of the most basic tasks of banks. This article seeks to provide a dynamic, optimal model for controlling the inventory costs of ATMs according to the time and place of each device. Data from the relevant bank in Tehran, covering 368 ATMs, were used, and the behavior of the devices was investigated over a three-month period in the Iranian calendar year 1396. By clustering the statistical data in spatial and temporal dimensions, the model succeeds in learning the existing pattern in the big data, and on this basis the proposed decision tree can predict the number of customers at each device. Then, using the cost function for the obtained scenarios, the system costs are determined; the total cost of the system includes the holding cost of money, the shortage cost, and the ordering cost for each device. Finally, by providing an optimized inventory control model for each scenario, total system costs are reduced by an average of 16.5 percent, or 38 million tomans per month.
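The cost trade-off the abstract optimizes is the classic inventory one: ordering cost falls, and holding cost rises, with the replenishment quantity. As a simplified sketch (ignoring the shortage term the paper also models), the textbook economic order quantity balances the two; the demand and cost figures below are illustrative assumptions, not the bank's data.

```python
import math

def total_cost(q, demand, order_cost, hold_cost):
    # orders per period * cost per order + average stock * holding rate
    return demand / q * order_cost + q / 2 * hold_cost

def eoq(demand, order_cost, hold_cost):
    # quantity at which ordering and holding costs balance
    return math.sqrt(2 * demand * order_cost / hold_cost)

# illustrative cash demand per period, per-refill cost, holding rate
d, s, h = 9_000_000, 500_000, 0.02
q_star = eoq(d, s, h)
```

The paper's per-scenario model adds a shortage cost and lets demand vary with the clustered time/place scenario, but the balancing logic at the optimum is the same.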
      • Open Access Article

        24 - Investigating the Effect of Different Data Clustering Methods on the Accuracy of Models Related to Accounting Estimates by Comparing Traditional and Classical Clustering Methods
        S. Mohsen Salehi Vaziri Jamal Barzaghi Khaneghah
        Today, accounting, like other disciplines, relies on estimation because not all information is accessible. For this reason, this research studies the accuracy of accounting estimation models under different clustering methods, to determine how different clustering methods increase the accuracy of the models and which method is preferable for doing so. The research sample consisted of 99 companies listed on the Tehran Stock Exchange; the required data were collected from the companies' financial statements and notes over a 9-year period (2008-2017). The results showed that using different clustering methods increases the accuracy of accounting estimation models in most cases. However, among the clustering methods used in the research, the classic clustering method is more appropriate than the traditional approach for increasing the accuracy of accounting estimation models.
      • Open Access Article

        25 - Machine learning clustering algorithms based on Data Envelopment Analysis in the presence of uncertainty
        Reza Ghasempour Feremi Mohsen Rostamy-Malkhalifeh
      • Open Access Article

        26 - Performance Evaluation of M5 Tree Model and Support Vector Regression Methods in Suspended Sediment Load Modeling
        Mohammad Taghi Sattari علی رضازاده جودی Forugh Safdari فراز قهرمانیان
        Sediment transport has always affected rivers and civil structures, and uncertainty about its exact amount causes heavy damage. Properly estimating the sediment load of rivers is therefore very important for sediment, erosion, and flood control. This study applied two data mining methods, the M5 model tree and support vector regression, alongside the classical sediment rating curve to estimate the suspended sediment load of the Aharchay River. Three criteria were used to assess the performance of the methods: the correlation coefficient, root mean square error, and mean absolute error. A sensitivity analysis of the models' input variables showed that the current month's flow discharge had the greatest effect on the suspended sediment load. The results demonstrated the higher accuracy of the data mining methods compared with the sediment rating curve. Although both data mining methods were more accurate and less error-prone than the conventional rating curve, the M5 model tree is recommended for similar cases because it yields simple, understandable linear relationships.
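The abstract above scores the models with three criteria: correlation coefficient, root mean square error (RMSE), and mean absolute error (MAE). A minimal sketch of those three metrics in plain Python follows; the observed/predicted sediment values are hypothetical, not from the study.

```python
import math

def correlation(obs, pred):
    """Pearson correlation coefficient between observed and predicted series."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

# Hypothetical suspended-sediment observations vs. model output (tonnes/day)
observed  = [120.0, 95.0, 310.0, 45.0, 180.0]
predicted = [110.0, 100.0, 290.0, 55.0, 170.0]
print(correlation(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

A model with higher correlation and lower RMSE/MAE is preferred, which is how the study compares the M5 tree and SVR against the rating curve.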
      • Open Access Article

        27 - Modelling monthly runoff by using data mining methods based on attribute selection algorithms
        محمدتقی ستاری Ali Rezazadeh Joudi
        Given the importance of catchment outflow for surface water management, a precise understanding of the relationship between runoff and climatic parameters such as precipitation and temperature is essential, so identifying the relevant parameters is an important part of the modeling process. In this paper, after homogeneity tests were carried out on monthly precipitation, temperature, and runoff data for the Navroud catchment in Iran, two combinations of factors affecting runoff were selected using the Relief and Correlation algorithms. The Relief algorithm identifies effective features within a dataset in an orderly manner, which is especially useful when little data is available; it relies on a weight vector averaged over the data and a threshold value. Monthly runoff was then modeled from the two proposed combinations using support vector regression and the nearest neighbor method. The results showed that support vector regression with a radial basis function kernel yields higher accuracy and lower error than the nearest neighbor method for estimating runoff, with the improvement particularly noticeable in flood situations.
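The Relief family of algorithms mentioned above weights each feature by how well it separates a sample's nearest neighbor of the same class ("hit") from its nearest neighbor of a different class ("miss"). A minimal sketch of the basic Relief weighting follows; the toy data and variable names are illustrative, not from the paper.

```python
import math

def relief_weights(X, y):
    """Basic Relief: weight features by how well they separate each point's
    nearest hit from its nearest miss. Assumes numeric features on comparable
    scales and binary class labels."""
    m, n_feat = len(X), len(X[0])
    w = [0.0] * n_feat

    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    for i in range(m):
        hits = [j for j in range(m) if j != i and y[j] == y[i]]
        misses = [j for j in range(m) if y[j] != y[i]]
        h = min(hits, key=lambda j: dist(X[i], X[j]))    # nearest same-class point
        s = min(misses, key=lambda j: dist(X[i], X[j]))  # nearest other-class point
        for f in range(n_feat):
            w[f] += (abs(X[i][f] - X[s][f]) - abs(X[i][f] - X[h][f])) / m
    return w

# Toy data: feature 0 tracks the label, feature 1 is noise.
X = [[0.1, 0.7], [0.2, 0.1], [0.15, 0.9], [0.9, 0.8], [0.85, 0.2], [0.95, 0.5]]
y = [0, 0, 0, 1, 1, 1]
w = relief_weights(X, y)
print(w)  # feature 0 should receive the larger weight
```

Features whose weight exceeds a chosen threshold are retained, which is how a Relief-style filter produces a reduced input combination for the runoff models.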
      • Open Access Article

        28 - Designing a hybrid intelligent model for predicting the Financial Richness
        fatemeh shahbazadeh ebrahim abbasi Hosein Didehkhani Ali Khozean
        This study presents an intelligent model for predicting financial richness in securities companies as a decision support system. Based on a review of prior research, seventeen variables were extracted as predictors of the financial richness class from valid sources of the central securities site G.A.A for the years 1390-1395. Data from the securities industry over that period were used to conduct the study. First, the results of applying different data mining prediction models were compared; second, the prediction algorithms were ranked. The findings showed that financial richness can be predicted with acceptable precision, and that the model extracted using the decision tree has very high precision and capability.
      • Open Access Article

        29 - Explaining the role of personality and structural characteristics of management on the competitiveness in the market of products of companies listed on the Tehran Stock Exchange: Emphasis on data mining models and data envelopment analysis
        Amir Faridnia Mohsen Lotfi Behrooz Eskandarpoor
        This study examines the effect of a set of personality and structural characteristics of management (short-sightedness, optimism, conservatism, board independence, gender diversity of the board, and the stability and ability of management) on competitiveness in the product market of companies listed on the Tehran Stock Exchange. It uses a sample of 144 listed companies in the period 2007-2007 and applies statistical methods including data envelopment analysis, a regression model based on panel data, and data mining methods (a neural network model and a decision tree). The regression findings showed that, among the studied variables, short-sightedness, board independence, and the stability and ability of management had a significant effect on product market competitiveness. Comparing the regression results with the data mining models showed that, among the management characteristics, only short-sightedness and the ability and stability of management had a significant effect on competitiveness in the companies' product markets.
      • Open Access Article

        30 - Designing a Hybrid Intelligent Model for Prediction of Stock Price Golden Points
        Mohammad Moshari Hosein Didehkhani Kaveh Khalili Dameghani Ebrahim Abbasi
        The purpose of this research is to provide an intelligent model for predicting golden points on a stock price chart as a decision support system. Data from the automotive and parts manufacturing industry for 2001 through 2016 were used. First, the results obtained from different data-mining-based forecasting models were compared. Next, the research variables were optimized with a genetic algorithm and the models were rebuilt. The results indicated that golden points can be predicted with reasonable accuracy; optimization did not enhance accuracy in all models, but it significantly reduced gross error.
      • Open Access Article

        31 - Using data mining techniques to measure tax risk of value added taxes
        Mohammad Masihi Ahmad Yaghoobnejad Amirreza Keyghobadi Taghi Torabi
        This paper uses data mining to study the risk of value added tax (VAT) payers. Assessing taxpayer risk matters for formulating an effective plan for selecting taxpayers for tax audit, with the goal of increasing the efficiency and effectiveness of the country's VAT system. Taxpayers are categorized into three groups: risk-free, low-risk, and risk-averse. Two techniques were used to assess tax risk: a support vector machine and logistic regression. The research population consists of large legal entities in Tehran that were subject to tax audit in the VAT system from 2012 to 2015. The variables, which include corporate governance mechanisms, specific corporate features, the nature of the activity, control-system factors, and tax ratios, are used to train and apply the models. The results show that the SVM and logistic techniques each reach a reliability of about 70 percent, while an integration of their results achieves nearly 83 percent reliability, a higher potential.
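The abstract reports that integrating the SVM and logistic outputs beats either model alone. One common way to integrate two probabilistic classifiers is soft voting: average their class-1 probabilities and threshold the result. A minimal sketch follows; the per-taxpayer probabilities and labels are hypothetical, not the paper's data or its exact integration scheme.

```python
def soft_vote(p1, p2, threshold=0.5):
    """Average the class-1 probabilities of two classifiers and threshold."""
    return [1 if (a + b) / 2 >= threshold else 0 for a, b in zip(p1, p2)]

def accuracy(pred, truth):
    """Fraction of predictions matching the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

# Hypothetical per-taxpayer risk probabilities from an SVM and a logistic model
svm_probs = [0.9, 0.4, 0.6, 0.2, 0.55, 0.8, 0.3, 0.45]
log_probs = [0.8, 0.6, 0.7, 0.1, 0.40, 0.9, 0.2, 0.65]
truth     = [1,   0,   1,   0,   0,    1,   0,   1]

combined = soft_vote(svm_probs, log_probs)
print(accuracy(combined, truth))
```

Averaging damps the individual models' idiosyncratic errors, which is the intuition behind the reliability gain the abstract describes.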
      • Open Access Article

        32 - Stock Selection Using a Hybrid Method of Data Envelopment Analysis and the Imperialist Competitive Algorithm
        F. Faezy Razi
        This paper presents a new framework for constructing a stock portfolio and discusses how an optimal portfolio is designed with the present approach compared with earlier methods. First, an investment portfolio is formed using the CHAID data mining algorithm based on a risk criterion. Next, a second investment portfolio is created from decision rules extracted with the DEA-BCC model. The final portfolio is produced by a bi-objective programming model based on the imperialist competitive algorithm. The proposed methodology is applied in a case study on the Tehran Stock Exchange. The results of the CHAID algorithm on the risk output field show that the candidate stocks do not all fall into one class, so each class of candidate stocks must be evaluated independently of the other classes. The results of applying the imperialist competitive algorithm at small and medium scales, based on the Taguchi method, show that the studied stocks are calibrated under the applied method. Unlike other stock portfolio selection methods, this paper first classifies the stocks with the CHAID algorithm; the stocks in each class are then evaluated independently with the DEA-BCC model, and finally the optimal portfolio is selected by the imperialist competitive algorithm.
      • Open Access Article

        33 - Investigating Management Factors Affecting Weed Biodiversity Indices and Yield of Wheat Fields in Chenaran Township Using the CART Decision Tree
        setayesh kheradmand Behnam Kamkar javid gherekhloo mohammad hasan hadizadeh ghorbanali rasam
        In order to study the effect of field management methods and environmental factors on wheat (Triticum aestivum L.) yield and weed control, weeds were surveyed over two consecutive years in 200 farms of 20 villages located in the four directions of Chenaran Township, Iran. Sampling was carried out along a W-shaped pattern using a 0.25 m² quadrat. Weed species were identified and their number per square meter was determined; the Shannon-Wiener index and the Simpson evenness index were then calculated to measure biodiversity. Quantitative and qualitative management factors were gathered through a farmer questionnaire: all information on agronomic management, including land area, farming experience, seedbed preparation, and weed control, was recorded during the growing season, and the actual yield obtained by each farmer was recorded at its end. Analysis with the Classification and Regression Trees (CART) method showed that, among the parameters examined, agricultural experience, the number of dual-purpose herbicides (herbicides that control both grasses and broadleaf weeds), nitrogen, potassium, summer planting in the two years preceding the wheat crop, and farmer age produced significant changes in the Shannon-Wiener and Simpson indices. The most important management factors affecting wheat yield were the splitting of fertilizer applications, the number of dual-purpose herbicides, fertilizer, rotation, and educational level. The results showed that appropriate amounts of potassium and nitrogen fertilizer and the selection of suitable rotations are effective management strategies for improving wheat yield and increasing biodiversity in the Chenaran area. Keywords: Simpson index, Shannon-Wiener index
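CART regression trees grow by repeatedly choosing the split that most reduces the squared error (variance) of the target within the resulting branches. A minimal sketch of that split search on one feature follows; the nitrogen/yield numbers are invented for illustration and are not the study's data.

```python
def best_split(X_col, y):
    """Find the threshold on one feature that most reduces the sum of squared
    errors of y, the criterion CART uses for regression trees."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best = (None, sse(y))  # no split: SSE of the whole node
    for t in sorted(set(X_col)):
        left = [yi for xi, yi in zip(X_col, y) if xi <= t]
        right = [yi for xi, yi in zip(X_col, y) if xi > t]
        score = sse(left) + sse(right)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical data: nitrogen dose (kg/ha) vs. wheat yield (t/ha)
nitrogen = [60, 80, 100, 120, 140, 160]
yield_t  = [2.1, 2.3, 2.2, 3.8, 4.0, 3.9]
threshold, score = best_split(nitrogen, yield_t)
print(threshold)
```

A full CART implementation applies this search across all candidate features (management and environmental factors alike) and recurses on each branch, which is what lets the tree rank the factors by importance.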
      • Open Access Article

        34 - Improving Intrusion Detection Systems by Feature Reduction Based on a Genetic Algorithm and Data Mining Techniques
        Mehdi Keshavarzi hossein Momenzadeh
        Network-based computer systems play a critical role in modern society, so there is a high chance that they will be the target of intrusions and attacks. Firewalls and other intrusion prevention mechanisms are not always enough to secure a computer network fully; additional systems, called intrusion detection systems, are needed. An intrusion detection system is a set of tools, algorithms, and evidence that helps identify, locate, and report illegal or unapproved network activity. Intrusion detection systems can be implemented in software or hardware, each with its own advantages and disadvantages. Because intrusion detection data has many characteristics, this research selects the effective ones using an improved genetic algorithm and then, by means of standard data mining techniques, presents a model for data classification. To evaluate the performance of the suggested method, we used the NSL-KDD database, whose records are more realistic than those of other intrusion detection datasets.
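Genetic feature selection of the kind described above typically encodes each candidate feature subset as a bit mask, scores it with a classifier-based fitness, and evolves the population by selection, crossover, and mutation. The sketch below is a toy stand-in (1-NN leave-one-out accuracy with a size penalty as fitness, on invented data), not the paper's improved algorithm or the NSL-KDD features.

```python
import random

random.seed(0)

def fitness(mask, X, y):
    """Score a feature subset: 1-NN leave-one-out accuracy minus a small
    penalty per selected feature, so smaller subsets win ties."""
    feats = [i for i, b in enumerate(mask) if b]
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum((X[i][f] - X[j][f]) ** 2 for f in feats))
        correct += y[nearest] == y[i]
    return correct / len(X) - 0.01 * len(feats)

def evolve(X, y, n_feat, pop_size=10, gens=15):
    """Evolve bit-mask chromosomes with truncation selection,
    one-point crossover, and point mutation."""
    pop = [[random.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n_feat)   # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(n_feat)        # point mutation
            child[i] = 1 - child[i]
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda m: fitness(m, X, y))

# Toy data: feature 0 is informative, features 1-3 are noise.
X = [[0.0, 0.9, 0.2, 0.7], [0.1, 0.1, 0.8, 0.3], [0.2, 0.5, 0.5, 0.9],
     [1.0, 0.8, 0.3, 0.2], [0.9, 0.2, 0.7, 0.6], [0.8, 0.4, 0.1, 0.1]]
y = [0, 0, 0, 1, 1, 1]
best = evolve(X, y, 4)
print(best)
```

The reduced mask then defines the inputs handed to the downstream classifier, shrinking training time without hurting detection accuracy.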
      • Open Access Article

        35 - Improving of Diabetes Diagnosis using Ensembles and Machine Learning Methods
        Razieh Asgarnezhad Karrar Ali Mohsin Alhameedawi
      • Open Access Article

        36 - Improving Students' Performance Prediction using LSTM and Neural Network
        Hussam Abduljabar Salim Ahmed Razieh Asgarnezhad
      • Open Access Article

        37 - Comparison of the classification methods in software development effort estimation
        Sadegh Ansaripour Taghi Javdani Gandomani
        Introduction: The main goal of software companies is to provide solutions in various fields that better meet customers' needs. Successful modeling depends on finding the right, accurate requirements. The key to successful development, and to adapting and integrating the separately developed parts, is selecting and prioritizing the requirements that advance the workflow and ultimately lead to a quality product. Validation is the crucial part of this work: it comprises techniques that confirm that a set of requirements is accurate for building a solution that meets the project's business objectives. Requirements change during a project, and managing those changes is important for ensuring that the software built is right for its stakeholders. In this research, we discuss the process of checking and validating software requirements.
        Method: Requirements are extracted through the discovery, review, documentation, and understanding of user needs and system constraints. The results are presented as products such as textual requirement descriptions, use cases, processing diagrams, and user interface prototypes.
        Findings: Data mining and recommender systems can be used to elicit the necessary requirements; alternatively, social networks and collaborative filtering can be used to identify needs and create requirements for large projects.
        Discussion: In product development, requirements engineering approaches focus exclusively on requirements development, and the involvement of human resources creates challenges in the development process. If these challenges are not recognized at this stage, fixing them after the software is produced becomes extremely expensive, so errors should be minimized and identified and corrected as early as possible. Our investigation shows that a key issue in requirements work is validation, which confirms, first, that the requirements can be implemented as a set of characteristics matching the system description and, second, that they possess the essential characteristics: completeness, consistency, conformance to standard criteria, freedom from contradiction, absence of technical errors, and lack of ambiguity. The purpose of validation is to ensure that a sustainable, reproducible product is created according to the requirements.
      • Open Access Article

        38 - Using data mining to identify factors affecting students' academic failure
        Mahmood Najafi Mehdi Afzali Mahmood Moradi
        Knowledge extraction is one of the most significant problems in data mining. Rules stated in if-then form can be converted into real numbers as values included in the dataset. The method suggested in this research applies decision tree algorithms, clustering, and association rules to extract final rules. In the proposed method, rule extraction is defined as an optimization problem whose objective is a rule with high confidence, generalization, and understandability. The suggested algorithm was derived from, and tested on, a dataset on the academic failure of 256 art school students in Zanjan. The results indicated that the J48 decision tree algorithm, with an accuracy of 0.95, is the right choice for this dataset. Clustering was done with the K-Means algorithm at a confidence coefficient of 0.95. Finally, rules with high confidence coefficients were obtained from the whole dataset using association rules produced by the Apriori algorithm. The results of this study could be used to curb students' academic failure, improve the quality of the relationship between parents, school authorities, and students, and enhance the education students receive.
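The Apriori algorithm named above finds all itemsets whose support exceeds a threshold, growing candidates level by level from the frequent sets of the previous level. A minimal sketch follows (it omits Apriori's subset-pruning step and just counts candidates); the student records are invented attribute=value items, not the Zanjan dataset.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return frequent itemsets (frozensets) mapped to their support."""
    n = len(transactions)
    candidates = {frozenset([i]) for t in transactions for i in t}
    freq, k = {}, 1
    while candidates:
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        freq.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        keys = list(level)
        candidates = {a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1}
        k += 1
    return freq

# Hypothetical student records coded as attribute=value items
records = [
    {"absence=high", "grade=fail", "support=low"},
    {"absence=high", "grade=fail"},
    {"absence=low", "grade=pass", "support=high"},
    {"absence=high", "grade=fail", "support=high"},
    {"absence=low", "grade=pass"},
]
frequent = apriori([frozenset(r) for r in records], 0.6)
print(sorted(map(sorted, frequent)))
```

If-then rules are then read off each frequent itemset (e.g. absence=high → grade=fail), keeping those whose confidence clears the chosen threshold.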
      • Open Access Article

        39 - Designing a hybrid model for classification of imbalanced data in the field of third party insurance
        Mahnaz Manteqipour parisa Rahimkhani
        The largest part of Iran's insurance industry portfolio is compulsory civil liability insurance of motor vehicle owners against third parties, so modeling the behavior of this line of insurance helps provide better services to the industry's customers. Predicting claim rates for insurance policies from the features recorded for each policy is one of the industry's problems that data mining techniques can solve. Insurance is designed around the law of large numbers: a sufficient number of policies are issued, only a small fraction of them incur claims, and the cost of those claims is compensated from the sum of the premiums. The insurance industry therefore faces imbalanced data, which creates many challenges for classification. In the third-party insurance dataset used in this research, there are 14 features for every policy, and the imbalance ratio is 1 to 0.0092, which is considered severe.
        Method: This research addresses the classification of severely imbalanced data in third-party insurance. To overcome the imbalance problem, two hybrid models with different architectures were designed on top of five base models: Gaussian naive Bayes, support vector machine, logistic regression, decision tree, and nearest neighbor. The first hybrid model draws random samples from the whole dataset and applies a resampling method before classification; the second selects samples from each label separately and applies a classification model to all the selected data. The results of the two models are compared.
        Results: The results show that the proposed hybrid models predict the occurrence or non-occurrence of traffic accident claims better than other data mining algorithms. Standard measures such as precision and recall show that the second hybrid model performs better. In the ensemble phase, the number of models taking part in simple voting is a hyperparameter that can be adjusted to the company's strategy. Also, using a decision tree to combine the base models into a hybrid model gives better results than simple voting over them.
        Discussion: Further research on imbalanced classification could apply more sophisticated resampling algorithms and compare the results.
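The resample-then-vote architecture described above can be illustrated with random undersampling of the majority class plus a simple majority vote. The sketch below uses a nearest-centroid rule as a stand-in for the five base learners, on invented toy policies, so it shows only the shape of the hybrid, not the paper's models.

```python
import random

random.seed(1)

def undersample(X, y):
    """Balance classes by randomly downsampling the majority (no-claim) class."""
    pos = [i for i, v in enumerate(y) if v == 1]
    neg = [i for i, v in enumerate(y) if v == 0]
    keep = pos + random.sample(neg, len(pos))
    return [X[i] for i in keep], [y[i] for i in keep]

def centroid_classifier(X, y):
    """Train a nearest-centroid rule; a stand-in for one base learner."""
    def centroid(label):
        pts = [x for x, v in zip(X, y) if v == label]
        return [sum(c) / len(pts) for c in zip(*pts)]
    c0, c1 = centroid(0), centroid(1)
    def predict(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return 1 if d1 < d0 else 0
    return predict

def vote(models, x):
    """Simple majority vote over an odd number of base models."""
    return 1 if sum(m(x) for m in models) * 2 > len(models) else 0

# Imbalanced toy data: 12 no-claim policies, 3 claim policies
X = [[0.1, 0.2]] * 6 + [[0.2, 0.1]] * 6 + [[0.9, 0.8], [0.8, 0.9], [0.95, 0.85]]
y = [0] * 12 + [1] * 3
models = [centroid_classifier(*undersample(X, y)) for _ in range(3)]
print(vote(models, [0.9, 0.9]), vote(models, [0.1, 0.1]))
```

Each base model sees a different balanced subsample, so the minority (claim) class is no longer drowned out, and the vote aggregates the resulting diverse decisions.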
      • Open Access Article

        40 - A new algorithm for data clustering using a combination of genetic and firefly algorithms
        Mahsa Afsardeir mansoure Afsardeir
        Introduction: With the progress of technology and the growing volume of data in databases, the demand for fast and accurate discovery and extraction of knowledge from databases has increased. Clustering is a data mining approach for analyzing and interpreting data by exploring structure through similarities and differences. One of the most widely used clustering methods is k-means. In this algorithm, cluster centers are selected randomly and each object is assigned to the cluster whose center it most resembles. The algorithm is therefore sensitive to outliers, since such data easily shifts the centers and may produce undesirable results. By using optimization methods to find the best cluster centers, its performance can be improved significantly. Combining the firefly and genetic algorithms to optimize clustering accuracy is an innovation that has not been used before.
        Method: To optimize k-means clustering, this paper introduces a hybrid of the genetic algorithm and the firefly algorithm, called the firefly-genetic algorithm.
        Findings: The proposed algorithm is evaluated on three well-known datasets: Breast Cancer, Iris, and Glass. The results show that the proposed algorithm performs better on all three datasets, and confirm that the within-cluster distance is much smaller than in the compared approaches.
        Discussion and Conclusion: The most important issue in clustering is determining the cluster centers correctly, and a variety of methods and algorithms do so with different performance. This paper proposes a new data clustering method based on the firefly metaheuristic and the genetic algorithm. Our main focus was on two determining factors: the within-cluster distance (the distance of each data point to the center of its cluster) and the distance between the centers (the maximum distance between cluster centers). In k-means, clustering is inaccurate because the centers are selected randomly; by employing the firefly and genetic algorithms, we try to obtain more accurate cluster centers and, as a result, a correct clustering.
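The idea above is that k-means quality hinges on the initial centers, and a metaheuristic searches for better ones. The sketch below substitutes the simplest possible search (many random restarts, keeping the centers with the lowest within-cluster squared distance) for the firefly/genetic hybrid; the data is a two-blob toy set, so it illustrates the objective being optimized rather than the paper's algorithm.

```python
import random

random.seed(2)

def kmeans(X, centers, iters=10):
    """Lloyd iterations from given initial centers; returns (centers, SSE)."""
    k = len(centers)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in X:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])))
            clusters[j].append(x)
        centers = [[sum(col) / len(cl) for col in zip(*cl)] if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    # Within-cluster sum of squared distances: the quantity being minimized.
    sse = sum(min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers)
              for x in X)
    return centers, sse

def search_centers(X, k, trials=20):
    """Stand-in for the metaheuristic search: keep the candidate centers
    with the lowest within-cluster distance over many random restarts."""
    best = None
    for _ in range(trials):
        centers, sse = kmeans(X, random.sample(X, k))
        if best is None or sse < best[1]:
            best = (centers, sse)
    return best

X = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9],      # cluster A
     [5.0, 5.1], [4.9, 5.0], [5.1, 4.9]]      # cluster B
centers, sse = search_centers(X, 2)
print(round(sse, 3))
```

A firefly or genetic search replaces the blind restarts with guided moves through the space of center positions, but the fitness it optimizes is this same within-cluster distance.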
      • Open Access Article

        41 - A Novel Semi-Supervised Approach for Improving efficiency of E-Learning
        FARHAD GHAREBAGHI Ali Amiri
        Introduction: Students' academic success is one of the important goals of educational environments. In recent years, the rapid proliferation of information technology in education has led to a new model called "e-learning", whose application lets the educational process transcend temporal, geographical, and political limitations. Providing effective training is one benefit of e-learning: choosing an appropriate teaching method improves learners' performance in the educational environment. In traditional education, the direct interaction between learner and teacher lets the teacher adapt the teaching method to the learner's situation; in e-learning, the absence of this interaction demands methods for personalizing education. In this paper, a data mining approach is used to determine learners' knowledge levels from what the system can observe and collect. Intelligent educational systems use a model of the learner representing his or her literacy and skill in a specific field, and use it to analyze the learner's inputs to the system during educational interactions.
        Method: This paper uses the LP-MLTSVM semi-supervised learning method, developed from the support vector machine algorithm, to improve learning quality and learner satisfaction in e-learning. The proposed model performs multi-class classification on data from the Semin e-learning center, which has about five thousand members and runs e-learning courses. All the training data are used in building the model, but not every sample needs a label: only 20% of the samples are labeled by an expert, with 75% of the labeled data used for training and 25% for testing.
        Results: To evaluate the effectiveness of the proposed system in the course held, the criteria of academic success and academic satisfaction were used. At the end of the course, each group took a comprehensive 40-question test measuring the students' academic success, in order to check the effect of the proposed groups in the system; one-way analysis of variance was used to measure the differences between the groups' results. To assess learners' satisfaction with the methods used in the first, second, and third groups, a four-question questionnaire was presented at the end of each session, with five options per question ranging from 1 (no satisfaction) to 5 (complete satisfaction with the course). The results show the success of the proposed method.
        Discussion: The learners' characteristics were investigated and suitable features were constructed to predict the class variable, the learners' knowledge level. The research data were then collected and processed with the proposed method. Comparing the proposed method with existing ones on the evaluation criteria shows that the proposed method is better. Finally, the model was used in a virtual course; the course results show the students' academic success and satisfaction with the proposed model.
      • Open Access Article

        42 - A New Approach to Extract Frequent Conceptual Links from Social Networks
        Mahboobeh Farimani saman Poorsiah Hamid Tabatabaee Hossien Salami
        A conceptual link is a new approach to describing social networks: the knowledge concealed in a social network is presented in a concise structure called a conceptual view. The main challenge in obtaining the conceptual view of a social network is extracting its frequent conceptual links, which is very time-consuming for large networks. This paper provides a new method for extracting frequent conceptual links from social networks that uses the concept of dependency to accelerate the extraction process. The proposed method is able to speed up this process whenever dependencies exist between the data.
      • Open Access Article

        43 - Presenting a Model for Financial Reporting Fraud Detection using Genetic Algorithm
        Mahmood Mohammadi Shohreh Yazdani Mohammadhamed Khanmohammadi
      • Open Access Article

        44 - A Combined Model for Prediction of Financial Software Learning Rate based on the Accounting Students’ Characteristics
        Bahareh Banitalebi Dehkordi Hamed Samarghandi Sara Hosseinzadeh Kassani Hamidreza malekhossini
      • Open Access Article

        45 - Computing the Efficiency of Bank Branches with Financial Indexes, an Application of Data Envelopment Analysis (DEA) and Big Data
        Fahimeh Jabbari-Moghadam Farhad Hosseinzadeh Lotfi Mohsen Rostamy-Malkhalifeh Masoud Sanei Bijan Rahmani-Parchkolaei
        In traditional Data Envelopment Analysis (DEA) techniques, a specific, individual DEA model is built and solved for each decision-making unit (DMU) in order to calculate its efficiency or performance score. When the number of DMUs is immense, the increased complexity means that conventional methods for computing efficiency, ranking, and so on may no longer be economical. The key objective of the proposed algorithm is to segregate the efficient units from the other units. To this end, effective indexes were created with the help of DEA concepts and knowledge of the type of business under study, and the relatively operative indexes were identified. Subsequently, with the help of a clustering technique and the concept of dominance, the efficient units were separated from the inefficient ones, and a DEA model was built from the aggregate of the efficient units. By eliminating the inefficient units, the number of units involved in constructing the DEA model diminished; as a result, the computation of the scores of the efficient units became faster. The algorithm was implemented to measure the various branches of one of the commercial banks of Iran with financial indexes, showing that it has the capacity to scale to big data.
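The dominance concept mentioned above can be sketched directly: a unit is dominated when another unit consumes no more of every input while producing no less of every output, so dominated units can be dropped before building the DEA model. A minimal illustration with hypothetical input/output vectors, not the bank's data:

```python
def dominates(a, b):
    """Unit a dominates unit b if a uses <= inputs and yields >= outputs,
    with at least one strict inequality."""
    ax, ay = a  # (inputs, outputs) of unit a
    bx, by = b
    no_worse = all(i <= j for i, j in zip(ax, bx)) and \
               all(i >= j for i, j in zip(ay, by))
    strictly = any(i < j for i, j in zip(ax, bx)) or \
               any(i > j for i, j in zip(ay, by))
    return no_worse and strictly

def non_dominated(units):
    """Keep only units not dominated by any other unit (candidate efficient set)."""
    return [u for u in units
            if not any(dominates(v, u) for v in units if v is not u)]

# Hypothetical DMUs: (inputs, outputs)
dmus = [((2, 3), (10,)),   # A
        ((2, 3), (8,)),    # B: dominated by A
        ((4, 1), (9,)),    # C
        ((5, 4), (7,))]    # D: dominated by A and C
survivors = non_dominated(dmus)
```

Only the surviving units need to enter the (much smaller) DEA model, which is what speeds up the score computation.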
      • Open Access Article

        46 - Mining a Set of Rules for Determining the Waiting Time for Selling Residential Units
        Farshid Abdi Shaghayegh Abolmakarem
      • Open Access Article

        47 - Identification of factors influencing the implementation of smart city plans based on a technical and social systems approach
        Ali Safarzadeh Ghasemali Bazaei Mehdi Faghihi
      • Open Access Article

        48 - Parallel Machine Scheduling with Controllable Processing Time Considering Energy Cost and Machine Failure Prediction
        Yousef Rabbani Ali Qorbani Reza Kamran Rad
      • Open Access Article

        49 - Explaining the Support Vector Machine and Neural Network Classifiers for Ranking Bank Branches
        Davod Khosroanjom Mohamamd Elyasi Behzad Keshanchi Bahare Boobanian Shovana Abdollahi
        There is a great deal of information in the banking industry, and identifying it is of particular importance. The use of data mining techniques not only improves quality but also leads to competitive advantages and better market positioning. By using data mining to analyze patterns and trends, banks can accurately rank their branches. In this paper, support vector machine classifiers and a multi-layer perceptron neural network were applied to real data from the branches of one of the large commercial banks (1825 selected branches, 57 features). The evaluation results showed that the support vector machine classifier has lower efficiency for the proposed method, whereas the neural network combined with PCA scored highly on the performance criteria; the efficiency and accuracy values obtained with the neural network were very high.
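As a sketch of the PCA half of the PCA-plus-neural-network pipeline described above, here is the first principal component of two correlated branch features, computed in closed form for the 2x2 covariance case; the feature values are invented, and the actual study used 57 features with an MLP on top:

```python
import math

def pca_first_component_2d(data):
    """First principal component of 2-D data via the closed-form
    eigen-decomposition of the 2x2 covariance matrix."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    # Sample covariance matrix [[a, b], [b, c]]
    a = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    c = sum((y - my) ** 2 for _, y in data) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a          # eigenvector for lam
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

# Hypothetical branch features: (deposits, loans), strongly correlated
branches = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
v = pca_first_component_2d(branches)
# Project each branch onto the component to get a one-dimensional score
scores = [x * v[0] + y * v[1] for x, y in branches]
```

Ranking branches by `scores` orders them along the direction of greatest joint variation; a classifier such as an MLP can then consume the reduced representation.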
      • Open Access Article

        50 - Identifying the influencing factors in customer churn of Kurdistan Telecommunications Company and presenting models for predicting churn using machine learning algorithms
        Vida Sadeghi Anvar Bahrampour Seyed Ali Hosseini
        Customers are among the main sources of income and assets for any organization. With this view, companies have begun to do more to retain them. Since in many companies the cost of acquiring a new customer is much higher than that of retaining an existing one, customer churn has become a key area of evaluation for these companies. Client-facing companies, including those active in the technology industry, face a major challenge from customer attrition. With the rapid development of the telecommunications industry, churn prediction has become one of the main activities for gaining a competitive advantage in the market. Predicting customer churn gives operators a window of time to remediate and implement preventative measures before customers migrate to other operators. In this research, a decision support system for predicting and estimating the churn of customers of Kurdistan Telecommunication Company (with 52,900 subscribers) is presented, using various data mining and machine learning methods, including simple linear regression (SLR), multiple linear regression (MLR), polynomial regression (PR), logistic regression, artificial neural networks, AdaBoost, and random forest. The evaluations carried out on the data set of the Kurdistan Province Telecommunication Company show the high performance of the artificial neural network with 99.9% accuracy, AdaBoost with 99.9% accuracy, and random forest with 100% accuracy.
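Logistic regression, one of the methods listed above, can be sketched in a few lines of plain gradient descent. The single feature (months since last recharge) and the churn labels below are invented, not the company's data:

```python
import math

def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit a one-feature logistic-regression churn model by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted churn probability
            gw += (p - y) * x
            gb += (p - y)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# Hypothetical feature: months since last recharge; label: 1 = churned
months = [0.1, 0.3, 0.2, 0.5, 2.5, 3.0, 2.8, 3.5]
churned = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(months, churned)
predict = lambda x: 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5
preds = [int(predict(x)) for x in months]
```

The predicted probability, not just the 0/1 label, is what lets an operator rank subscribers by churn risk and target retention measures.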
      • Open Access Article

        51 - Presenting a New Approach to Improve Effectiveness and Select the Most Critical Equipment Using Data Mining, Fuzzy DEMATEL, FMEA and FTA Approaches
        Mohammad Ehsanifar Nima Hamta Parisa Bolhasani
        Attention to reliability and maintenance management concepts has increased significantly over recent decades. This paper proposes an integration of data mining, fuzzy DEMATEL, and FMEA techniques to improve the reliability and effectiveness of maintenance management in Shazand Petrochemical Company. First, the most critical cluster is selected from the final clusters produced by a data mining technique. Second, the fuzzy DEMATEL technique is used to identify the most critical and most influential equipment of that cluster under fuzzy conditions. Finally, the FMEA and FTA techniques are applied to identify the risk priority numbers and the main causes of failure, after which solutions are proposed to solve the problems and improve the system.
      • Open Access Article

        52 - A New Method for Ranking the Discovered Rules Obtained from Data Mining Using Data Envelopment Analysis
        Hossein Azizi
        Data mining techniques, i.e. the extraction of patterns from large databases, are extensively used in business. Many rules may be obtained by these techniques, and only a few of them can be considered for implementation due to limited budgets and resources. Evaluating and ranking the attractiveness and usefulness of association rules is therefore of paramount importance in data mining. In earlier studies on identifying subjectively interesting association rules, most methods required writing information or asking users to explicitly differentiate interesting rules from uninteresting ones. These methods involve detailed calculations and may even lead to inconsistent conclusions. To solve these problems, this article proposes applying the double-frontier Data Envelopment Analysis (DEA) approach to select the most effective association rule. In this approach, the worst relative efficiency of each association rule is considered in addition to its best relative efficiency. Compared with traditional DEA, the double-frontier DEA approach can identify the most efficient association rule correctly and easily. As an advantage, the proposed approach is computationally more efficient than earlier work on this problem. The applicability of our DEA-based method for measuring the efficiency of association rules against multiple criteria is shown using an example of market basket analysis.
      • Open Access Article

        53 - Provide a Data Envelopment Analysis/Data Mining Integrated Model for Evaluation of Decision-making Units
        Alireza Alinezhad Javad Khalili
        Efficiency is an important issue for the managers of companies and organizations, as well as for customers interested in their services. The aim of this research is to study the efficiency of pharmaceutical companies listed on the stock exchange using Data Envelopment Analysis (DEA) and then to derive rules using a decision tree. The Malmquist index partly resolves the problem of insufficient observations by enabling the combination of time-series and cross-sectional observations; it operates on the basis of a moving average and is useful for tracking a unit's performance trend over time. In this research, given the inputs and outputs and using the Malmquist index, the efficiency of 22 pharmaceutical companies listed on the stock exchange was evaluated under constant returns to scale during 2012-2016, and the results were used as class labels of the Decision-Making Units (DMUs), which in turn are the inputs of the decision tree method. Finally, the rules implicit in the data were extracted using the decision tree.
      • Open Access Article

        54 - Designing an Intelligent Intrusion Detection System in the Electronic Banking Industry Using Fuzzy Logic
        Adel Jahanbani
      • Open Access Article

        55 - Comparison of information transfer delay in standard Apriori algorithm and improved Apriori algorithm
        Hooman Bavarsad Salehpour Seyed Hamid Seyed Javadi Parvaneh Asghari Mohammad Ebrahim Shiri Ahmad Abadi
      • Open Access Article

        56 - Detecting Source-based Fake News via the Word2vec Algorithm
        Hamid Sharifi Heris Jafar Sheykhzadeh
      • Open Access Article

        57 - Reliability Measurements in Depression Detection Using a Data Mining Approach Based on Fuzzy-Genetics
        Mohammad Nadjafi Sepideh Jenabi Adel Najafi Ghasem Kahe
      • Open Access Article

        58 - A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows
        Zahra Malmir Mohammad Hossein Rezvani
      • Open Access Article

        59 - Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm
        Rasool Azimi Hedieh Sajedi
      • Open Access Article

        60 - Presenting a data mining model based on sustainable development indexes in the urban management of the Tehran metropolis affected by the Covid-19 epidemic
        Abbas Maleki sadegh abedi Alireza Irajpoor
        With the restrictions imposed during the Covid-19 pandemic, changes in the concentrations of the pollutants CO, O3, NO, NO2, SO2, PM2.5, and PM10, and in the AQI, can be seen between the periods before and after the epidemic. The changes in air pollutants and traffic restrictions are therefore investigated as a sub-category of the environmental indicators of sustainable urban development over the period 2018/01/21 to 2022/03/20 at the stations supervised by the city of Tehran. First, the data are collected, processed, and cleaned. Machine learning methods including decision tree, random forest, support vector machine, Bayesian network, and perceptron neural network are then applied, with effective features selected using particle swarm optimization. The investigations showed that the prediction models using decision tree and random forest had the best performance on both the precision and recall criteria. The results showed that the concentration of pollutants during the Covid-19 period, compared to before, increased at some stations and decreased at others, and that applying traffic restrictions during the epidemic did not have a significant, noticeable effect in reducing the concentration of air pollutants. Examining the trend of deaths during the epidemic also showed that the decrease or increase of pollutants has no significant relationship with the trend of deaths caused by Covid-19.
      • Open Access Article

        61 - Presenting a model for predicting tax evasion of guilds based on data mining techniques
        Mohammad Ghasemi Sadegh Abedi Ali Mohtashami
        In this research, considering the importance of the topic and the deficiencies of previous research, a model for predicting the tax evasion of guilds based on data mining techniques is presented. The analyzed data comprise 5600 tax files of all guilds holding tax codes in Qazvin province during the years 2014-2019. The guild tax files fall into five groups: notary public offices; real estate agencies; catering halls, restaurants, and related businesses; communication services; and exhibitions, auto accessories stores, and related businesses. For modeling, a classification model based on the decision tree algorithm was used. The results indicate a coverage criterion of 68% and a Kappa criterion of 0.612, which indicates the good performance of the model. In addition, cross-validation was used to test the validity of the prediction model and obtain a more reliable estimate of modeling performance; an accuracy of 67.79% shows appropriate reliability for the prediction model. The results of this research can be utilized in formulating operational, data-mining-based strategies to predict the tax evasion of guilds in the provinces.
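The Kappa criterion of 0.612 cited above measures agreement between predicted and actual classes beyond chance. A minimal computation of Cohen's kappa from a confusion matrix (the 2-class matrix here is invented, not the study's five-group guild data):

```python
def cohens_kappa(cm):
    """Cohen's kappa from a square confusion matrix cm[actual][predicted]."""
    n = sum(sum(row) for row in cm)
    po = sum(cm[i][i] for i in range(len(cm))) / n        # observed agreement
    pe = sum(sum(cm[i]) * sum(r[i] for r in cm)           # chance agreement
             for i in range(len(cm))) / (n * n)
    return (po - pe) / (1 - pe)

# Hypothetical 2-class confusion matrix: rows = actual, columns = predicted
cm = [[45, 5],
      [10, 40]]
kappa = cohens_kappa(cm)
```

Here the raw accuracy is 0.85, but kappa discounts the 0.5 agreement expected by chance, giving a more honest score of 0.7; this is why the study reports kappa alongside accuracy.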
      • Open Access Article

        62 - A hybrid model of network data envelopment analysis and data mining to predict efficiency in the green supply chain of the poultry industry
        Tahereh Torkashvand Fatemeh Saghafi Mohammad Hossein Darvish Motevali nazanin pilevari
        Analyzing and predicting efficiency in industries is very important for evaluating the performance of units and planning to improve it. The poultry industry, as one of the strategic and complex industries, accounts for an important part of society's food basket, and from this point of view the analysis of its supply network is very important. In this research, the network data envelopment analysis model and data mining are used to cover research gaps in measuring and predicting the efficiency of the green supply chain in the poultry industry. In terms of purpose this research is applied, and in terms of the nature of its implementation it is descriptive and survey-based, carried out by developing mathematical models in the field of data envelopment analysis. First, using the Delphi approach, the effective indicators were identified and screened based on expert opinion, and then a new mathematical model was presented based on network data envelopment analysis. Next, the efficiency of 9 chains over five years was evaluated using the alpha-cut method based on fuzzy data envelopment analysis. The results showed that the presented model is able to evaluate the efficiency of the green, multi-level supply chain of the poultry industry in consecutive years.
      • Open Access Article

        63 - Assessing Credit Risk in the Banking System Using Data Mining Techniques
        Nima Hamta Mohammad Ehsanifar Bahareh Mohammadi
        Credit risk is the risk of default on a debt that may arise from a borrower failing to make required payments. The objective of this paper is to identify the factors that affect credit risk and to present a model for credit risk prediction and credit ranking of the legal (corporate) customers applying for facilities of Sepah Bank in Dezfool city; clustering, neural network, and support vector machine methods were used in the current study. The necessary investigations were performed on financial and non-financial data from a simple random sample of 200 legal customers who had applied for bank facilities. In this paper, 27 descriptive variables, including financial and non-financial variables, were investigated, and 8 variables effective on credit risk were finally selected based on the judgment of bank experts. The collected data were partitioned by a clustering method into groups (clusters) such that the data within one cluster were more similar to each other than to points in other clusters. The selected variables were then fed as the input vector to a three-layer perceptron neural network, and finally a support vector machine model was presented in order to predict the financial behavior of the bank's legal customers. The obtained results indicate that the neural network model is more efficient than the support vector machine in predicting the credit risk and credit ranking of legal customers.
      • Open Access Article

        64 - Integrating AHP and data mining for effective retailer segmentation based on retailer lifetime value
        Amin Parvaneh Hossein Abbasimehr Mohammad Jafar Tarokh
      • Open Access Article

        65 - Using the Hybrid Model for Credit Scoring (Case Study: Credit Clients of microloans, Bank Refah-Kargeran of Zanjan, Iran)
        Abdollah Nazari Mohammadreza Mehregan Reza Tehrani
      • Open Access Article

        66 - A Data Mining approach for forecasting failure root causes: A case study in an Automated Teller Machine (ATM) manufacturing company
        Seyedehpardis Bagherighadikolaei Rouzbeh Ghousi Abdolrahman Haeri
      • Open Access Article

        67 - Diabetes detection via machine learning using four implemented spanning tree algorithms
        Yas Ghiasi Mehdi Seif Barghy Davar Pishva
        This paper considers an accurate and efficient diabetes detection scheme via machine learning, using data mining and pattern matching in its diagnosis process. It implements and evaluates four machine learning classification algorithms, namely decision tree, random forest, XGBoost, and LGBM, and then selects and introduces the best-performing one using multi-criteria decision-making methods. The results reveal that the random forest algorithm outperformed the other algorithms with higher accuracy. The paper also examines the features that have the greatest effect on diabetes detection. Considering that diabetes is one of the most deadly, disabling, and costly diseases observed today, that its rate is increasing alarmingly, and that its diagnosis is difficult because of many vague signs and symptoms, such an approach can help doctors increase the accuracy of their diagnosis and treatment schemes. This paper therefore uses data mining as a tool to gather and analyze existing data on diabetes and to assist doctors in the diagnosis and treatment process. Its main contribution is its applied nature in an essential field and the accuracy of its pattern recognition across several analytical approaches.
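All four tree-based classifiers compared above grow by repeatedly choosing the split that most reduces class impurity. A minimal sketch of that core step, finding the best threshold on one feature by Gini impurity (the glucose readings and labels are invented):

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Threshold on a single feature minimising weighted Gini impurity."""
    n = len(values)
    best_t, best_g = None, float("inf")
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        g = (len(left) * gini(left) + len(right) * gini(right)) / n
        if g < best_g:
            best_t, best_g = t, g
    return best_t, best_g

# Hypothetical glucose readings and diabetes labels (1 = diabetic)
glucose = [85, 90, 100, 110, 150, 160, 170, 180]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, impurity = best_split(glucose, labels)
```

A full decision tree applies this search recursively to each resulting partition; random forest, XGBoost, and LGBM build ensembles of such trees.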
      • Open Access Article

        68 - Application of Rough Set Theory in Data Mining for Decision Support Systems (DSSs)
        Mohammad Hossein Fazel Zarandi Abolfazl Kazemi
      • Open Access Article

        69 - A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining
        Abolfazl Kazemi Elahe Mehrzadegan
      • Open Access Article

        70 - Modelling Customer Attraction Prediction in Customer Relation Management using Decision Tree: A Data Mining Approach
        Abolfazl Kazemi Mohammad Esmaeil Babaei
      • Open Access Article

        71 - Investigating the Productivity of Process Model of Social Work Services and Empowerment Using Data Mining Techniques
        Mehrdad Mohammadzadeh Alamdary Mansour Esmaeilpour Alireza Slambolchi Farhad Soleimanian Gharehchigh
        The pervasive and comprehensive process of social work and empowerment in supportive institutions such as the Imam Khomeini Relief Committee commences with the identification, acceptance, and provision of supportive social work for the deprived and destitute and, through purposeful, timely training and empowerment, helps the clients develop the ability to earn a living. It is therefore vital to explore the productivity of this process by employing innovative techniques that can rectify and reform current processes, do more social justice to the impoverished, and help the empowered withdraw from the support cycle and stand on their own feet. The researchers in the present study were thus concerned with whether the process model of social work delivery and empowerment in the Imam Khomeini Relief Committee was sufficiently productive to achieve the organizational goals or needed to be redesigned. To this end, a questionnaire was administered to collect data on the extent to which the goals of the model were achieved after its implementation, with regard to identification, guidance, acceptance, needs analysis, prioritization, and the provision of general and specific services for needy clients in line with the overall empowerment approach. The data were then analyzed incrementally across the stages of the CRISP-DM data mining process. Finally, compiling the data mining results, the predictive accuracy of the model stood at 0.86 with rough set theory, 0.80 with decision trees, 0.70 with Bayesian methods, and 0.85 with artificial neural networks. The proposed model was hence shown to have the desired productivity for achieving organizational goals. The findings may be employed by prospective researchers to redesign and optimize productivity models in other non-governmental organizations in line with organizational goals.
      • Open Access Article

        72 - A New Clustering Algorithm for Productivity in Data Mining: The Case of UCI Data
        Jhila Nasiri Farzin Modarres Khiyabani Nima Azorbaarmir Shotorbani
        Methods of clustering in data mining have developed dramatically in recent years as a result of the crucial need to categorize data, leading to the expansion of data mining techniques and the enhanced productivity of clustering methods in management and decision making. The whale optimization algorithm is a new stochastic global optimization method employed to solve various problems. We previously presented a data clustering method based on the whale optimization algorithm in which the initial solutions are randomly selected. What has made the K-means algorithm a highly popular clustering approach appealing to many researchers is the simplicity and brevity of its stages. The present enquiry aims to employ the K-means algorithm to improve the capability of whale optimization clustering, proposing the hybrid KWOA algorithm, which can find more accurate clusters. The computational results of running the newly proposed algorithm, along with some well-known clustering algorithms, on real data sets from a well-known machine learning repository underscore the promising performance of the proposed algorithm in terms of the quality and standard deviation of the final solutions.
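Since KWOA seeds the whale optimizer with K-means solutions, the K-means half can be sketched on its own. A minimal 1-D Lloyd's iteration on invented data; in the paper's hybrid, such centroids would then be refined by the whale optimization algorithm:

```python
def kmeans_1d(points, centroids, iters=20):
    """Plain Lloyd's algorithm on 1-D data: assign, then recompute means."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to the nearest centroid
            i = min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
            clusters[i].append(p)
        # Recompute each centroid as the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]   # two obvious groups
centers = sorted(kmeans_1d(data, [0.0, 5.0]))
```

K-means converges quickly but only to a local optimum that depends on the initial centroids, which is exactly the weakness a stochastic global optimizer such as the whale algorithm is meant to compensate for.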
      • Open Access Article

        73 - Saderat Bank customer classification using decision tree Based on customer value
        Hosein Beyorani Mahram Azimi
        Customer satisfaction has gained importance in today's business environment. Many companies focus on customer value in order to increase profits and customer satisfaction. Customer Relationship Management (CRM) has emerged as a principal tool for competing companies to enhance customer relationships. Successful customer relationship management starts with identifying customer value, because customer value provides important information for development and management. Techniques such as data mining have driven the development of customer relationship management in new competitive areas, so that companies can remain profitable in business competition. Through data mining, the discovery of hidden knowledge in databases, organizations can identify valuable customers, predict their future behavior, and make useful, knowledge-based decisions. The purpose of this research is to obtain the features effective in selecting valuable customers, so that customers can be classified, based on demographic characteristics and transaction-related variables, into very low profit, low profit, high profit, and very high profit classes. This study examines the influence of demographic characteristics, including age, education, and job level, as well as branch grade, branch location, and number of transactions, on customer value. The dependent variable is customer value, classified into four categories. The statistical population is all customers with an active checking account at Bank Saderat Iran in the city of Tabriz. The CHAID decision tree data mining algorithm was used for the case study. The results showed that the customer's age and education level and the bank branch have no significant effect on customer value, and that the number of the customer's transactions with the bank is the most effective variable in identifying the customer's class.
      • Open Access Article

        74 - Using Fuzzy C-means to Discover Concept-drift Patterns for Membership Functions
        Tzung-Pei Hong Chun-Hao Chen Yan-Kang Li Min-Thai Wu
        People often change their minds at different times and in different places. Indicating concept-drift patterns in unexpected shopping behaviours is important and valuable for commercial applications. Research on concept drift has been growing in recent years, and many algorithms have dealt with concept-drift information and detected new market trends. This paper proposes an approach based on fuzzy c-means (FCM) to mine the concept drift of fuzzy membership functions. The proposed algorithm has two stages. In the first stage, individual fuzzy membership functions are generated from different training databases by the proposed FCM-based approach. In the second stage, the algorithm mines the concept-drift patterns from the sets of fuzzy membership functions. Experiments on simulated datasets were also conducted to show the effectiveness of the approach.
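The FCM machinery behind the first stage assigns each point a graded membership in every cluster based on its distances to the cluster centres. A single membership computation with fuzzifier m = 2 and invented 1-D data:

```python
def fcm_memberships(points, centers, m=2.0):
    """Fuzzy c-means membership matrix u[i][k] of point i in cluster k."""
    u = []
    for p in points:
        dists = [abs(p - c) for c in centers]
        if 0.0 in dists:
            # Point coincides with a centre: full membership there
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
        else:
            # Standard FCM update: u_k = 1 / sum_j (d_k / d_j)^(2/(m-1))
            row = [1.0 / sum((dk / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                   for dk in dists]
        u.append(row)
    return u

u = fcm_memberships([0.0, 2.0, 10.0], centers=[0.0, 10.0])
```

Each row sums to 1, so a point near one centre still carries a small membership in the other cluster; it is these graded membership functions, re-estimated on successive training databases, that the second stage compares to detect drift.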
      • Open Access Article

        75 - Forecasting of the Tehran Stock Exchange Index Using a Data Mining Approach Based on Artificial Intelligence Algorithms
        Mohammad Mahmoodi Akbar Ghasemi
      • Open Access Article

        76 - Sport Result Prediction Using Classification Methods
        Arash Mazidi Mehdi Golsorkhtabaramiri Naznoosh Etminan
      • Open Access Article

        77 - Identification and Clustering of Outsourcing Risks of Aviation Part-Manufacturing Projects in the Aviation Industries Organization Using the K-means Method
        Alireza Abbasi Mehrdad Nikbakht
      • Open Access Article

        78 - Developing a model for predicting student performance on centralized test Based on Data Mining
        Mostafa Yousefi Tezerjan Esrafil Ala Maryam Mollabagher
        The aim of this study is to provide a model for predicting the scores of University of Applied Science & Technology students in the centralized exams of coming semesters. For this purpose, 19,207 student-course grades were studied across 8 courses, 6 provinces, and 28 educational centers, at the associate and bachelor levels, held concurrently across the country in the second semester of the 1397-98 academic year, and the most effective features were selected using a feature selection method. To clarify the relationships between the selected features, a decision tree model with the C5.0 algorithm was built in SPSS Modeler using 10 effective indicators, yielding a model for predicting students' next-semester scores in the courses approved for the centralized exam. This predictive model can make the learning process in the academic system more efficient. The results include suggestions for modifying the test process, finding out-of-pattern students, centers, and conditions for further monitoring, and identifying centers whose students' average GPAs were high but who performed poorly on the centralized test.
      • Open Access Article

        79 - Eco-Efficiency Evaluation in a Two-Stage Network Structure: A Case Study of Cement Companies
        Mirpouya Mirmozaffari
      • Open Access Article

        80 - Improving Automatic Clustering by Employing Multi-Objective Metaheuristic Algorithms, with a New Evaluation Criterion and an Application to Credit Risk
        Majid Mohammadi Rad Mehdi Afzali
      • Open Access Article

        81 - Improving the Accuracy of the Hargreaves Method in Estimating Reference Evapotranspiration via an Adjustment Weight, with the Help of Artificial Neural Networks and Decision Trees
        Omid Mohtarami Mohammad Reza Hosseini Ruhollah Fattahi Teymour Sohrabi
        One of the most important components of the hydrological cycle is evapotranspiration, which plays an important role in water resource management. In the present study, the accuracy of evapotranspiration estimation by the Hargreaves method was improved with a correction factor K, modeled using an artificial neural network and the M5 decision-tree model. This coefficient is the ratio of evapotranspiration from the Penman-Monteith model to that of the Hargreaves method. The data used are the maximum and minimum temperatures and relative humidity for the period 2004-2013 from the Farokhshahr and ShahrKord airport stations, in a cold and arid region. The neural network is a feedforward network trained with the Levenberg-Marquardt algorithm, with a tangent-sigmoid function in the hidden layers; the decision-tree model was built with the WEKA software. The results show that both models perform well, but the neural network estimates the correction factor more accurately. Before applying the correction factor, the Hargreaves method had RMSE = 0.90 (root mean square error) relative to Penman-Monteith; this value fell to RMSE = 0.69 with the neural-network correction factor and to RMSE = 0.72 with the decision-tree correction factor. The results confirm that the correction factor improves the performance of the Hargreaves method.
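The comparison above reduces to computing RMSE against the Penman-Monteith reference before and after applying a correction factor K. A minimal sketch of that bookkeeping, with hypothetical daily evapotranspiration values rather than the study's station data (in the study itself K is predicted from temperature and humidity by the ANN or M5 tree):

```python
import math

def rmse(estimates, reference):
    """Root mean square error between an estimate series and a reference series."""
    assert len(estimates) == len(reference)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, reference)) / len(estimates))

# Hypothetical daily reference evapotranspiration (mm/day):
penman_monteith = [4.1, 4.5, 5.0, 5.2, 4.8]   # treated as ground truth
hargreaves      = [4.9, 5.3, 5.9, 6.1, 5.6]   # uncorrected Hargreaves estimates

# The correction factor K is the ratio of Penman-Monteith to Hargreaves ET.
k = [p / h for p, h in zip(penman_monteith, hargreaves)]
mean_k = sum(k) / len(k)
corrected = [h * mean_k for h in hargreaves]

# Applying the factor shrinks the error against the reference:
print(rmse(hargreaves, penman_monteith) > rmse(corrected, penman_monteith))  # True
```

Here a single mean K is used for illustration; the paper's point is that a data-driven model of K (ANN or M5) reduces the RMSE further than a constant ratio would.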
      • Open Access Article

        82 - Discovering the Knowledge Governing Customers' Demographic Characteristics in Selecting Banks Using Association Rules in Data Mining
        Naser Ghabouli Alireza Bafandeh Zendeh Samad Aali
        The purpose of the present study is to discover the knowledge governing the demographic characteristics of customers in choosing banks, using association rules in data mining. Effective decision-making and learning in a growing, complex world requires mechanisms for understanding the structures of complex systems, acquiring mass data, and generating knowledge to support decisions. Most businesses identify their key customers through a variety of demographic characteristics and target consumers with similar features in their marketing; targeting consumers with similar demographic characteristics helps maximize sales and profitability. Banks, as essential elements of a country's economy, are no exception. Data mining addresses this problem by providing methods and software for automating analytics and discovering patterns in large, complex data sets. This research was conducted according to the CRISP-DM standard, and data were collected by questionnaire. The results were converted into a database of ninety records, from which association rules for each bank were extracted using SPSS Modeler. The extracted rules show how changes in some variables affect other factors and, ultimately, the achievement of goals.
      • Open Access Article

        83 - Feature Selection and Clustering by Multi-Objective Optimization
        Seyedeh Mohtaram Daryabari Farhad Ramezani
      • Open Access Article

        84 - SRV: A Striking Model Based on a Meta-Classifier for Improving the Diagnosis of Type 2 Diabetes
        Razieh Asgarnezhad
      • Open Access Article

        85 - NSE: An effective model for investigating the role of pre-processing using ensembles in sentiment classification
        Razieh Asgarnezhad Amirhassan Monadjemi
      • Open Access Article

        86 - A Review of Different Data Mining Techniques in Customer Segmentation
        Tannane Parsa Kord Asiabi Reza Tavoli
      • Open Access Article

        87 - Double Clustering Method in Hiding Association Rules
        Zahra Kiani Abari Mohammad Naderi Dehkordi
      • Open Access Article

        88 - Proposing a Method to Classify Texts Using Data Mining
        Mohammad Rostami Seyed Saeed Ayat Iman Attarzadeh Farid Saghari
      • Open Access Article

        89 - Customer Behavior Mining Framework (CBMF) using clustering and classification techniques
        Farshid Abdi Shaghayegh Abolmakarem
      • Open Access Article

        90 - Application of Kansei Engineering and Data Mining in Thai Ceramic Manufacturing
        Chaiwat Kittidecha Koichi Yamada
      • Open Access Article

        91 - Development of an evolutionary fuzzy expert system for estimating future behavior of stock price
        Azam Goodarzi Amirhossein Amiri Shervin Asadzadeh Farhad Mehmanpazir Shahrokh Asadi
      • Open Access Article

        92 - Customer lifetime value model in an online toy store
        B Nikkhahan A Habibi Badrabadi M.J Tarokh
      • Open Access Article

        93 - Reviewing the websites of Tehran Municipality and providing appropriate data mining solutions
        Shaysteh Shojaei Karizaki Sudabeh Shapoori Hajar Zarei
        Objective: The main purpose of this study is to identify and analyze the different types of data on the websites of Tehran Municipality and to provide appropriate data mining solutions. Method: This research is fundamental and analytical in nature. Data collection was field-based: 47 sites were selected from among the 220 domains of Tehran Municipality, data mining techniques were used for the analysis, and the data source was web analytics collected with Google Analytics. Results: The accuracy of the normal neural network algorithm is 99.25% and its RMS error is 0.159. The accuracy of the decision-tree algorithm is 99.80%, with an MSI criterion of 0.003 and an RMS error of 0.045. The accuracy of the CNN algorithm is 99.81%, with an RMS error of 0.035. Conclusion: Based on the findings, the DBSCAN method is comparable to the other basic methods for analyzing the data of Tehran Municipality websites and achieves higher accuracy than the other methods.
      • Open Access Article

        94 - Identifying the Thematic Relationships between the Resources Used By the Users of the Regional Science and Technology Information Center Using the Text Mining Technique
        Khojasteh Shabani Asefe Asemi
        Objective: The main purpose of the present research was to investigate the thematic relationships among the topics of resources used by RICeST users, using text mining techniques. The aim was to reflect the thematic relationships in users' information resources at RICeST in order to improve access to required materials through an understanding of the behavior and needs of users and clients. Methodology: The research method was based on text mining, i.e., data mining applied to text, and on text analysis to extract quality information. The statistical population comprised the full text of articles in scientific-research and scientific-promotional journals, proceedings of scientific conferences, and English and Persian books; all data obtained from RICeST reports were examined using the census method. Data and text analysis was done with the Vianet software, and Python was used to clean and normalize the data. Results: To determine the topics most used by RICeST users, 21 frequent words were identified (each used more than 2,000 times in the RICeST database in the two-year interval 2018/02/08 - 2020/02/08). Conclusion: The findings on the use of electronic resources in information databases, and foresight about the future of this category of resources, are useful to the managers of information centers and their users.
      • Open Access Article

        95 - Determining the Most Important Quantitative and Qualitative Features of the Genus Rubus L. in Iran Using Feature Selection and Classification Algorithms
        Mohammad Javad Sheikhzadeh
        The genus Rubus L. (Rosaceae, Rosoideae) includes some 750 species, distributed from low-tropical to semi-polar regions. Eight species and five hybrid varieties have been reported in the flora of Iran. Rubus is one of the most challenging genera among flowering plants: polyploidy, apomixis, and hybridization make identification based on morphological characters difficult. Collecting quantitative and qualitative data in plant studies is very time-consuming and costly, so much research has sought reliable and economical alternatives; data mining has been applied for many such purposes, e.g., bio-data analysis. In the current paper, combinations of feature-selection and classification algorithms were used to recognize the distinctive features of the genus Rubus L. Using the Random Forest classifier with the InfoGainAttributeEval feature-selection model, classification accuracy reached 94.05% with 28 attributes, the best algorithm in terms of accuracy; with the MLP classifier and the SymetricalAttributeEval feature-selection model, 84.32% accuracy was obtained with only four attributes, the fewest selected by any algorithm. Those four attributes were selected by most of the algorithms used in this paper. All four are qualitative, requiring no laboratory measurement costs to obtain, so they can serve as a suitable basis for an identification key.
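The InfoGainAttributeEval step mentioned above ranks attributes by information gain: the reduction in class-label entropy after splitting on the attribute. A minimal sketch of that criterion on made-up morphological data (not the paper's Rubus measurements, and not WEKA itself):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a class-label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    """Information gain of a categorical feature with respect to the class labels."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Toy data: two hypothetical characters for six specimens of two species.
leaf_margin   = ["serrate", "serrate", "entire", "entire", "serrate", "entire"]
stem_prickles = ["yes", "no", "yes", "no", "yes", "no"]
species       = ["A", "A", "B", "B", "A", "B"]

scores = {"leaf_margin": info_gain(leaf_margin, species),
          "stem_prickles": info_gain(stem_prickles, species)}
best = max(scores, key=scores.get)
print(best)  # "leaf_margin" -- it perfectly separates the toy classes
```

A feature-selection pass keeps the top-ranked attributes before handing the reduced table to a classifier such as Random Forest or an MLP.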
      • Open Access Article

        96 - Application of Satellite Data and Data Mining Algorithms in Estimating Coverage Percent (Case study: Nadoushan Rangelands, Ardakan Plain, Yazd, Iran)
        Zinab Mirshekari Majid Sadeghinia Saeideh Kalantari Maryam Asadi
      • Open Access Article

        97 - Presenting a Customer Classification Pattern with a Combined Data Mining Approach (Case Study: Hygienic and Cosmetic Products Industry)
        Omid Bashardoust Ezzatollah Asgharizadeh MohammadAli AfsharKazemi
        Given the accumulated volume of customer purchasing information and the complexity of competition in the present era, building a platform for analyzing up-to-date and accurate customer data, with the aim of creating effective relationships with current and loyal customers, has become a competitive advantage for organizations more than ever. The purpose of this study was to investigate the purchasing behavior of customers of hygienic products in order to classify them based on the WRFM (weighted recency, frequency, monetary) model using data mining methods. 65,534 samples were collected from the company's databases for the period 1396-1397 among customers in Tehran province, using purposive convenience sampling. With the help of SPSS, the WRFM weights were determined according to the opinion of industry experts; this field was then added to the other fields in the research, and customer clustering was performed in Clementine on 70% of the data. To evaluate the quality of the clustering, the Gini score, error percentage, and normalized mutual information were used. The results indicate the high efficiency of K-Means clustering with four clusters and a purity of 0.761 for customer segmentation.
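The WRFM step described above weights each customer's recency, frequency, and monetary scores by expert-assigned weights and then clusters the resulting score. A minimal sketch with made-up weights, made-up customers, and a toy one-dimensional K-Means (not Clementine's implementation):

```python
# Hypothetical expert weights for recency, frequency, monetary (sum to 1):
W_R, W_F, W_M = 0.2, 0.3, 0.5

def wrfm_score(recency, frequency, monetary):
    """Weighted RFM score; recency is inverted so recent buyers score higher.
    Inputs are assumed pre-binned to a 1..5 scale, as in common RFM practice."""
    return W_R * (6 - recency) + W_F * frequency + W_M * monetary

# customers: (recency, frequency, monetary) on a 1..5 scale (made-up data)
customers = [(1, 5, 5), (2, 4, 5), (5, 1, 1), (4, 2, 1), (3, 3, 3)]
scores = [wrfm_score(*c) for c in customers]

def kmeans_1d(points, k, iters=20):
    """Tiny 1-D k-means: returns final centroids and a cluster label per point."""
    centroids = sorted(points)[:: max(1, len(points) // k)][:k]
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(p - centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels

centroids, labels = kmeans_1d(scores, 2)
# High-WRFM customers land in one cluster, low-WRFM customers in the other.
```

The study uses four clusters and evaluates them with purity and normalized mutual information; two clusters are used here only to keep the toy example readable.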
      • Open Access Article

        98 - A Machine Learning Approach to Detect Energy Fraud in Smart Distribution Network
        Mahdi Emadaleslami Mahmoud-Reza Haghifam
      • Open Access Article

        99 - Comparing the speed and time of association extraction from database with cuckoo search and genetic algorithms
        Payam Abdolmohammadi Roham Farahani
      • Open Access Article

        100 - Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application
        Hamid Malmir Fardad Farokhi Reza Sabbaghi-Nadooshan
      • Open Access Article

        101 - Predicting the Next State of Traffic by Data Mining Classification Techniques
        S.Mehdi Hashemi Mehrdad Almasi Roozbeh Ebrazi Mohsen Jahanshahi
      • Open Access Article

        102 - Redesigning the Emergency Services Process Model Using Data Mining and Process Research Techniques (Case Study: Imam Khomeini Relief Committee of West Azerbaijan Province)
        Mehrdad Mohammadzadeh Alamdary Mansour Esmaeilpour Alireza Slambolchi Farhad Soleimanian Gharehchigh
        This study examines the success of the implemented model of the assistance and empowerment services process in the country's largest support institution and uses modern techniques to redesign the original model. Data on the achievement of practical goals were collected by distributing questionnaires among 100 staff of the Relief Committee in 21 cities of West Azerbaijan province, and the various stages were carried out using the CRISP-DM methodology. To answer the main research question, data on 4,687 clients who had gone through the existing process were examined using process-analysis techniques. The results showed that the implemented model met the simplicity criterion among the researcher's four qualitative criteria, but redesign is required to fully achieve compliance, generality, and accuracy. By eliminating the repetition loop in route 3, the initial model was redesigned to fully achieve the practical goals, taking into account the demands of socially vulnerable groups while honoring the target community and increasing its satisfaction, so that it can be adopted as a successful model by other non-governmental organizations inside and outside the country.
      • Open Access Article

        103 - Presenting a Model for Evaluating the Performance of Banks Listed in the Tehran Stock Exchange Using a Data Mining Approach
        Elham Adakh Arefeh Fadaviasghari Mohammad Ebrahim Mohamad Pourzarandi
        With the growth of private banks and financial and credit institutions, competition to provide better services has increased, so it is necessary to develop a comprehensive model for evaluating banks. Every organization needs to evaluate its performance to understand its strengths and weaknesses, especially in dynamic environments; the issue is so central that management experts say, "What cannot be evaluated cannot be managed." Banks, like other organizations in Iran, need performance evaluation in order to develop and to provide more diverse and faster services [6]. This study presents a model for evaluating the performance of banks listed in the Tehran Stock Exchange using a data mining approach. Four data mining models, the C5.0 decision tree, the C4.5 decision tree, the Naive Bayes classifier, and random forest, were implemented and compared. To this end, 28 financial ratios (e.g., profitability, liquidity, management quality, asset quality, and capital adequacy ratios) for 18 banks of the Tehran Stock Exchange during 2014-2017 were selected as independent variables, and bank performance in three categories, acceptable, unacceptable, and moderate, was the dependent variable. According to the results, the C5.0 decision tree, with an accuracy of 94.4%, was the most efficient model proposed in this research.
      • Open Access Article

        104 - Evaluation of Bank Branch Performance using Data mining and Expert System Approach
        Hamid Eslami Nosratabadi Mohammad Jafar Tarokh Alireza Poorebrahimi
        Bank branches are one of the most important pillars of digital banking, and evaluating their performance plays an important role in profitability and in achieving the bank's goals. This study evaluates the performance of bank branches using innovative methods. First, important indicators for evaluating branch performance were identified. Then the proposed method was applied to the data of a bank's branches in a case study. Clustering was first performed to separate efficient, semi-efficient, and inefficient branches. Based on the labels thus created, classification algorithms and decision trees were used to extract rules from the data of the efficient, semi-efficient, and inefficient branches; the C5.0 algorithm was used in the proposed model because it obtained the highest accuracy among the algorithms compared. Finally, based on the extracted rules, an expert system for evaluating branch performance was designed in the CLIPS software. In the bank under study, the average percentage increase in cheap deposits during the period, relative to the target balance, had the greatest impact on performance.
      • Open Access Article

        105 - Bankruptcy prediction using hybrid data mining models based on misclassification penalty
        Atiye Torkaman AmirAbbas Najafi
        In recent years, data mining, particularly the support vector machine, has gained considerable interest among investors, managers, and researchers as an effective means of bankruptcy prediction. However, studies indicate that it is highly sensitive to the selection of parameters and input variables. Hence, the aim of this research is to improve bankruptcy prediction accuracy by combining an advanced support vector machine model with the k-nearest neighbors approach to eliminate erroneous entries. To achieve this, the training data are first refined with the k-nearest neighbors algorithm, using five financial ratios (current ratio, net profit margin, debt ratio, return on assets, and return on investment) from 150 companies listed on the Tehran Stock Exchange during the 10-year period 2010-2019. Then a prediction model is constructed with a support vector machine based on a misclassification penalty; its parameters are estimated and its validity assessed on test data. Finally, the outcomes of the proposed model are compared with traditional models. The findings demonstrate that combining the k-nearest neighbors model with the support vector machine reduces the overall prediction error, and that the penalty coefficients of the support vector machine are highly statistically significant.
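The refinement step described above, dropping training points whose nearest neighbors disagree with their label before fitting the SVM, can be sketched as follows. The data are hypothetical one-dimensional (debt ratio, bankrupt?) pairs, not the Tehran Stock Exchange sample, and the SVM stage is omitted:

```python
def knn_label(point, data, k):
    """Majority label among the k nearest training points (distance on one feature)."""
    nearest = sorted(data, key=lambda d: abs(d[0] - point))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def knn_filter(data, k=3):
    """Keep only points whose own label matches the majority label of their k
    nearest OTHER points -- the 'eliminate erroneous entries' step before the SVM."""
    cleaned = []
    for i, (x, label) in enumerate(data):
        others = data[:i] + data[i + 1:]
        if knn_label(x, others, k) == label:
            cleaned.append((x, label))
    return cleaned

# Hypothetical (debt_ratio, bankrupt?) pairs; (0.9, 0) is a likely mislabel
# because all of its neighbors are bankrupt firms.
train = [(0.2, 0), (0.25, 0), (0.3, 0), (0.8, 1), (0.85, 1), (0.9, 0), (0.95, 1)]
cleaned = knn_filter(train)
print(len(cleaned))  # 6 -- the inconsistent (0.9, 0) point is removed
```

In the paper the same idea runs over five financial ratios at once; the filtered set is then fed to the penalty-based SVM.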
      • Open Access Article

        106 - Mining Quantitative Association Rules from Stock Trading Data Using Multi-Objective Metaheuristic Algorithms Based on the Genetic Algorithm
        Mostafa Zandiyeh Sima Mardanlu
        Forecasting stock returns is an important financial subject that has attracted researchers' attention for many years. Investors have long sought ways to predict stock prices and to find the right stocks and the right timing to buy or sell. Recently, data mining and artificial intelligence techniques have been applied to this area. Association discovery is one of the most common data mining techniques for extracting interesting knowledge from large datasets. In this paper, we propose a new multi-objective evolutionary model that maximizes the comprehensibility, interestingness, and performance objectives in order to mine a set of quantitative association rules from financial datasets, including 10 common technical-analysis indicators. To accomplish this, the model extends two well-known multi-objective evolutionary algorithms, the Non-dominated Sorting Genetic Algorithm II and the Non-dominated Ranked Genetic Algorithm, to perform evolutionary learning of attribute intervals and condition selection for each rule. Moreover, the proposal introduces an external population and a restarting process into the evolutionary model to store all the non-dominated rules found and to improve the diversity of the rule set obtained. Results on real-world stock datasets demonstrate the effectiveness of the proposed approach.
      • Open Access Article

        107 - Online Portfolio Selection Using Spectral Pattern Matching
        Matin Abdi Amirabbas Najafi
        Nowadays, due to rising turnover and the pace of trading in financial markets, faster analysis and decision-making are unavoidable, and humans cannot analyze big data quickly and without behavioral biases. Hence, financial markets tend toward algorithmic trading, in which techniques such as data mining and machine learning are notable. Online Portfolio Selection (OLPS) is one of the most modern techniques in algorithmic trading: it allocates capital across a number of stocks and updates the portfolio at the beginning of each period, so the individual plays no role in portfolio selection and the algorithm determines how to invest in each period. In this article, an algorithm following the pattern-matching principle is introduced: the portfolio is selected based on similar historical patterns, and here those patterns are found by spectral clustering, a data mining technique. The article ends with a numerical example using data on the 20 most active stocks on the New York Stock Exchange (NYSE), whose results are compared with other algorithms on this topic.
      • Open Access Article

        108 - Presenting a New Approach Based on Association Rules to Investigate the Relationship of the Oil Market with Global Markets
        Reza Khosravi Ehsan Mohammadian Amiri pouria Rezai Seyed Babak Ebrahimi
        In the current era, studying the relationships between different markets and their effects on each other has become a necessity for micro and macro investors. By investigating these relationships, an investor can obtain the required information about how the markets affect one another and can also identify various risks. Accordingly, this paper investigates the relationship between the oil market and the gold and dollar markets and energy-sector companies and funds, using an association-rules approach and the Apriori algorithm. Association rules explicitly describe the relationships between database fields and the interrelationships among a large set of data items. The results indicate a direct relationship between the oil market and companies active in the energy sector, an inverse relationship with the dollar index, and no significant relationship between the oil and gold markets.
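The Apriori-style analysis described above rests on two measures, support and confidence, computed over "transactions" of co-occurring market movements. A minimal sketch on made-up daily direction data (a real Apriori implementation adds candidate generation and pruning on top of this):

```python
# Each transaction records which markets moved up on a given day (toy data).
transactions = [
    {"oil_up", "energy_up"},
    {"oil_up", "energy_up", "gold_up"},
    {"oil_up", "energy_up", "dollar_up"},
    {"dollar_up", "gold_up"},
    {"oil_up", "energy_up"},
]

def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"oil_up", "energy_up"}))       # 0.8
print(confidence({"oil_up"}, {"energy_up"}))  # 1.0 -- a strong rule in the toy data
```

A rule such as {oil_up} → {energy_up} with high support and confidence corresponds to the paper's finding of a direct oil/energy-sector relationship.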
      • Open Access Article

        109 - Clustering with K-Means Hybridization Ant Colony Optimization (K-ACO)
        Dewi Ratnaningsih
      • Open Access Article

        110 - Presenting a Predictive Model for Customer Satisfaction with Software Support Services Using a Data Mining Approach
        Babak Sohrabi Iman Raeesi Samaneh Keshavarzi
        Nowadays, both productive and service organizations consider customer satisfaction a significant criterion for assessing the quality of their work. Since almost all organizations must compete in different areas, including services, providing high-quality service is essential to achieving a lasting competitive advantage, and many studies show that service quality is a prerequisite for customer satisfaction. Yet many customer-oriented companies struggle to recognize and evaluate customer preferences and often misunderstand customer demands, because providing high-quality service requires understanding the relationship between customer demands and the quality of the services the company provides; organizations that offer software services are no exception. The purpose of this research is to present a model to predict customer satisfaction with the provided services, to determine the influence of each relevant variable on satisfaction, and to report the level of customer satisfaction with the services of the company studied. The study applied predictive algorithms such as regression and classification to the data in RapidMiner, and the method with the highest accuracy and minimum error was selected. In addition, a weighting method was used to determine the variables most influential on customer satisfaction. The results are made available to managers for decision-making and for improving customer satisfaction.
      • Open Access Article

        111 - The Application of a Combined Fuzzy Clustering Model and Neural Networks to Measure the Value of Bank Customers
        Raheleh Nasiri Sharifi Maryam Rastgarpour
      • Open Access Article

        112 - Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms
        Ardavan Rajaei
      • Open Access Article

        113 - Determining a model for evaluating the knowledge management system in order to improve industries with the focus on educational technology and applying data mining concepts
        Komeil Akbarnezhadbaei Mahmoud Mohamadi Abdollah Kouloubandi Mohammad Taghipour
        Customer knowledge management combines knowledge management and customer relationship management, and it is important for developing an organization's business strategy and thereby improving organizational activities and achieving a competitive advantage. With the development and expansion of knowledge and information technology, the project environment has become more complex. The purpose of this research is to determine a model for evaluating the knowledge management system in order to improve industries, with a focus on educational technology and the use of data mining concepts. The research method was quantitative, and the required information was collected from the databases of industrial units, including an automobile company. The statistical population includes all car buyers during the years 2019 and 2019, sampled by simple random sampling. Questionnaires were used for data collection, and data mining methods, k-means algorithms, neural networks, and support vector machines were used for analysis. The results show that applying knowledge management and educational technology concepts made it possible to predict customer types and future sales trends, and the results are expected to be useful in improving industries and in manufacturing and production management in the country.
      • Open Access Article

        114 - Using cluster analysis for data mining in educational research
        Zahra Naghsh Azam Moghaddam
        Data mining is the process of sorting and classifying large volumes of data. This paper introduces cluster analysis as a data mining technique and provides an example of its application to the clustering of procrastinators. It presents cluster analysis and its main approaches, and shows how to carry out each step of the method on a representative sample. The sample consisted of 200 students of the University of Tehran (100 boys and 100 girls) who scored high in procrastination on the Solomon and Rothblum (1984) scale and who also completed the self-regulation scale of Green and Miller (2004), the self-efficacy scale of Midelton and Midgley (1997), the failure beliefs scale of Harrington (2005), and the irrational beliefs scale of Koopmans et al. (1994). The results revealed distinct clusters of procrastinators; by identifying these groups and the characteristics of each, more effective ways of reducing procrastination can be proposed for each group.
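        The study's questionnaire data are not public, but the clustering step it describes can be illustrated with a minimal agglomerative (bottom-up) cluster analysis over invented, standardized student scores. All feature values, the three-score layout, and the choice of single linkage are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch: agglomerative clustering of students on
# standardized questionnaire scores. Scores are invented; the study's
# actual data and linkage method are not specified in the abstract.

def euclid(a, b):
    """Euclidean distance between two score tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerative(points, n_clusters):
    # Start with each student as a singleton cluster, then repeatedly
    # merge the two clusters whose closest members are nearest
    # (single linkage) until n_clusters remain.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: min(euclid(a, b)
                               for a in clusters[ij[0]] for b in clusters[ij[1]]),
        )
        clusters[i].extend(clusters.pop(j))
    return clusters

# (procrastination, self-regulation, self-efficacy) as z-scores:
# two students high in procrastination but low in self-regulation,
# two high in procrastination but high in self-regulation.
students = [(1.8, -1.2, -0.9), (1.6, -1.0, -1.1), (1.7, 0.9, 1.0), (1.9, 1.1, 0.8)]
groups = agglomerative(students, n_clusters=2)
print(sorted(len(g) for g in groups))  # → [2, 2]
```

        In this toy run the procedure separates the low-self-regulation procrastinators from the high-self-regulation ones, which is the kind of group structure the paper argues can guide group-specific interventions.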
      • Open Access Article

        115 - Modeling the Application of Knowledge Management System in Order to Improve the Technology Governance in the Automotive Industry of Iran Using the Data Mining Environment
        Komeil Akbarnezhadbaei Mahmood Mohammadi Abdollah Kouloubandi Mohammad Taghipour
        Introduction: Customer knowledge management combines knowledge management with customer relationship management; it is important for developing an organization's business strategy and thus for improving organizational activities and achieving a competitive advantage. With the development and expansion of knowledge and information technology, the project environment has become more complex. Methodology: Since knowledge management is one of the most important approaches considered in this regard, the present study investigates the impact of knowledge management and each of its constituent processes (creation, application, sharing, and storage of knowledge) on IT governance in the automotive industry. Results and discussion: Because the impact of knowledge processes is examined in a project environment, the research method was quantitative, and the required information was collected from the databases of car companies. The level of car purchase was treated as the dependent variable, with age, gender, income level, occupation, level of education, car ownership history, and marital status as independent variables. The data were analyzed with data mining methods: the k-means algorithm, a neural network, and a support vector machine. Conclusion: The results show that the use of knowledge management made it possible to predict customer types and future sales trends, and the findings are expected to be useful for managing manufacturing and production in the country's automotive industry.
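        The company's buyer database described above is not available, but the k-means step can be sketched on invented buyer records. The feature choice (age, income, years since last purchase), the cluster count, and the deterministic initialization are all illustrative assumptions, not details from the study.

```python
# Hypothetical sketch: k-means clustering of car buyers on numeric
# attributes, as one of the data mining steps named in the abstract.
# Buyer records and k=2 are invented for illustration.

def kmeans(points, k, iters=20):
    # Deterministic init for reproducibility: first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster's mean
        # (empty clusters keep their previous centroid).
        centroids = [tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Toy buyers: (age, income in arbitrary units, years since last purchase).
buyers = [(25, 30, 1), (27, 32, 2), (55, 90, 8),
          (60, 95, 10), (24, 28, 1), (58, 88, 9)]
centroids, clusters = kmeans(buyers, k=2)
print(sorted(len(c) for c in clusters))  # → [3, 3]
```

        Here the toy data splits into a younger, lower-income segment and an older, higher-income segment; in the study, segments of this kind are what feed the prediction of customer types and future sales trends.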