    • List of Articles maryam rahmaninia

      • Open Access Article

        1 - An Online group feature selection algorithm using mutual information
        Maryam Rahmaninia, Sondos Bahadori
        Introduction: In the era of big data, the dimensionality of data in many fields is increasing dramatically. To cope with high-dimensional training data, online feature selection has become an important topic in data mining and has recently attracted considerable attention from researchers. These algorithms select important, informative features and remove redundant ones without any prior knowledge of the full feature set. Despite the progress in this field, several challenges remain, including scalability, the minimum size of the selected feature subset, sufficient accuracy, and execution time. Moreover, in many real-world applications, features arrive in the dataset sequentially and in groups. Although many online feature selection algorithms have been proposed, none has achieved a good trade-off among these criteria.
        Method: In this paper, we propose an online group feature selection method for feature streams that uses two new measures of redundancy and relevancy based on mutual information theory. Mutual information can capture both linear and non-linear dependencies between variables. With the proposed method, we aim for a better trade-off among all of the challenges listed above.
        Results: To show the effectiveness of the proposed online group feature selection method, experiments were conducted on six large multi-label training datasets from different application domains (ALLAML, colon, SMK-CAN-187, credit-g, sonar, and breast-cancer), and the method was compared against three recently proposed online group feature selection algorithms: FNE_OGSFS, Group-SAOLA, and OGSFS. Three evaluation criteria were used for the comparison: average classification accuracy with KNN (k-nearest neighbors), SVM (support vector machine), and NB (naive Bayes) classifiers; the number of selected features; and execution time. According to the results, the proposed algorithm outperforms the other algorithms in almost all cases, which demonstrates its efficiency.
        Discussion: We show that the proposed online group feature selection method achieves better performance by considering the label group dependency between newly arriving features.
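The abstract does not define its two mutual-information measures, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes discrete feature values, a plug-in estimate of mutual information, and two hypothetical thresholds (`min_relevance`, `max_redundancy`): features arrive group by group, a feature is kept only if it shares enough information with the labels, and it is dropped if it shares too much information with a feature that was already kept.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def select_from_group_stream(groups, labels, min_relevance=0.1, max_redundancy=0.9):
    """Process feature groups in arrival order and return the kept feature names.

    groups: iterable of dicts mapping feature name -> list of discrete values.
    The two thresholds are illustrative assumptions, not values from the paper.
    """
    selected = {}
    for group in groups:  # groups arrive sequentially
        for name, values in group.items():
            # Relevance: information the feature shares with the class labels.
            if mutual_information(values, labels) < min_relevance:
                continue
            # Redundancy: information shared with any already selected feature.
            redundant = any(
                mutual_information(values, kept) > max_redundancy
                for kept in selected.values()
            )
            if not redundant:
                selected[name] = values
    return list(selected)


# Toy stream: one informative feature, one irrelevant feature, one redundant feature.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
stream = [
    {"g1_a": [0, 0, 1, 1, 0, 0, 1, 1], "g1_b": [0, 1, 0, 1, 0, 1, 0, 1]},
    {"g2_a": [1, 1, 0, 0, 1, 1, 0, 0]},
]
print(select_from_group_stream(stream, labels))  # ['g1_a']
```

In this toy stream, `g1_b` is statistically independent of the labels and is rejected as irrelevant, while `g2_a` is an inverted copy of `g1_a` and is rejected as redundant. Because mutual information depends only on the joint distribution, it detects the inverted copy just as it would an exact one.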