An Online Group Feature Selection Algorithm Using Mutual Information
Subject Areas: Multimedia Processing, Communications Systems, Intelligent Systems
Maryam Rahmaninia 1 *, Sondos Bahadori 2
1 - Assistant Professor, Department of Computer Engineering, Qasreshirin Branch, Islamic Azad University, Qasreshirin, Iran
2 - Assistant Professor, Department of Computer Engineering, Ilam Branch, Islamic Azad University, Ilam, Iran
Keywords: Online group feature selection algorithm, feature stream datasets, mutual information
Abstract:
Introduction: In the era of big data, the dimensionality of data in many fields is increasing dramatically. Online feature selection algorithms have therefore become an important topic in data mining for coping with high-dimensional training data, and they have recently attracted considerable attention from researchers. These algorithms select important, informative features and remove redundant ones without any prior knowledge of the full feature set. Despite the progress in this field, several challenges remain, including scalability, keeping the selected feature subset small, achieving sufficient accuracy, and limiting execution time. Moreover, in many real-world applications, features arrive in the dataset sequentially and in groups. Although many online feature selection algorithms have been proposed, none of them has achieved a good trade-off among these criteria.
Method: In this paper, we propose an online group feature selection method for feature streams that uses two new measures of redundancy and relevancy based on mutual information. Mutual information can capture both linear and non-linear dependencies between variables. With the proposed method, we aim to achieve a better trade-off among all of these challenges.
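To illustrate the claim that mutual information captures non-linear as well as linear dependency, the following is a minimal sketch using a plug-in (frequency-count) estimate for discrete variables. The function names `mutual_information` and `pearson` and the toy data are assumptions made for this illustration; they are not the relevancy or redundancy measures proposed in the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    # sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def pearson(xs, ys):
    """Pearson correlation coefficient, which measures linear dependency only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Toy data (illustrative assumption): x is symmetric around zero.
x = [-2, -1, 0, 1, 2] * 20
y_linear = [2 * v + 1 for v in x]  # linear function of x
y_square = [v * v for v in x]      # non-linear function of x

print("linear:", pearson(x, y_linear), mutual_information(x, y_linear))
print("square:", pearson(x, y_square), mutual_information(x, y_square))
```

On this toy data the squared relationship has zero Pearson correlation, yet its mutual information with `x` is clearly positive (about 1.52 bits), which is why a mutual-information criterion can flag a feature that a purely linear measure would miss.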
Results: To demonstrate the effectiveness of the proposed online group feature selection method, experiments were conducted on six large multi-label training datasets from different applications (ALLAML, colon, SMK-CAN-187, credit-g, sonar, and breast-cancer), comparing it against three recently presented online group feature selection algorithms: FNE_OGSFS, Group-SAOLA, and OGSFS. Three evaluation criteria were used for the comparison: average accuracy using KNN (k-nearest neighbors), SVM (support vector machine), and NB (naïve Bayes) classifiers; the number of selected features; and execution time. In most cases the proposed algorithm obtained better results than the other algorithms, which shows its efficiency.
Discussion: In this paper, we show that the proposed online group feature selection method achieves better performance by considering label group dependency between newly arrived features.
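The abstract does not spell out the two proposed measures, so the following is only a generic sketch of how group-wise streaming selection with mutual-information relevance and redundancy checks can be organized. The function `select_from_stream`, the threshold `rel_min`, the pairwise redundancy rule, and the toy data are all assumptions for illustration and should not be read as the authors' algorithm.

```python
from collections import Counter
from math import log2


def mi(xs, ys):
    """Plug-in estimate of mutual information (bits) for discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def select_from_stream(groups, labels, rel_min=0.05):
    """Process feature groups one at a time and return the kept feature names.

    groups : iterable of dicts {feature_name: column}, one dict per arriving group
    labels : class label column
    rel_min: hypothetical relevance threshold in bits (an assumption)
    """
    selected = {}  # feature name -> column, for features kept so far
    for group in groups:
        # Rank the features of the newly arrived group by relevance to the labels.
        scored = sorted(
            ((mi(col, labels), name, col) for name, col in group.items()),
            key=lambda t: t[0],
            reverse=True,
        )
        for rel, name, col in scored:
            if rel < rel_min:
                continue  # irrelevant: shares too little information with the labels
            # Hypothetical redundancy rule: drop the feature if some kept feature
            # shares at least as much information with it as it shares with the labels.
            if any(mi(col, kept) >= rel for kept in selected.values()):
                continue
            selected[name] = col
    return list(selected)


# Toy stream (illustrative assumption): the label is determined by bits a and b.
a = [0, 0, 0, 0, 1, 1, 1, 1] * 5
b = [0, 0, 1, 1, 0, 0, 1, 1] * 5
c = [0, 1, 0, 1, 0, 1, 0, 1] * 5          # independent of the label
labels = [2 * u + v for u, v in zip(a, b)]

groups = [
    {"f1": a, "f2": c},                    # first group to arrive
    {"f3": [1 - u for u in a], "f4": b},   # f3 is an inverted copy of f1
]
print(select_from_stream(groups, labels))
```

In this toy stream, `f2` is discarded as irrelevant and `f3` as redundant with `f1`, so only `f1` and `f4` are kept. The selection never needs the full feature set in advance, which is the property that online methods rely on.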