A Review of Feature Selection
Subject Areas: Machine Learning
Jafar Abdollahi 1, Babak Nouri-Moghaddam 2, Naser Mikaeilvand 3, Sajjad Jahanbakhsh Gudakahriz 4, Ailin Khosravani 5, Abbas Mirzaei 6 *
1 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
2 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
3 - Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran
4 - Department of Computer Engineering, Germi Branch, Islamic Azad University, Germi, Iran
5 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
6 - Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
Keywords: Data Mining, Medical Applications, Dimension Reduction, Feature Selection
Abstract:
Feature selection is a preprocessing technique that identifies the most relevant features for a given problem. It has been applied to a wide range of problems, including intrusion detection, finance, and the analysis of biological data. It has been especially useful in medical applications, where, in addition to reducing dimensionality, it may help identify the underlying causes of an illness. We introduce basic concepts of medical applications and the necessary background on feature selection. We then review the most recent feature selection methods developed for and applied to medical problems, covering a broad spectrum of applications including medical imaging, DNA microarray data analysis, and biomedical signal processing. A case study of two medical applications using real patient data demonstrates the usefulness of applying feature selection techniques to medical problems and highlights how these methods behave in practical scenarios.
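To make the idea of feature selection as a preprocessing step concrete, the following is a minimal sketch of a simple filter-style selector. It is an illustration only and is not taken from the reviewed methods. The data is synthetic and generated by a hypothetical helper (`make_toy_data`). The scoring rule, absolute Pearson correlation with the label, is one assumed choice among many, and the function names are likewise illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_k(rows, labels, k):
    """Score each feature column by |correlation with the label|; keep the k best indices."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append(abs(pearson(column, labels)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


def make_toy_data(n_samples=200, seed=0):
    """Hypothetical synthetic data: features 0 and 2 depend on the label, 1 and 3 are noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        rows.append([
            label * 2.0 + rng.gauss(0, 0.5),   # informative feature
            rng.gauss(0, 1.0),                 # pure noise
            -label * 1.5 + rng.gauss(0, 0.5),  # informative feature
            rng.gauss(0, 1.0),                 # pure noise
        ])
        labels.append(label)
    return rows, labels


rows, labels = make_toy_data()
selected, scores = select_top_k(rows, labels, k=2)
# Dimensionality reduction: keep only the selected columns.
reduced = [[row[j] for j in selected] for row in rows]
```

On this toy data the two label-dependent columns receive much higher scores than the noise columns, so the reduced dataset keeps two of the four features. The retained indices also show which measurements relate to the outcome, which is the interpretability benefit the abstract points to for medical data.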