A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
Subject Areas: Data Mining
Saman Khalandi 1, Farhad Soleimanian Gharehchopogh 2
1 - Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran.
2 - Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran.
[1] W. Hadi, Q.A. Al-Radaideh, S. Alhawari, Integrating associative rule-based classification with Naïve Bayes for text classification, Applied Soft Computing, Vol. 69, pp. 344-356, 2018.
[2] D. Mahata, R.R. Shah, J. Kuriakose, R. Zimmermann, J.R. Talburt, Theme-Weighted Ranking of Keywords from Text Documents Using Phrase Embeddings, IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), IEEE, pp. 184-189, 2018.
[3] A. Kulkarni, V. Tokekar, P. Kulkarni, Discovering Context of Labeled Text Documents Using Context Similarity Coefficient, Procedia Computer Science, Vol. 49, pp. 118-127, 2015.
[4] K. Chen, Z. Zhang, J. Long, H. Zhang, Turning from TF-IDF to TF-IGM for term weighting in text classification, Expert Systems with Applications, Vol. 66, pp. 245-260, 2016.
[5] S. Ramanna, J.F. Peters, C. Sengoz, Application of Tolerance Rough Sets in Structured and Unstructured Text Categorization: A Survey, Thriving Rough Sets, Springer, Vol. 708, pp. 119-138, 2017.
[6] A.R. Mehrabian, C. Lucas, A novel numerical optimization algorithm inspired from weed colonization, Ecological Informatics, Vol. 1, Issue 4, pp. 355-366, 2006.
[7] A. McCallum, K. Nigam, A Comparison of Event Models for Naive Bayes Text Classification, In AAAI-98 workshop on learning for text categorization, Vol. 752, pp. 41-48, 1998.
[8] X. Deng, Y. Li, J. Weng, J. Zhang, Feature selection for text classification: A review, Multimedia Tools and Applications, pp. 1-20, 2018.
[9] M. Rogati, Y. Yang, High-performing variable selection for text classification, in: CIKM ’02 Proceedings of the 11th International Conference on Information and Knowledge Management, pp. 659-661, 2002.
[10] Y. Yang, J.O. Pedersen, A comparative study on feature selection in text categorization, in: The Fourteenth International Conference on Machine Learning (ICML97), pp. 412-420, 1997.
[11] J. Holland, Adaptation in Natural and Artificial Systems, University of Michigan, Michigan, USA, 1975.
[12] J. Kennedy, R. C. Eberhart, Particle Swarm Optimization, In Proceedings of the IEEE International Conference on Neural Networks, pp. 1942-1948, 1995.
[13] A. Trstenjak, S. Mikac, D. Donko, KNN with TF-IDF based Framework for Text Categorization, Procedia Engineering, Vol. 69, pp. 1356-1364, 2014.
[14] Y. Ko, J. Seo, Text classification from unlabeled documents with bootstrapping and feature projection techniques, Information Processing & Management, Vol. 45, Issue 1, pp. 70-83, 2009.
[15] D. Ghasempour, F.S. Gharehchopogh, A New Approach for Feature Selection in Text Documents Classification by Using Hybrid Model of Bat and K-Nearest Neighborhood Algorithms, Islamic Azad University, Urmia Branch, Thesis, Summer 2016.
[16] A. Allahvirdipour, F.S. Gharehchopogh, New Approach in Features Selection in Text Documents Classification using the Hybrid Model Algorithms of Naive Bayes and K-Means, Islamic Azad University, Urmia Branch, Thesis, Spring 2016.
[17] R. Habibpour, K. Khalilpour, A New Hybrid K-means and K-Nearest-Neighbor Algorithms for Text Document Clustering, International Journal of Academic Research, Vol. 6, Issue 3, pp. 79-84, 2014.
[18] M. Karabulut, Fuzzy unordered rule induction algorithm in text categorization on top of geometric particle swarm optimization term selection, Knowledge-Based Systems, Vol. 54, pp. 288-297, 2013.
[19] A.K. Uysal, S. Gunal, Text classification using genetic algorithm oriented latent semantic features, Expert Systems with Applications, Vol. 41, Issue 13, pp. 5938-5947, 2014.
[20] T. Wei, Y. Lu, H. Chang, Q. Zhou, X. Bao, A semantic approach for text clustering using WordNet and lexical chains, Expert Systems with Applications, Vol. 42, Issue 4, pp. 2264-2275, 2015.
[21] W. Zhang, X. Tang, T. Yoshida, TESC: An approach to TExt classification using Semi-Supervised Clustering, Knowledge-Based Systems, Vol. 75, pp. 152-160, 2015.
[22] K.K. Bharti, P.K. Singh, Opposition chaotic fitness mutation based adaptive inertia weight BPSO for feature selection in text clustering, Applied Soft Computing, Vol. 43, pp. 20-34, 2016.
[23] D. AbuZeina, F.S. Al-Anzi, Employing fisher discriminant analysis for Arabic text classification, Computers & Electrical Engineering, in press, corrected proof, Available online 10 November 2017.
[24] R. Wongso, F.A. Luwinda, B.C. Trisnajaya, O. Rusli, Rudy, News Article Text Classification in Indonesian Language, Procedia Computer Science, Vol. 116, pp. 137-143, 2017.
[25] H.P. Luhn, A Statistical Approach to the Mechanized Encoding and Searching of Literary Information, IBM Journal of Research and Development, Vol. 1, No. 4, pp. 309-317, 1957.
[26] G. Salton, Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Addison-Wesley, 1989.
[27] R.S. Michalski, I. Bratko, M. Kubat, Machine Learning and Data Mining: Methods and Applications, New York: Wiley, 1998.
[28] D. Francois, Binary classification performances measure cheat sheet, 2009.
[29] C. Blake, C.J. Merz, UCI Repository of Machine Learning Databases [http://www.ics.uci.edu/~mlearn/MLRepository.html], University of California, Department of Information and Computer Science, Irvine, CA, 1998, pp. 55.
[30] http://archive.ics.uci.edu/ml/datasets/Reuters-21578+Text+Categorization+Collection
[31] http://ana.cachopo.org/datasets-for-single-label-text-categorization
[32] A. Onan, S. Korukoglu, H. Bulut, Ensemble of keyword extraction methods and classifiers in text classification, Expert Systems with Applications, Vol. 57, pp. 232-247, 2016.
[33] A.K. Uysal, An improved global feature selection scheme for text classification, Expert Systems with Applications, Vol. 43, pp. 82-92, 2016.
[34] H. Uguz, A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm, Knowledge-Based Systems, Vol. 24, Issue 7, pp. 1024-1032, 2011.
[35] W. Zong, F. Wu, L.K. Chu, D. Sculli, A Discriminative and Semantic Feature Selection Method for Text Categorization, International Journal of Production Economics, Vol. 165, pp. 215-222, 2015.
[36] C. Veenhuis, Binary Invasive Weed Optimization, Second World Congress on Nature and Biologically Inspired Computing (NaBIC), pp. 449-454, 2010.
[37] L.M. Abualigah, A.T. Khader, Unsupervised text feature selection technique based on hybrid particle swarm optimization algorithm with genetic operators for the text clustering, The Journal of Supercomputing, Vol. 73, Issue 11, pp. 4773-4795, 2017.