Sentimental Categorization of Persian News Headlines using Three Machine Learning Techniques Versus Human Categorization
Subject Areas : H.3.8. Natural Language Processing
1 - ELT Department, Alzahra University, Tehran, Iran
Keywords: Machine Learning, Persian, Naïve Bayes, Maximum Entropy, Headlines, Sentiment
Abstract :
This paper reports an attempt to classify Persian news headlines by sentiment using machine learning techniques rather than human analysis. Three major techniques, namely Naïve Bayes, Maximum Entropy, and Support Vector Machine, were introduced and applied to Persian news headlines. The results were compared with one another and with the human analysis. It is concluded that these techniques outperform human analysis and that one technique, Naïve Bayes, is superior to the other two. The study also suggests that discourse analysis needs to be included in order to attain better results, since the whole is not necessarily the sum of its parts: what appears in a headline does not necessarily reflect what is reported in the news item itself. It is therefore recommended that future studies introduce elements of discourse analysis into these algorithms so that better results can be achieved.
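As a rough illustration of the kind of classifier the abstract refers to, the following is a minimal sketch of a multinomial Naïve Bayes headline classifier with add-one smoothing. It is not the implementation used in the study: the function names, the whitespace tokenization, the two-label scheme, and the English toy headlines are all assumptions made for illustration, and real Persian text would need proper normalization and tokenization.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(headlines, labels):
    """Count documents per class and words per class (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(headlines, labels):
        tokens = text.lower().split()  # assumption: whitespace tokenization
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, headline):
    """Return the label with the highest log-posterior under Naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in headline.lower().split():
            if token not in vocab:
                continue  # words never seen in training are ignored
            # add-one (Laplace) smoothed log likelihood
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for illustration only (not from the study).
TOY_HEADLINES = [
    "economy grows as exports rise",
    "team wins championship final",
    "flood destroys homes in north",
    "inflation crisis deepens",
]
TOY_LABELS = ["positive", "positive", "negative", "negative"]

model = train_naive_bayes(TOY_HEADLINES, TOY_LABELS)
print(predict(model, "exports rise"))
```

Because the model scores each word independently, it cannot see how a headline relates to the body of the news item, which is the limitation the abstract raises when it recommends adding elements of discourse analysis.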