
      • Open Access Article

        1 - The opinion mining of Digikala reviews by semi-supervised support vector machine
        Zohre Karimi, Hadis Haghiri
        Introduction: The widespread use of the internet and social media platforms has produced an explosion of digital data, including users' opinions about services and products. These opinions are a valuable source of information for businesses and organizations that want to understand the needs and preferences of their customers. Supervised machine learning models have proven effective in analyzing users' opinions, but they need a sufficient amount of labeled training data to perform well. Labeling data takes considerable time and resources, which is a significant challenge for many organizations. Semi-supervised learning addresses this problem by using both labeled and unlabeled data to improve model performance.

        Method: This paper proposes a semi-supervised approach to analyzing users' opinions written in Persian. In the training phase, the method uses the abundant unlabeled data available in addition to a small amount of labeled data. It is built on the support vector machine (SVM) algorithm, which related research has shown to be effective in opinion mining. The method first extracts emotional words from the comments using sentiment lexicons and then computes term frequency-inverse document frequency (TF-IDF) vectors. The semi-supervised SVM algorithm is applied to these vectors to estimate the polarity of the sentiments.

        Results: The proposed method was evaluated on the Digikala comments dataset and compared with the supervised SVM algorithm and the semi-supervised self-training method, for different amounts of labeled data, in terms of accuracy, precision, recall, and F1. The results indicate that the proposed method outperforms both the supervised SVM algorithm and the self-training method. The experiments also investigate the impact of the amount of unlabeled data.

        Discussion: One advantage of the proposed method is that it can estimate the polarity of opinions that were not seen in the training phase, which is not possible in some graph-based methods. Furthermore, it is not affected by the labeling errors that arise during training in self-training methods. In conclusion, the proposed semi-supervised method provides an efficient solution for analyzing users' opinions in Persian. Businesses and organizations can use it to gain insight into their customers' opinions and improve their products and services accordingly.
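
        The feature-extraction step described in the Method (keep only lexicon words, then compute TF-IDF vectors) might look roughly like the sketch below. It is not the authors' implementation. The abstract does not say which TF-IDF variant is used, so the sketch assumes term frequency normalized by comment length and idf = log(N / df). The lexicon, the English sample comments, the whitespace tokenization, and the function names are all hypothetical and only serve the illustration.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the paper uses Persian sentiment lexicons.
LEXICON = {"good", "great", "bad", "poor"}


def lexicon_tokens(comment, lexicon):
    """Keep only the whitespace-separated tokens found in the lexicon."""
    return [tok for tok in comment.lower().split() if tok in lexicon]


def tfidf_vectors(comments, lexicon):
    """Return (vocabulary, vectors), with one TF-IDF vector per comment.

    Assumed variant: tf = count / number of lexicon tokens in the comment,
    idf = log(N / df). Other variants (smoothing, normalization) exist.
    """
    docs = [lexicon_tokens(c, lexicon) for c in comments]
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of comments that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard for comments with no lexicon words
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


comments = [
    "good phone great camera",
    "bad battery poor screen",
    "good price bad packaging",
]
vocab, vectors = tfidf_vectors(comments, LEXICON)
for comment, vec in zip(comments, vectors):
    print(comment, [round(x, 3) for x in vec])
```

        Vectors of this kind, computed for both the labeled and the unlabeled comments, would then be passed to the semi-supervised SVM, which is not shown here.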