Academy & Industry Research Collaboration Center (AIRCC)

Volume 12, Number 15, September 2022

WassBERT: High-Performance BERT-based Persian Sentiment Analyzer and Comparison to Other State-of-the-art Approaches

  Authors

Masoumeh Mohammadi and Shadi Tavakoli, Department of Data Science & Machine Learning, Telewebion, Iran

  Abstract

Understanding other people's opinions is among the most valuable kinds of knowledge an application can draw on. Sentiment analysis (SA) is the task of identifying positive or negative feelings expressed in sentences, and businesses use it to understand customer sentiment in comments on websites and social media. This study proposes an optimized loss function and novel data augmentation methods for a model based on Bidirectional Encoder Representations from Transformers (BERT). First, a dataset is crawled from Persian movie comments on various sites. Then, balancing and augmentation techniques are applied to the dataset. Next, several deep models and the proposed BERT-based model are applied to the dataset. We focus on customizing the loss function, and the resulting model achieves an overall accuracy of 94.06 on multi-class (positive, negative, neutral) sentence classification. Comparative experiments on the same dataset show that the proposed model significantly outperforms the other models.
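The abstract does not specify the form of the customized loss, so the following is only an illustrative sketch of one common way to combine class balancing with a loss function: cross-entropy weighted by inverse class frequency. The function names, the `LABELS` ordering, and the weighting formula are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

# Assumed label order for the three sentiment classes (hypothetical).
LABELS = ["negative", "neutral", "positive"]


def inverse_frequency_weights(labels):
    """Weight each class by N / (K * count), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(LABELS)
    return {c: n / (k * counts[c]) for c in LABELS if counts[c] > 0}


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def weighted_cross_entropy(logits_batch, targets, weights):
    """Weighted mean of -log p(target), normalized by the sum of weights."""
    total, norm = 0.0, 0.0
    for logits, target in zip(logits_batch, targets):
        p = softmax(logits)[LABELS.index(target)]
        w = weights[target]
        total += -w * math.log(p)
        norm += w
    return total / norm
```

In a BERT fine-tuning setup, the per-example logits would come from the classification head. The weighting simply makes mistakes on under-represented classes cost more than mistakes on frequent ones.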

  Keywords

Bidirectional encoder representations from Transformers (BERT), Bidirectional long short-term memory (Bi-LSTM), Comment classification, Convolutional neural network (CNN), Deep learning, Opinion mining (OM), Natural language processing (NLP), Persian language sentiment classification, Persian sentiment analysis, Text mining.