Volume 16, Number 3

A Domain-Invariant Transfer Learning by BERT for Cross-Domain Sentiment Analysis

  Authors

Al-Mahmud and Kazutaka Shimada, Kyushu Institute of Technology, Japan

  Abstract

Sentiment analysis aims to analyze the attitudes or behaviors of individuals and entities from user-generated content. Cross-domain sentiment analysis examines sentiments expressed in textual data across diverse domains or topics. Unlike traditional sentiment analysis, which concentrates on a particular domain or topic, cross-domain sentiment analysis transfers knowledge from models trained on one domain to another domain. It is effective where labeled data are scarce or unavailable. In this study, our target language is Bangla. Numerous traditional machine-learning-based sentiment analysis approaches have been proposed for Bangla, but they often require a large amount of data to build robust models. Manually collecting and annotating a large amount of training data within the same domain (i.e., domain-specific data) can be costly, especially for low-resource languages like Bangla. To address this challenge, we collect publicly available data in a source domain (e.g., drama) and exploit auxiliary information from it to assist the task on the target-domain data (e.g., cricket). The model is then re-trained and evaluated on the target-domain data. We establish various baselines using machine-learning-based and transformer-based models. These baselines are unable to reduce the domain gap between the source and target domains. To this end, we propose a domain-invariant transfer learning approach to bridge the domain gap. We conduct experiments and compare our proposed approach with the baselines. The experimental results demonstrate that the proposed approach outperforms all the baselines, confirming its efficacy.

  Keywords

Sentiment analysis, Cross-domain sentiment analysis, Source data, Target data, Machine-learning-based models, Transformer-based models, Combined data approach, Stepwise learning approach, Domain-invariant transfer learning approach.