Academy & Industry Research Collaboration Center (AIRCC)

Volume 13, Number 02, January 2023

A Novel System for Regional Twitter Hate Speech Analysis and Detection using Deep Learning Models and Web Scraping

  Authors

Nicole Ma (Sage Hill School, USA) and Yu Sun (California State Polytechnic University, USA)

  Abstract

Instances of hate speech on popular social media platforms such as Twitter are becoming increasingly common and intense. However, comprehensive deep learning models for combating Twitter hate speech are still lacking. In this project, a comprehensive detection and reporting platform, entitled “TweetWatch,” was created to address this issue. A binary classification CNN (Convolutional Neural Network) and a multi-class CNN were created to detect hate speech in real-time Twitter data and to classify tweets containing hate speech into five categories. The binary classification model has an AUC score of 98.95% and an F1 score of 97.88%. The multi-class classification model has an AUC score of 89.46%. All metrics exceeded the targeted 5% improvement over previous models reported in multiple papers, validating the proposed solution. Additionally, the only real-time choropleth map of hate speech in the United States was successfully created.
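For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an F1 score and an AUC score can be computed for a binary classifier. It is not the authors' evaluation code: the function names `f1_score` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = hate speech, 0 = not hate speech.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]           # assumed model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # assumed 0.5 threshold

print(f"F1  = {f1_score(y_true, y_pred):.4f}")
print(f"AUC = {roc_auc(y_true, scores):.4f}")
```

F1 depends on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the two are often reported together.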

  Keywords

Web scraping, Natural language processing, Deep learning, Neural networks.