A Data-Driven Strategy for Online Hate Speech Spreader Identification using Modified PageRank

Authors

Smita Ghosh and Shiv Jhalani, Santa Clara University, USA

Abstract

Social media platforms have become breeding grounds for misinformation and harmful content, including hate speech. This paper addresses the circulation of hate speech on Online Social Networks by focusing on the early identification of users who are prone to spreading it. To this end, it introduces a novel data-driven metric, 'Hate Speech Potential', and proposes a modified version of the PageRank algorithm, termed the 'Hate Speech Potential Rank' algorithm, to detect malicious users within a network. In a vast network with billions of nodes, content spreads rapidly, which makes timely detection and mitigation crucial. A user's 'Hate Speech Potential' is determined from their past behaviour of sharing or publishing hate speech, enabling the identification of sources and spreaders of such content. The modified PageRank algorithm considers both the user's individual characteristics and the influence of their neighbourhood, thereby capturing a more comprehensive picture of their sharing patterns. A pre-trained machine learning model was employed to classify hate speech posts. By combining the predicted labels with user characteristics in the modified PageRank algorithm, the paper aims to gain deeper insight into the dynamics of information dissemination within a social network, contributing to a better understanding of user sharing behaviour and to the development of effective strategies for addressing hate speech. Experimental evaluations using K-Means clustering demonstrate the effectiveness of the proposed approach.
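
The abstract does not give the exact formulation of 'Hate Speech Potential' or of the modified PageRank, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' method. It assumes that a user's potential is the fraction of their past posts that a classifier labelled as hate speech, and it swaps in a standard personalized PageRank whose teleport distribution is proportional to that potential, so that a user's score reflects both their own history and that of their neighbourhood. The names `hate_speech_potential`, `potential_rank`, and `out_links`, the edge direction, and the damping value are all hypothetical.

```python
from typing import Dict, List


def hate_speech_potential(labels: List[int]) -> float:
    """Assumed definition: share of a user's past posts labelled 1 (hate speech)."""
    return sum(labels) / len(labels) if labels else 0.0


def potential_rank(
    out_links: Dict[str, List[str]],
    potential: Dict[str, float],
    damping: float = 0.85,
    max_iter: int = 100,
    tol: float = 1e-12,
) -> Dict[str, float]:
    """Personalized PageRank with teleport mass proportional to each user's potential.

    Assumption: an edge u -> v means score flows from u to v
    (for example, u reshares content authored by v).
    """
    nodes = sorted(
        set(out_links)
        | {v for targets in out_links.values() for v in targets}
        | set(potential)
    )
    if not nodes:
        return {}

    total = sum(potential.get(n, 0.0) for n in nodes)
    if total > 0:
        teleport = {n: potential.get(n, 0.0) / total for n in nodes}
    else:
        # No user has any recorded hate speech: fall back to uniform teleport.
        teleport = {n: 1.0 / len(nodes) for n in nodes}

    rank = dict(teleport)
    for _ in range(max_iter):
        # Mass held by users with no outgoing edges is redistributed via teleport.
        dangling = sum(rank[n] for n in nodes if not out_links.get(n))
        new_rank = {
            n: (1.0 - damping) * teleport[n] + damping * dangling * teleport[n]
            for n in nodes
        }
        for u in nodes:
            targets = out_links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank
```

In this sketch the scores form a probability distribution over users, so they can be sorted to shortlist likely spreaders or passed as a feature to a clustering step such as K-Means. A user with no hate speech history and no incoming edges ends up with a score of zero.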

Keywords

Hate Speech Spreader Detection, PageRank, Social Networks, Machine Learning