Volume 13, Number 3

A Data Mining Approach for Filtering out Social Spammers in Large-Scale Twitter Data Collections

  Authors

Waheeda Almayyan and Asaad Alzayed, College of Business Studies, PAAET, Kuwait

  Abstract

Social networking services such as Facebook.com and Twitter.com are fast-growing enterprise platforms that have become a prevalent and essential component of daily life. Due to its popularity, Twitter draws many spammers and other fake accounts that post malicious links and infiltrate legitimate users' accounts with spam messages. It is therefore crucial to recognize and screen spam tweets and spam accounts, yet spam detection remains a difficult challenge. This article applies several bio-inspired optimization algorithms to reduce the feature dimensionality in the first stage, then uses several classification schemes in the second stage to enhance the spam detection rate on three real Twitter data collections. Among the chosen classifiers, Random Forest and C4.5 achieved the highest Accuracy, Precision, Recall, and F1-score, even under class imbalance.

  Keywords

Twitter, spam, machine learning, classification, PSO, Cuckoo, Bat.