Fake News Detection with Semantic Features and Text Mining

Pranav Bharadwaj; Zongru Shao

doi:10.5121/ijnlc.2019.8302

Volume 8, Number 3

Authors

Pranav Bharadwaj¹ and Zongru Shao², ¹South Brunswick High School, USA, ²Spectronn, USA

Abstract

Nearly 70% of people are concerned about the propagation of fake news. This paper aims to detect fake news in online articles using semantic features and several machine learning techniques. We compared recurrent neural networks with naive Bayes and random forest classifiers, using five groups of linguistic features. Evaluated on the "real or fake" news dataset from kaggle.com, the best-performing model achieved an accuracy of 95.66% using bigram features with the random forest classifier. The fact that bigrams outperform unigrams, trigrams, and quadgrams shows that word pairs, rather than single words or longer phrases, best indicate the authenticity of news.
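To make the n-gram terminology concrete, the following minimal sketch extracts unigrams, bigrams, trigrams, and quadgrams from a short string. It is illustrative only and is not the authors' pipeline: the regular-expression tokenizer, the helper names `word_ngrams` and `ngram_counts`, and the sample sentence are all assumptions introduced here.

```python
import re
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of `text` as a list of tuples.

    Assumption: a simple lowercase regex tokenizer; the paper does not
    specify its tokenization, so this is a stand-in.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Slide a window of width n over the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count how often each word n-gram occurs in `text`."""
    return Counter(word_ngrams(text, n))


# Hypothetical sample text, not taken from the dataset.
sample = "You won't believe what happened next, you won't"

for n, name in [(1, "unigrams"), (2, "bigrams"), (3, "trigrams"), (4, "quadgrams")]:
    print(name, ngram_counts(sample, n).most_common(2))
```

Count vectors of this kind, built per article over a shared vocabulary, are the sort of feature a classifier such as a random forest could be trained on. With n = 2, each feature is a word pair, which is the setting the abstract reports as most effective.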

Keywords

Text Mining, Fake News, Machine Learning, Semantic Features, Natural Language Processing (NLP)