Fake News Detection Using Machine Learning: A Comparative Study of Naive Bayes, Logistic Regression, and Linear Support Vector Machine with TF-IDF Features
by Dr. Rajinder Kumar, Er. Sukhwinder Kaur, Piyush
Published: June 18, 2026 • DOI: 10.51584/IJRIAS.2026.11060027
Abstract
The rapid growth of digital misinformation has created an urgent need for computational tools that can identify misleading news content at scale. This paper presents a comparative study of three supervised machine-learning classifiers (Multinomial Naive Bayes, Logistic Regression, and Linear Support Vector Machine (LinearSVC)) for binary fake-news classification using TF-IDF text features. The experimental analysis is limited to the values reported for a single train/test split and the accompanying dataset description. The cleaned dataset contains 44,898 articles: 23,481 fake-news articles and 21,417 real-news articles. On the reported 80:20 split, LinearSVC achieves the strongest performance, with 99.3% accuracy and approximately 0.99 precision, recall, and F1-score, followed by Logistic Regression at 98.7% accuracy and Multinomial Naive Bayes at 88.5% accuracy. Because very high accuracy on a single dataset may reflect dataset-specific lexical or source patterns rather than generalizable cues, the paper discusses reproducibility, explainability, dataset bias, and the external validation required before real-world deployment.
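To put the reported figures in context, the following is a minimal arithmetic sketch in Python, not the study's code. It uses only the class counts and accuracies stated above. The helper names `summarize` and `approx_errors` are hypothetical, and the test-set size is an assumption obtained by rounding 20% of the corpus, since the exact split sizes are not stated here. It shows the accuracy a trivial majority-class predictor would reach and roughly how many held-out articles each accuracy figure implies were misclassified.

```python
# Class counts and accuracies are taken from the abstract.
# Split sizes and error counts are derived estimates, not reported values.
FAKE, REAL = 23_481, 21_417
ACCURACY = {
    "LinearSVC": 0.993,
    "Logistic Regression": 0.987,
    "Multinomial Naive Bayes": 0.885,
}


def summarize(fake, real, test_fraction=0.20):
    """Return corpus size, majority-class baseline, and assumed split sizes."""
    total = fake + real
    test_size = round(total * test_fraction)  # assumption: 20% rounded to nearest
    return {
        "total": total,
        "majority_baseline": max(fake, real) / total,
        "test_size": test_size,
        "train_size": total - test_size,
    }


def approx_errors(accuracy, test_size):
    """Estimate the number of misclassified test articles implied by an accuracy."""
    return round((1 - accuracy) * test_size)


summary = summarize(FAKE, REAL)
print(f"Total articles:          {summary['total']}")
print(f"Majority-class baseline: {summary['majority_baseline']:.1%}")
print(f"Assumed train/test size: {summary['train_size']} / {summary['test_size']}")
for name, acc in ACCURACY.items():
    errors = approx_errors(acc, summary["test_size"])
    print(f"{name}: about {errors} misclassified test articles")
```

Under these assumptions the classes are close to balanced, so always predicting the majority class would score only about 52%. All three reported accuracies therefore sit well above the trivial baseline, and the open question is whether the gap generalizes beyond this dataset.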