Sentiment Analysis Of Product Reviews To Identify Deceptive Rating Information In E-Commerce Websites
DOI:
https://doi.org/10.64252/je7j4v55
Keywords:
Sentiment Analysis, Fake Reviews, E-Commerce, Review Mismatch, Deceptive Ratings, Machine Learning, TF-IDF, Gradient Boosting, Review Authenticity, Consumer Trust
Abstract
In e-commerce, customer reviews and product ratings play a critical role in influencing consumer purchasing decisions. However, the increasing prevalence of fake or deceptive reviews, in which the sentiment expressed in the text does not align with the assigned star rating, undermines consumer trust and the credibility of online platforms. This study proposes a sentiment analysis-based framework that identifies mismatches between review text sentiment and the corresponding rating in order to detect potentially fake reviews.
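The mismatch idea can be illustrated with a minimal sketch. The names `rating_polarity` and `is_mismatch`, the star thresholds (1 to 2 negative, 4 to 5 positive), and the choice to treat neutral as compatible with anything are assumptions made for illustration; the abstract does not state how the study defines a mismatch.

```python
def rating_polarity(stars: int) -> str:
    """Map a 1-5 star rating to a coarse polarity (thresholds are an assumption)."""
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return "neutral"


def is_mismatch(text_polarity: str, stars: int) -> bool:
    """Flag a review whose text polarity contradicts its star rating."""
    rated = rating_polarity(stars)
    # Assumption: neutral on either side is treated as compatible.
    if "neutral" in (text_polarity, rated):
        return False
    return text_polarity != rated


# Hypothetical (text polarity, stars) pairs, not taken from the study's dataset.
reviews = [
    ("positive", 5),  # consistent
    ("negative", 5),  # text contradicts the rating
    ("positive", 1),  # text contradicts the rating
    ("neutral", 3),   # consistent
]
flags = [is_mismatch(polarity, stars) for polarity, stars in reviews]
print(flags)
```

In the framework described here, the text polarity would come from a trained sentiment model rather than being given directly.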
The methodology pre-processes review data through tokenization, stop-word removal, stemming, and lemmatization. Feature extraction techniques such as TF-IDF, Word2Vec, and One-Hot Encoding are applied to convert the text into numerical, machine-readable representations. Several supervised machine learning algorithms, including Random Forest, Logistic Regression, Naïve Bayes, and Gradient Boosting, are trained and evaluated using accuracy, precision, recall, F1-score, and ROC-AUC.
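As a rough illustration of the pre-processing and TF-IDF steps, the sketch below tokenizes a few reviews, removes stop words, and computes unsmoothed TF-IDF weights (term frequency times log(N / document frequency)). The stop-word list, the helper names `tokenize` and `tf_idf`, and the sample reviews are assumptions for illustration. Stemming, lemmatization, and the classifiers are omitted, and libraries often use a smoothed IDF variant, so exact weights may differ from those in the study.

```python
from __future__ import annotations

import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "is", "it", "and", "this", "was", "i"}


def tokenize(text: str) -> list[str]:
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical reviews, not taken from the study's dataset.
reviews = [
    "This phone is great and the battery is great",
    "The battery was terrible",
    "Great value",
]
vectors = tf_idf([tokenize(r) for r in reviews])
for review, vec in zip(reviews, vectors):
    print(review, "->", {t: round(w, 3) for t, w in vec.items()})
```

Terms that appear in every document receive a weight of zero under this variant, which is why rarer words such as "terrible" dominate the vector of the review that contains them.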
Experimental results on a dataset of more than 71,000 reviews show that Gradient Boosting outperforms the other models, achieving the highest accuracy and AUC score in detecting mismatched (potentially deceptive) reviews. The approach enhances the reliability of product ratings, helps customers make informed decisions, and helps e-commerce platforms mitigate the spread of review manipulation.
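For reference, the evaluation metrics named above can be computed from predictions as in the sketch below. The labels and scores are invented toy values and are not the study's results; the function names and the 0.5 decision threshold are likewise assumptions. ROC-AUC is computed here as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = mismatched)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """Pairwise ROC-AUC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC is quadratic in the number of examples, which is fine for a small illustration but would normally be replaced by a rank-based computation on a dataset of this size.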