Rank Based Iterative Relief Technique For Spam Mail Recognition
DOI:
https://doi.org/10.64252/n01xgz62
Keywords:
Feature Selection, Relief, ReliefA-D, Iterative Relief
Abstract
Spam emails are unsolicited bulk messages often used for malicious purposes, such as phishing and identity theft. They clutter inboxes, consume bandwidth, and strain resources. Detecting spam is challenging since it frequently shares characteristics with legitimate emails, and irrelevant features can hinder classification accuracy. Feature selection methods eliminate irrelevant features by identifying the most relevant ones. This leads to improved performance, minimized overfitting, reduced dimensionality, and enhanced interpretability. The original Relief algorithm is a feature selection technique that allocates weights to features according to their effectiveness in differentiating instances of distinct classes. Nonetheless, it has drawbacks, including sensitivity to noisy data and challenges in managing redundant features. Relief variants (A–D) improve weight updates and feature dependency management but still struggle with outliers. To enhance spam detection, an iterative version of the Relief algorithm is integrated with the Random Forest classifier. This approach removes irrelevant features and improves efficiency. Evaluation results indicate that this method outperforms the original Relief algorithm and its variants (Relief A–D).
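To make the weighting idea concrete, the following is a minimal sketch of the original two-class Relief weight update, assuming numeric features scaled by their range and a Manhattan distance for the nearest-neighbour search. The function names `relief_weights` and `iterative_feature_pruning`, their parameters, and the pruning loop itself are illustrative assumptions; the abstract does not specify the iterative procedure, so the loop shown here (drop the lowest-ranked feature, then re-score) should not be read as the method evaluated in the paper. A downstream classifier such as Random Forest would be trained on the retained features and is omitted here.

```python
import random


def relief_weights(X, y, n_iterations=100, seed=0):
    """Sketch of two-class Relief: returns one weight per feature.

    X is a list of numeric feature rows, y a list of class labels.
    A feature gains weight when it separates a sample from its nearest
    miss (other class) and loses weight when it separates the sample
    from its nearest hit (same class).
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])

    # Per-feature range, used to scale differences into [0, 1].
    span = []
    for j in range(n_features):
        column = [row[j] for row in X]
        span.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def distance(a, b):
        # Assumption: Manhattan distance over the scaled features.
        return sum(diff(a, b, j) for j in range(n_features))

    weights = [0.0] * n_features
    for _ in range(n_iterations):
        i = rng.randrange(n_samples)
        hits = [k for k in range(n_samples) if k != i and y[k] == y[i]]
        misses = [k for k in range(n_samples) if y[k] != y[i]]
        if not hits or not misses:
            continue
        near_hit = X[min(hits, key=lambda k: distance(X[i], X[k]))]
        near_miss = X[min(misses, key=lambda k: distance(X[i], X[k]))]
        for j in range(n_features):
            weights[j] += (diff(X[i], near_miss, j)
                           - diff(X[i], near_hit, j)) / n_iterations
    return weights


def iterative_feature_pruning(X, y, keep, **relief_kwargs):
    """Hypothetical rank-based loop (an assumption, not the paper's method):
    re-run Relief on the surviving features and drop the lowest-ranked one
    until only `keep` features remain. Returns the retained column indices.
    """
    active = list(range(len(X[0])))
    while len(active) > keep:
        subset = [[row[j] for j in active] for row in X]
        w = relief_weights(subset, y, **relief_kwargs)
        worst = min(range(len(active)), key=lambda k: w[k])
        del active[worst]
    return active
```

On a toy data set in which the first column tracks the class label and the second does not, `relief_weights` assigns the first column the larger weight, and `iterative_feature_pruning(X, y, keep=1)` retains only that column.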