Detecting Phishing Attacks Through URL Feature Analysis And Ensemble Learning

Authors

  • Anusha Prakash Hippargi, Aruna M, Adarshana S, Priyanka D L, Ashish Kumar Verma, Samitha Khaiyum, Raksha Kodnad R

DOI:

https://doi.org/10.64252/fq5exe91

Abstract

Phishing is a common cybercrime technique that tricks users into exposing sensitive data by imitating legitimate websites. Because attackers continuously evolve their methods, traditional rule-based detection cannot keep up. To address this, this research explores the use of machine learning algorithms to detect phishing URLs. The study uses a dataset of 11,054 URLs, each described by 30 derived features that indicate whether a website is genuine or phishing. Several machine learning models were trained and evaluated: Logistic Regression, k-Nearest Neighbors, Support Vector Machines, Decision Trees, and ensemble methods such as Random Forest, Gradient Boosting, and CatBoost. Data preprocessing, exploratory data analysis, and feature correlation analysis were applied to improve model performance. Models were compared on accuracy, precision, recall, and F1-score. Ensemble models such as Random Forest and Gradient Boosting achieved the highest accuracy (greater than 97%) in phishing URL classification. These results indicate that a machine learning approach built on behavioral and structural URL features can serve as an effective standalone phishing detector and strengthen cybersecurity defenses.
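
The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions for binary classification and is not the authors' evaluation code. The function name `evaluate`, the `Metrics` container, the convention that label 1 means phishing and 0 means genuine, and the toy labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption: `positive` (default 1) marks the phishing class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with phishing as the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions
    # or no positive samples), so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
result = evaluate(y_true, y_pred)
print(result)
```

In phishing detection, recall reflects how many phishing URLs are caught, while precision reflects how many flagged URLs are truly phishing. Reporting both alongside accuracy guards against a model that looks accurate only because one class dominates the data.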

Published

2025-09-08

Issue

Section

Articles

How to Cite

Detecting Phishing Attacks Through URL Feature Analysis And Ensemble Learning. (2025). International Journal of Environmental Sciences, 1835-1841. https://doi.org/10.64252/fq5exe91