Distinguishing AI-Generated and Human-Crafted Phishing Emails: A Multi-Modal Machine Learning Approach with Adversarial Robustness Assessment

Arnemie B. Gayyed; Natividad B. Concepcion

doi:10.64252/p4ytaj44

Keywords:

AI-generated phishing email, cybersecurity, multi-modal features, adversarial robustness

Abstract

AI-generated phishing emails pose a significant and evolving cybersecurity threat, rendering conventional detection techniques increasingly insufficient. This challenge reveals a critical research gap: the absence of a comprehensive model capable of accurately identifying sophisticated, AI-powered deceptive tactics. This study proposes a novel multi-modal machine learning framework that classifies phishing emails by origin, distinguishing AI-generated from human-crafted messages. The framework integrates three feature modalities: textual patterns, content elements such as HTML and embedded links, and metadata such as sender authentication. The researchers assembled a balanced dataset comprising 2,780 human-generated emails from the Nazario corpus and an equal number of emails generated with GPT-4. Results demonstrate the high effectiveness of all three feature modalities. Multi-modal models achieved classification accuracy of up to 100%. Metadata alone proved exceptionally powerful, yielding near-perfect detection with just 10 features. AI-generated emails differed in measurable ways, such as more punctuation, fewer images, and simpler HTML, while human-crafted emails featured longer URLs and more interactive elements. Despite excellent performance under normal conditions, models such as Logistic Regression proved highly vulnerable to adversarial attacks, with accuracy dropping from 100% to 6% at ϵ = 0.5. In summary, this research contributes empirical insights by establishing a multi-modal detection framework and critically examining its resilience against adversarial manipulation. These findings underscore the urgent need for smarter, multi-layered cybersecurity defenses to proactively counter escalating AI-driven threats.
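
To make the robustness finding concrete, the following is a minimal sketch of how an ϵ-bounded perturbation can degrade a logistic regression classifier. It rests on several assumptions that are not taken from the abstract: the attack is assumed to be an FGSM-style sign-gradient perturbation (the abstract does not name the attack), the data are synthetic Gaussian features rather than the Nazario or GPT-4 emails, and every function name (`train_logreg`, `fgsm`, `run_demo`) is hypothetical. It illustrates the mechanism only and does not reproduce the reported figures.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Probability that feature vector x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit logistic regression with plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, y):
            err = predict_proba(w, b, x) - label  # d(log-loss)/d(logit)
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, label, eps):
    """Shift each feature by eps in the direction that increases the loss.

    For logistic regression the input gradient of the log-loss is (p - y) * w.
    """
    err = predict_proba(w, b, x) - label
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]


def accuracy(w, b, X, y):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((predict_proba(w, b, x) >= 0.5) == bool(label)
               for x, label in zip(X, y))
    return hits / len(X)


def run_demo(eps=0.5, seed=0, n_per_class=200, n_features=10):
    """Return (clean accuracy, adversarial accuracy) on synthetic data.

    Assumption: two Gaussian classes stand in for "human" (0) and "AI" (1);
    accuracy is measured on the training set to keep the sketch short.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, -0.2), (1, 0.2)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 0.1) for _ in range(n_features)])
            y.append(label)
    w, b = train_logreg(X, y)
    X_adv = [fgsm(w, b, x, label, eps) for x, label in zip(X, y)]
    return accuracy(w, b, X, y), accuracy(w, b, X_adv, y)


if __name__ == "__main__":
    clean, adv = run_demo(eps=0.5)
    print(f"clean accuracy: {clean:.2f}, accuracy under attack: {adv:.2f}")
```

Because a linear model's decision depends on a single weighted sum, a perturbation that moves every feature by ϵ against the true class shifts that sum by ϵ times the total absolute weight. This is one plausible reason a linear classifier can fall from near-perfect accuracy to far below chance once ϵ exceeds the typical per-feature margin.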
