Smart Phishing Detection on Webpages Using Multi-Agent Deep Learning and Multi-Dimensional Features
Subject Areas : International Journal of Smart Electrical Engineering
Ziba Jafari 1, S. Hamid Ghafoori 2*, Mohammad Ahmadinia 3
1 - Department of Computer Engineering, Kerman branch, Islamic Azad University, Kerman, Iran
2 - Department of Computer Engineering, Kerman branch, Islamic Azad University, Kerman, Iran
3 - Department of Computer Engineering, Kerman branch, Islamic Azad University, Kerman, Iran
Keywords: Deep Learning, Phishing, Representation Learning, Multi-Agent Deep Reinforcement Learning (MADRL)
Abstract :
Phishing attacks are growing more sophisticated and harder to detect, as attackers use fraudulent websites to trick users into revealing sensitive information. Traditional detection methods struggle to keep pace with evolving phishing techniques, which calls for more adaptive approaches. This study introduces a multi-agent deep learning framework in which three specialized models each analyze a different aspect of a webpage: the URL, the page content, and the Document Object Model (DOM) structure. The outputs of these models are combined using a highest-confidence-score mechanism to improve accuracy. Experimental results show that the method outperforms existing techniques, achieving 99.21% accuracy with a false positive rate of only 0.22%. It detects both known and previously unseen phishing sites, making it robust against emerging threats. The approach also highlights the potential of deep reinforcement learning in cybersecurity, paving the way for more automated and resilient defenses against phishing attacks.
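The highest-confidence-score combination described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how confidence is computed, so this sketch assumes each of the three models outputs a phishing probability and treats the distance of that probability from 0.5 as its confidence. The names `AgentOutput`, `confidence`, and `fuse_highest_confidence` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AgentOutput:
    name: str          # which view of the page the agent inspects (assumed labels)
    p_phishing: float  # agent's estimated probability that the page is phishing


def confidence(p: float) -> float:
    """Assumed confidence measure: distance from the undecided point 0.5, scaled to [0, 1]."""
    return abs(p - 0.5) * 2.0


def fuse_highest_confidence(outputs: List[AgentOutput]) -> Tuple[str, str, float]:
    """Return (label, deciding agent, confidence) from the most confident agent."""
    if not outputs:
        raise ValueError("at least one agent output is required")
    # Pick the agent whose prediction lies farthest from 0.5.
    best = max(outputs, key=lambda o: confidence(o.p_phishing))
    label = "phishing" if best.p_phishing >= 0.5 else "legitimate"
    return label, best.name, confidence(best.p_phishing)


# Illustrative probabilities only; these are not results from the study.
example = [
    AgentOutput("url", 0.62),
    AgentOutput("content", 0.08),
    AgentOutput("dom", 0.55),
]
label, decided_by, conf = fuse_highest_confidence(example)
print(label, decided_by)
```

In this made-up example the URL and DOM agents lean weakly toward phishing, but the content agent is far more certain that the page is legitimate, so its verdict is the one returned. A deployed system might instead weight or calibrate the agents' scores; this sketch shows only the simplest reading of the mechanism.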