PHISHING WEBSITE DETECTION USING MACHINE LEARNING
Keywords:
Phishing Detection, Machine Learning, Deep Learning, Ensemble Methods, Artificial Intelligence, Email Filtering, Web Application Security
Abstract
Criminals seeking sensitive information construct illegal clones of real websites and e-mail accounts. The e-mails carry the logos and slogans of real firms. When a user clicks on a link provided by these attackers, the attackers gain access to the user's private information, including bank account details, login passwords, and images. Present systems rely heavily on the Random Forest and Decision Tree algorithms, and their accuracy needs to be improved. The existing models have low latency. Existing systems do not have a dedicated user interface, and they do not compare different algorithms. Consumers who open these e-mails or follow the links in them are led to a fake website that appears to belong to the authentic company. The models are used to detect phishing websites based on significant URL features, and to find and implement the optimal machine learning model. The machine learning methods compared are Logistic Regression, Multinomial Naive Bayes, and XGBoost. Logistic Regression outperforms the other two. The goal of such attacks is to get as many people as possible to click on a link or open an infected file. There are various approaches to detecting this type of attack, and machine learning is one of them. The URLs received by the user are given as input to the machine learning model; the algorithm then processes the input and reports whether each URL is phishing or legitimate. Various ML algorithms, such as SVM, Neural Networks, Random Forest, Decision Tree, and XGBoost, can be used to classify these URLs. The proposed approach deals with the Random Forest and Decision Tree classifiers. The proposed approach effectively classified the phishing and legitimate URLs with an accuracy of 87.0% and 82.4% for Random
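The abstract does not list the URL features the models use. As a minimal sketch of what URL-based feature extraction can look like, the following Python function computes a few common lexical features from the URL string alone. The function name extract_url_features and the chosen features are illustrative assumptions, not the feature set of the paper.

```python
import re
from urllib.parse import urlparse

# Hypothetical feature set: these lexical features are illustrative
# assumptions, not the features used in the paper.
IPV4_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Return simple lexical features computed from the URL string alone."""
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname excludes any user-info before '@'
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": int("@" in url),
        "host_is_ip": int(bool(IPV4_HOST.match(host))),
        "uses_https": int(parsed.scheme == "https"),
    }


if __name__ == "__main__":
    # Example URLs are made up for illustration.
    for example in (
        "https://www.example.com/login",
        "http://paypal.com@evil-site.example/",
    ):
        print(example, extract_url_features(example))
```

A numeric vector built from such features could then be passed to any of the classifiers named above; which features are actually informative would have to be established on a labelled dataset.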
Copyright (c) 2025 International Journal for Research Publication and Seminar

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.