Abstract: Phishing attacks and online identity theft pose a significant threat to cybersecurity. Phishing is a form of social engineering in which attackers use fraudulent websites to deceive victims into disclosing personal data, either to acquire further information or for financial gain. Given the rapid evolution of technology and phishing tactics, coupled with the increasingly frequent exchange of information online, effective methods for detecting fraudulent URLs are essential. The objective of this study was to evaluate the effectiveness of various machine learning and deep learning models in classifying web addresses as malicious or legitimate without analyzing page content. Experimental results demonstrate that convolutional neural networks (CNNs) can achieve an accuracy of up to 98.7%, while ensemble models such as Random Forest and XGBoost exceed 96% accuracy; both significantly outperform traditional approaches such as logistic regression. As phishing strategies continue to evolve, adaptive models such as ensemble learning techniques and deep learning architectures will be pivotal for safeguarding online security and for mitigating emerging cyber threats.
Keywords: social engineering, ensemble models, cyber attacks, URL classification, SMOTE
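
To illustrate what classifying a web address "without analyzing page content" can involve, the following minimal sketch derives a few lexical features from the URL string alone. The function name `url_features` and the particular features chosen are assumptions made for illustration; the abstract does not specify the study's actual feature set.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract simple lexical features from a URL string.

    Hypothetical feature set for illustration only; it is not
    taken from the study. No page content is fetched or parsed.
    """
    # Add a scheme when missing so urlparse can locate the host.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(c.isdigit() for c in url),
        # An '@' can hide the real host behind a familiar-looking prefix.
        "has_at_symbol": "@" in url,
        # Host given as a raw IPv4 address instead of a domain name.
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
        # Number of non-empty path segments.
        "path_depth": len([p for p in parsed.path.split("/") if p]),
    }
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to any of the classifiers named above, for example logistic regression or Random Forest.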



