Indexing metadata

Comparison of Logistic Regression and Random Forest using Correlation-based Feature Selection for Phishing Website Detection


 
Dublin Core PKP Metadata Items Metadata for this Document
 
1. Title Title of document Comparison of Logistic Regression and Random Forest using Correlation-based Feature Selection for Phishing Website Detection
 
2. Creator Author's name, affiliation, country Farida Farida
 
2. Creator Author's name, affiliation, country Ali Mustopa; Universitas AMIKOM Yogyakarta
 
3. Subject Discipline(s) Computer Science
 
3. Subject Keyword(s)
 
4. Description Abstract

Information technology is developing rapidly worldwide, especially during the current pandemic, which requires people to study and even work online. This shift has triggered a rise in online crime. One example is the theft of internet users' data through a fake website built to resemble the original, known as a phishing website. To counter the rise of phishing websites, this research builds a classification model for detecting them and compares two classification algorithms, logistic regression and random forest, to identify the better performer. Classification performance can be improved with the correlation-based feature selection (CFS) method, which selects the attributes most influential in detecting phishing websites. In testing, logistic regression and random forest achieved accuracies of 93.035% and 96.834%, respectively. After feature selection with CFS, the accuracies were 92.718% and 97.015%, respectively. Random forest accuracy thus increased by 0.181 percentage points, while logistic regression accuracy decreased slightly, by 0.317 percentage points. The results indicate that feature selection with CFS can eliminate redundant attributes while keeping accuracy close to that obtained with the complete attribute set, and that random forest is more accurate after CFS is applied.

Keywords: website phishing, classification, logistic regression, random forest, correlation-based feature selection
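
The abstract's claim that CFS eliminates redundant attributes can be illustrated with a small sketch. It is not the authors' implementation: this record does not state the tooling or the correlation measure used, so the sketch assumes absolute Pearson correlation and a greedy forward search over the standard CFS merit, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. The feature names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return abs(cov / sqrt(var_x * var_y))


def cfs_merit(columns, target, subset):
    """CFS merit: rewards correlation with the class, penalizes redundancy."""
    k = len(subset)
    r_cf = mean(pearson(columns[f], target) for f in subset)
    r_ff = 0.0
    if k > 1:
        r_ff = mean(pearson(columns[a], columns[b])
                    for a, b in combinations(subset, 2))
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target):
    """Greedy forward search: add a feature only if it raises the merit."""
    selected, best = [], 0.0
    remaining = set(columns)
    while remaining:
        best_name, best_merit = None, best
        for name in sorted(remaining):  # ties keep the first name in sorted order
            merit = cfs_merit(columns, target, selected + [name])
            if merit > best_merit:
                best_name, best_merit = name, merit
        if best_name is None:
            break  # no remaining feature improves the merit
        selected.append(best_name)
        remaining.discard(best_name)
        best = best_merit
    return selected, best


# Hypothetical toy data: 1 = phishing, 0 = legitimate.
target = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "has_ip_address":      [1, 1, 1, 0, 0, 0, 0, 1],
    "has_ip_address_copy": [1, 1, 1, 0, 0, 0, 0, 1],  # exact duplicate (redundant)
    "has_at_symbol":       [1, 1, 0, 1, 0, 0, 1, 0],
}

selected, merit = cfs_forward_select(columns, target)
print(selected, round(merit, 4))
```

In this toy set the duplicated column is left out, because adding it raises the average feature-feature correlation more than it helps the average feature-class correlation. A real comparison would then train both classifiers on the full and the reduced attribute sets and compare their accuracies.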

 
5. Publisher Organizing agency, location Program Studi Sistem Informasi Fakultas Teknik dan Ilmu Komputer
 
6. Contributor Sponsor(s) Universitas AMIKOM Yogyakarta
 
7. Date (YYYY-MM-DD) 2023-01-31
 
8. Type Status & genre Peer-reviewed Article
 
8. Type Type
 
9. Format File format PDF
 
10. Identifier Uniform Resource Identifier https://sistemasi.org/index.php/stmsi/article/view/1832
 
10. Identifier Digital Object Identifier (DOI) https://doi.org/10.32520/stmsi.v12i1.1832
 
11. Source Title; vol., no. (year) SISTEMASI; Vol 12, No 1 (2023): Sistemasi: Jurnal Sistem Informasi
 
12. Language English=en id
 
13. Relation Supp. Files
 
14. Coverage Geo-spatial location, chronological period, research sample (gender, age, etc.)
 
15. Rights Copyright and permissions Copyright (c) 2023 Sistemasi:Jurnal Sistem Informasi