ICMNS 2023

Enhanced Detection of Online Loan Fraud using Cost-Sensitive Weighted Random Forest with Recursive Feature Elimination and Cross Validation
Karina Agustina, Kartika Fithriasari, Dedy Dwi Prastyo

Department of Statistics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia
karinagustina123[at]gmail.com
kartika_f[at]statistika.its.ac.id
dedy-dp[at]statistika.its.ac.id


Abstract

Online lending is an innovation that combines the credit distribution system with digital technology so that loans can be accessed easily, quickly, and efficiently. This ease of access not only drives year-on-year growth in online loan applications but also increases the risk of fraudulent transactions. A fraud detection system is therefore essential for credit institutions to minimize potential losses. This study compares the Random Forest and Cost-Sensitive Weighted Random Forest models for handling the class imbalance problem in online loan fraud data. Cost-Sensitive Weighted Random Forest is an extension of the Random Forest model that uses a cost function based on the misclassification rate of instances in both the majority and minority classes to improve the predictive ability of each tree and the overall performance of the ensemble. Each tree is weighted according to its error: trees with lower error rates receive higher weights. This cost-driven learning scheme places more emphasis on learning the minority class instances. In addition, Recursive Feature Elimination with Cross-Validation is used to remove unimportant features that can bias the classification results and to speed up data processing. The proposed methods are tested on real online loan application datasets obtained from a private bank. The results show that information about the device used to submit a loan application has a considerable influence on whether the application is classified as fraudulent. The findings also show that Cost-Sensitive Weighted Random Forest outperforms Random Forest, achieving higher accuracy, F1 score, and AUC-ROC.

Keywords: Cost-Sensitive Weighted Random Forest, Imbalanced Dataset, Fraud Detection

Topic: MATHEMATICS AND STATISTICS
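
The tree-weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact cost function or weighting formula, so everything specific here is an assumption made for illustration: the cost values (cost_fn, cost_fp), the weight rule (one minus the cost-weighted error, normalized to sum to one), the 0.5 voting threshold, and all function names. Class 1 stands for fraud (the minority class) and class 0 for legitimate applications.

```python
from typing import List, Sequence


def cost_weighted_error(y_true: Sequence[int], y_pred: Sequence[int],
                        cost_fn: float = 5.0, cost_fp: float = 1.0) -> float:
    """Cost-weighted misclassification rate in [0, 1].

    cost_fn is the assumed cost of missing a fraud case (true class 1),
    cost_fp the assumed cost of flagging a legitimate case (true class 0).
    """
    incurred = 0.0  # cost of the mistakes the tree actually made
    worst = 0.0     # cost if every instance had been misclassified
    for truth, pred in zip(y_true, y_pred):
        cost = cost_fn if truth == 1 else cost_fp
        worst += cost
        if truth != pred:
            incurred += cost
    return incurred / worst if worst else 0.0


def tree_weights(y_true: Sequence[int], per_tree_preds: Sequence[Sequence[int]],
                 cost_fn: float = 5.0, cost_fp: float = 1.0) -> List[float]:
    """Give each tree a weight that grows as its cost-weighted error shrinks."""
    raw = [1.0 - cost_weighted_error(y_true, preds, cost_fn, cost_fp)
           for preds in per_tree_preds]
    total = sum(raw)
    if total == 0.0:
        # Every tree was maximally wrong: fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def weighted_vote(per_tree_preds: Sequence[Sequence[int]],
                  weights: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Combine 0/1 tree predictions into ensemble labels by weighted voting."""
    n_samples = len(per_tree_preds[0])
    labels = []
    for i in range(n_samples):
        # Total weight of the trees that vote "fraud" for sample i.
        score = sum(w * preds[i] for w, preds in zip(weights, per_tree_preds))
        labels.append(1 if score >= threshold else 0)
    return labels


# Held-out labels (assumed to be available) and the predictions of three trees.
y_val = [0, 0, 0, 0, 1, 1]
val_preds = [
    [0, 0, 0, 0, 1, 1],  # tree A: no errors
    [0, 0, 0, 0, 0, 0],  # tree B: misses every fraud case
    [1, 0, 0, 0, 1, 0],  # tree C: one false alarm, one missed fraud
]
weights = tree_weights(y_val, val_preds)

# Predictions of the same three trees on two new applications.
new_preds = [[1, 0], [0, 0], [0, 1]]
ensemble_labels = weighted_vote(new_preds, weights)
```

Because missed fraud cases carry the larger assumed cost, tree B ends up with the smallest weight even though it makes only two mistakes, which is how a cost-driven scheme shifts emphasis toward the minority class. In practice the per-tree predictions would come from a fitted forest, and the costs would be chosen to reflect the actual losses involved.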

Corresponding Author: Karina Agustina
