Logistic Regression Approach with Rare Event Weighted Logistic Regression and Majority Weighted Minority Oversampling Technique for Imbalanced Data (Case Study: Child Participation in Economic Activities in Southeast Sulawesi)
Regina Hayden Sagita (a*), Ismaini Zain (a), Vita Ratnasari (a)
a) Statistics Department, Institut Teknologi Sepuluh Nopember
Jl. Arief Rahman Hakim, Surabaya 60111 Indonesia
*reginahaydensagita[at]gmail.com
Abstract
Indonesia is one of the countries with a fairly high number of child workers. The Central Bureau of Statistics recorded 1.17 million people aged 10-17 years working in the country in 2020, with a steadily increasing percentage. The highest rate of child labor in Indonesia was recorded in the Southeast Sulawesi region. According to Susenas data, 4.89 percent of children aged 10-14 years were engaged in child labor. Child labor is closely related to poverty, meaning that family environmental factors and education level are determinants of children's participation in economic activities. Classification is the process of finding a model that can be used to predict the class of an object or observation. A common problem in classification is imbalanced data: in binary (two-class) classification, one class has a larger number of samples than the other. This study therefore applies the Rare Event Weighted Logistic Regression (RE-WLR) and Majority Weighted Minority Oversampling Technique (MWMOTE) methods. The proportion of children aged 10-14 years who worked in 2019 places the case of children's participation in the economy in the imbalanced data category. This study compares the RE-WLR and MWMOTE methods and examines how the socio-economic conditions of families affect the status of child labor in the family, in order to identify the factors that influence the level of child labor in Southeast Sulawesi, so that the influential indicators can be addressed to reduce the number of child workers. The results show that the factors that influence child labor are the age of the child, the sex of the child, the schooling status of the child, the age of the head of the household, the gender of the head of the household, the location of residence, the business field of the head of the household, and the employment status of the head of the household.