Application of Data Mining for Classifying Poor Residents in Labuhanbatu Regency Using Random Forest and K-Nearest Neighbors

  • Andi Ernawati * Universitas Pembangunan Panca Budi, Indonesia
  • Khairul Universitas Pembangunan Panca Budi, Indonesia
  • Zulham Sitorus Universitas Pembangunan Panca Budi, Indonesia
  • Muhammad Iqbal Universitas Pembangunan Panca Budi, Indonesia
  • Darmeli Nasution Universitas Pembangunan Panca Budi, Indonesia
Keywords: K-Nearest Neighbors; Random Forest; Data Mining; Poverty; Classification

Abstract

This study applies two data mining algorithms, Random Forest (RF) and K-Nearest Neighbors (KNN), to classify the poverty status of residents of Labuhanbatu Regency and compares their performance. The dataset covers occupation, income, housing, and education for 21,137 individuals. After preprocessing, model training, and hyperparameter optimization, both models were evaluated on five metrics: accuracy, precision, recall, F1-score, and AUC. Random Forest performed slightly better than KNN, with an accuracy of 0.6023, precision of 0.4827, recall of 0.4177, F1-score of 0.4479, and AUC of 0.5681. KNN obtained an accuracy of 0.5990, precision of 0.4771, recall of 0.4006, F1-score of 0.4355, and AUC of 0.5622. On this dataset, Random Forest is therefore the more effective of the two for poverty classification, although the margin is small on every metric (at most about 0.017).
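As a rough illustration of how the reported metrics relate to one another, the sketch below computes accuracy, precision, recall, and F1 from confusion-matrix counts and then recomputes F1 from the precision and recall values quoted in the abstract. It is not the authors' code: the function names `f1_from_pr` and `classification_metrics` are hypothetical, and AUC is left out because it needs per-sample scores that the abstract does not give.

```python
def f1_from_pr(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with the positive class assumed here to be "poor".
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_pr(precision, recall),
    }


# Precision, recall and F1 exactly as reported in the abstract.
reported = {
    "Random Forest": {"precision": 0.4827, "recall": 0.4177, "f1": 0.4479},
    "KNN": {"precision": 0.4771, "recall": 0.4006, "f1": 0.4355},
}

# Recompute F1 from the reported precision and recall, rounded to 4 decimals.
recomputed_f1 = {
    name: round(f1_from_pr(m["precision"], m["recall"]), 4)
    for name, m in reported.items()
}

for name, m in reported.items():
    print(f"{name}: reported F1 = {m['f1']:.4f}, recomputed F1 = {recomputed_f1[name]:.4f}")
```

For both models the recomputed F1 matches the reported value to four decimal places, so the precision, recall, and F1 figures in the abstract are internally consistent.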


Published
2025-06-03
How to Cite
Ernawati, A., Khairul, Sitorus, Z., Iqbal, M., & Nasution, D. (2025). Penerapan Data Mining Untuk Klasifikasi Penduduk Miskin Di Kabupaten Labuhanbatu Menggunakan Random Forest Dan K-Nearest Neighbors. Bulletin of Information Technology (BIT), 6(2), 23–35. https://doi.org/10.47065/bit.v6i1.1783
Section
Articles