Modeling Rental Costs in International Education Data Using a Machine Learning Approach and CRISP-DM

  • Arif Nababan*, Indonesia
  • Nauli, Politeknik Negeri Medan, Indonesia
  • Fauziah Rahmadhani, Indonesia
Keywords: Machine Learning; Rent_USD; Decision Tree; Random Forest; XGBoost; CRISP-DM

Abstract

Advances in machine learning have encouraged its application to complex educational data. In international education, housing rent (Rent_USD) is a major cost-of-living component that varies significantly across regions. This variation is influenced by geography, local economic conditions, and the educational environment, so it calls for systematic data modeling. This study models Rent_USD using the CRISP-DM framework, covering the Business Understanding, Data Understanding, Data Preparation, Modeling, and Evaluation phases. Three algorithms were employed: Decision Tree as the baseline, Random Forest as a comparison, and XGBoost as the primary model. Hyperparameters were tuned with GridSearchCV. Models were evaluated using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²). XGBoost performed best, achieving the lowest RMSE of 93.27 USD and an R² of 0.96, ahead of Random Forest (RMSE: 114.87, R²: 0.94) and Decision Tree (RMSE: 157.16, R²: 0.89). Feature importance analysis further showed that the Living Cost Index and Tuition Fee are the dominant factors in Rent_USD variation, contributing 58.32% and 32.94% respectively. This research provides an empirical overview of machine learning applied to modeling international education costs and can serve as a reference for future studies on educational data management and predictive analytics in global student mobility.
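The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function name `regression_metrics` and the sample rent values below are hypothetical, chosen only for illustration; they are not the study's code or data, and the study itself may have relied on library implementations instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired lists of actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute deviation, in the same unit as the target (USD).
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the mean squared error; penalizes large misses more.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: share of the target's variance explained by the predictions.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    r2 = 1.0 - ss_res / ss_tot

    return mae, rmse, r2


# Hypothetical monthly rents in USD, made up for illustration only.
actual = [500.0, 800.0, 1200.0, 1500.0]
predicted = [550.0, 750.0, 1150.0, 1600.0]

mae, rmse, r2 = regression_metrics(actual, predicted)
print(f"MAE={mae:.2f} USD, RMSE={rmse:.2f} USD, R2={r2:.4f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is why a single large miss moves RMSE more than MAE.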

Published
2026-03-26
How to Cite
Nababan, A., Lumban Gaol, R., & Rahmadhani, F. (2026). Pemodelan Biaya Sewa pada Data Pendidikan Internasional Menggunakan Pendekatan Machine Learning dan CRISP-DM. Bulletin of Information Technology (BIT), 7(1), 22–30. https://doi.org/10.47065/bit.v7i1.2557
Section
Articles