Modeling Rental Costs in International Education Data Using a Machine Learning Approach and CRISP-DM
Abstract
Advances in machine learning have encouraged its application to the analysis of complex educational data. In international education, housing rent (Rent_USD) is a critical component of the cost of living and varies significantly across regions. This variation is influenced by geography, local economic conditions, and the educational environment, so it calls for systematic data modeling. This study models Rent_USD following the CRISP-DM framework through its Business Understanding, Data Understanding, Data Preparation, Modeling, and Evaluation phases. Three algorithms were used: Decision Tree as the baseline, Random Forest as a comparison, and XGBoost as the primary model. Hyperparameters were tuned with GridSearchCV to improve performance. Models were evaluated using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²). XGBoost performed best, achieving the lowest RMSE of 93.27 USD and an R² of 0.96, ahead of Random Forest (RMSE: 114.87, R²: 0.94) and Decision Tree (RMSE: 157.16, R²: 0.89). Feature importance analysis further showed that the Living Cost Index and Tuition Fee are the dominant factors in Rent_USD variation, contributing 58.32% and 32.94%, respectively. This research provides an empirical overview of machine learning applications in modeling international education costs and can serve as a reference for future studies on educational data management and predictive analytics in global student mobility.
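The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names and the toy rent values below are hypothetical and chosen only for illustration; they are not the study's data or code, and the printed scores are unrelated to the reported results.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly rents in USD (illustrative values only).
actual_rent = [500.0, 800.0, 1200.0, 1500.0]
predicted_rent = [550.0, 750.0, 1300.0, 1400.0]

print(f"MAE:  {mae(actual_rent, predicted_rent):.2f} USD")
print(f"RMSE: {rmse(actual_rent, predicted_rent):.2f} USD")
print(f"R2:   {r2(actual_rent, predicted_rent):.4f}")
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is one reason both are commonly reported together. MAE and RMSE are in the same unit as the target (USD here), while R² is unitless.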
Copyright (c) 2026 Arif Nababan, Rezeki Lumban Gaol, Fauziah Rahmadhani

This work is licensed under a Creative Commons Attribution 4.0 International License.




