Comparison of SVM and IndoBERT for Sentiment Analysis of Student Academic Services

Muhammad Ibnu Sa'ad; Lailil Muflikhah; Fitra Abdurrachman Bachtiar

doi:10.47065/bit.v7i2.2913

Muhammad Ibnu Sa'ad*, STMIK Widya Cipta Dharma, Indonesia
Lailil Muflikhah, Universitas Brawijaya, Indonesia
Fitra Abdurrachman Bachtiar, Universitas Brawijaya, Indonesia

Keywords: Natural Language Processing, Sentiment Analysis, IndoBERT, Support Vector Machine, Academic Services

Abstract

Digital transformation in higher education has generated a growing volume of textual data, including student comments, academic service evaluations, and feedback on academic information systems. These data can inform decision-making, but their unstructured, context-dependent nature makes manual analysis inefficient. This study compares a TF-IDF-based Support Vector Machine (SVM) model with a Transformer-based IndoBERT model for sentiment analysis of student feedback on academic services. The dataset consists of 1,700 text entries combining template-based synthetic data with real-world data collected from social media, each labeled as positive, negative, or neutral. The research process involved exploratory data analysis, text preprocessing, feature extraction, model development, and evaluation using accuracy, precision, recall, and F1-score. Both models achieved 100% accuracy on the test set, indicating that both traditional machine learning and Transformer-based approaches can identify the sentiment patterns present in this dataset. These results should nevertheless be interpreted cautiously, because the relatively homogeneous dataset and the inclusion of synthetic data may limit the models’ generalizability. The main contributions are a comparative evaluation of SVM and IndoBERT in the context of higher education academic services and a sentiment analysis framework that can support data-driven monitoring of service quality. Future studies should use larger, more diverse datasets drawn entirely from real-world sources to validate these findings.
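The four evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free sketch for the three sentiment classes. The abstract does not say how per-class scores were averaged, so macro-averaging is an assumption here. The function names (`per_class_counts`, `evaluate`) and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter

# The three sentiment categories named in the abstract.
LABELS = ("positive", "negative", "neutral")


def per_class_counts(y_true, y_pred, labels=LABELS):
    """Return {label: (tp, fp, fn)} for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong for this sample
            fn[truth] += 1  # true label was missed
    return {label: (tp[label], fp[label], fn[label]) for label in labels}


def _safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, labels=LABELS):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    precisions, recalls, f1_scores = [], [], []
    for tp, fp, fn in per_class_counts(y_true, y_pred, labels).values():
        precision = _safe_div(tp, tp + fp)
        recall = _safe_div(tp, tp + fn)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(_safe_div(2 * precision * recall, precision + recall))

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
    y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
    for name, value in evaluate(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

When every prediction matches its label, all four values equal 1.0, which corresponds to the 100% test accuracy reported for both models.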
