Comparison of SVM and IndoBERT for Sentiment Analysis of Student Academic Services
Abstract
Digital transformation in higher education has generated an increasing volume of textual data, including student comments, academic service evaluations, and feedback on academic information systems. These data contain valuable information for supporting decision-making; however, their unstructured and contextual nature makes manual analysis inefficient. This study compares the performance of a TF-IDF-based Support Vector Machine (SVM) model and a Transformer-based IndoBERT model for sentiment analysis of student feedback on academic services. The dataset consists of 1,700 text entries, combining template-based synthetic data with real-world data collected from social media, each labeled as positive, negative, or neutral. The research process involved exploratory data analysis, text preprocessing, feature extraction, model development, and evaluation using accuracy, precision, recall, and F1-score. Both models achieved 100% accuracy on the test set. These findings indicate that both traditional machine learning and Transformer-based approaches can identify the sentiment patterns present in this dataset. Nevertheless, the results should be interpreted cautiously, as the relatively homogeneous nature of the dataset and the inclusion of synthetic data may limit the models’ generalizability. The main contribution of this study lies in the comparative evaluation of SVM and IndoBERT within the context of higher education academic services, as well as the development of a sentiment analysis framework that can support data-driven service quality monitoring. Future studies should employ larger, more diverse datasets derived entirely from real-world sources to further validate the findings.
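The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how per-class scores were averaged across the three sentiment classes, so macro averaging is an assumption here; the function name `classification_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter

# Assumed label set, matching the three sentiment categories in the abstract.
LABELS = ("positive", "negative", "neutral")


def classification_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging (an assumption, not confirmed by the source) computes
    each metric per class and then takes the unweighted mean.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    def safe_div(num, den):
        # Treat an undefined ratio (no predictions or no samples) as 0.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        f1 = safe_div(2 * precision * recall, precision + recall)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with hypothetical labels (not data from the study).
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "negative", "negative", "negative", "neutral", "positive"]
metrics = classification_metrics(y_true, y_pred)
```

When every prediction matches its label, all four values equal 1.0, which corresponds to the 100% test accuracy reported above. Reporting per-class precision and recall alongside accuracy helps show whether such a score holds across all three classes or is dominated by one of them.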
Copyright (c) 2026 Muhammad Ibnu Sa'ad, Lailil Muflikhah, Fitra Abdurrachman Bachtiar

This work is licensed under a Creative Commons Attribution 4.0 International License.




