TOPIC ANALYSIS OF STUDENT FEEDBACK ON LEARNING MANAGEMENT SYSTEMS USING BERTOPIC: A COMPARATIVE STUDY OF INDOBERT, DISTILBERT, AND SBERT
DOI: https://doi.org/10.15575/jb.v4i2.54192
Abstract
The widespread adoption of Learning Management Systems (LMS) in digital education has generated large volumes of student feedback in the form of unstructured free-text data, making manual analysis increasingly impractical. This study aims to identify the dominant themes emerging from student feedback on LMS platforms and to compare the performance of different Transformer-based embedding models in topic modeling tasks. The proposed approach employs BERTopic with three embedding models, namely IndoBERT, DistilBERT, and Sentence-BERT (SBERT). Student feedback data were collected from an institutional LMS and processed through text preprocessing, embedding generation, and topic modeling stages. Model performance was evaluated using multiple coherence metrics (c_v, c_npmi, u_mass, and c_uci), topic diversity, and the proportion of outlier documents. The results indicate that the IndoBERT-family model achieves the highest coherence scores, particularly in c_v and c_npmi, suggesting superior semantic consistency of the generated topics. DistilBERT produces the lowest proportion of outliers but yields a more limited number of topics, while SBERT demonstrates a balanced performance between topic quality and thematic diversity. These findings highlight that the choice of embedding model significantly influences the quality of topic modeling outcomes for Indonesian-language student feedback data.
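Two of the evaluation measures named above, topic diversity and the proportion of outlier documents, can be illustrated with a minimal sketch. The abstract does not state the exact formulas used in the study, so this assumes a common definition of topic diversity (the share of unique words among the top-k words of all topics) and the BERTopic convention of labelling outlier documents with topic -1. The function names and toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_k: int = 10) -> float:
    """Share of unique words among the top_k words of every topic.

    Assumed definition: 1.0 means no word is shared between topics,
    lower values mean topics overlap more.
    """
    words = [word for topic in topics for word in topic[:top_k]]
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def outlier_proportion(assignments: Sequence[int], outlier_label: int = -1) -> float:
    """Fraction of documents assigned to the outlier topic.

    BERTopic marks outliers with topic id -1; other tools may differ.
    """
    if not assignments:
        return 0.0
    outliers = sum(1 for topic_id in assignments if topic_id == outlier_label)
    return outliers / len(assignments)


# Hypothetical toy data: three topics with their top words, and the
# topic id assigned to each of eight feedback documents.
toy_topics = [
    ["tugas", "kuis", "nilai"],
    ["login", "error", "server"],
    ["video", "materi", "nilai"],
]
toy_assignments = [0, 1, -1, 2, 0, -1, 1, 0]

diversity = topic_diversity(toy_topics, top_k=3)
outliers = outlier_proportion(toy_assignments)
print(f"topic diversity: {diversity:.3f}")
print(f"outlier proportion: {outliers:.3f}")
```

In this toy case the word "nilai" appears in two topics, so eight of the nine top words are unique, and two of the eight documents are outliers. A model with few outliers but few topics, as reported for DistilBERT, would score low on the second measure without necessarily scoring high on the first.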
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
