Implementation of Naive Bayes and Support Vector Machine Classification Algorithms for Sentiment Analysis of Bilingual Cyberbullying on X Application

Novita Sari, Muhammad Jazman, Tengku Khairil Ahsyar, Syaifullah Syaifullah, Arif Marsal

Abstract


The rapid growth of social media use has contributed to a rise in cyberbullying incidents, particularly where users post in more than one language. This study conducts sentiment analysis to detect potential cyberbullying content on the X application using a bilingual approach (Indonesian and English) with the Naive Bayes (NB) and Support Vector Machine (SVM) algorithms. Tweets are collected and pre-processed to extract relevant features for sentiment analysis. Both algorithms are then applied to classify tweets as positive, negative, or neutral and to identify indications of cyberbullying. The experimental results indicate that NB outperformed SVM, achieving an accuracy of 87%. In identifying cyberbullying patterns in bilingual text, NB reached the highest accuracy, 87%, for Indonesian-language text. These findings suggest that the study can serve as a reference for developing more accurate and responsive cyberbullying detection systems on bilingual social media platforms.
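
As an illustration of the classification step described above, the following is a minimal sketch of a multinomial Naive Bayes sentiment classifier with Laplace smoothing, written with the Python standard library only. It is not the authors' implementation: the `preprocess` function, the `MultinomialNB` class, and the eight-sentence bilingual training set are hypothetical and chosen only to show the idea. The SVM counterpart is omitted for brevity.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Lowercase, drop URLs and @mentions, and keep alphabetic tokens only."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


class MultinomialNB:
    """Toy multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(preprocess(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in preprocess(doc) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for t in tokens:
                score += math.log((counts[t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical bilingual (English / Indonesian) toy data, for illustration only.
train = [
    ("you are so stupid and ugly", "negative"),
    ("dasar bodoh kamu jelek", "negative"),
    ("nobody likes you loser", "negative"),
    ("thank you for your kind help", "positive"),
    ("kamu hebat terima kasih", "positive"),
    ("great job i am proud of you", "positive"),
    ("the meeting is at nine tomorrow", "neutral"),
    ("rapat dimulai jam sembilan besok", "neutral"),
]
model = MultinomialNB().fit([t for t, _ in train], [y for _, y in train])

print(model.predict("kamu bodoh dan jelek"))
print(model.predict("thank you, great job"))
```

A real system would train on a labeled tweet corpus, add language-specific steps such as stopword removal and stemming, and evaluate accuracy on held-out data; the toy set here is far too small to say anything about performance.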



DOI: https://doi.org/10.32520/stmsi.v14i1.4799


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.