Cyberbullying Detection on Instagram using IndoBERTa Model

Authors

Keywords:

cyberbullying, IndoBERTa, RoBERTa, text classification, social media

Abstract

Cyberbullying on social media is a growing concern, especially among teenagers, and automatic detection of offensive content is important for creating a safe digital space. This research develops a cyberbullying detection system for Indonesian-language text using IndoBERTa, a recent transformer model. The dataset consists of Instagram comments labeled as bullying or non-bullying. Preprocessing includes text cleaning, slang normalization, and stopword removal. The IndoBERTa model was then fine-tuned and evaluated using accuracy, precision, recall, and F1-score. The model achieved 87% accuracy and an F1-score of 0.87, outperforming classic machine learning approaches. This finding is consistent with previous studies showing the effectiveness of transformer models in Indonesian text classification, especially for detecting negative speech. This research contributes to the development of artificial intelligence-based content moderation systems for Indonesian social media.
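
The preprocessing steps and evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the slang dictionary, the stopword list, the cleaning rules, and the function names `preprocess` and `binary_metrics` are all hypothetical, chosen only to show how text cleaning, slang normalization, stopword removal, and the four reported metrics fit together for a binary bullying/non-bullying task.

```python
import re

# Hypothetical lookup tables for illustration only; the study's actual
# slang dictionary and stopword list are not given in the abstract.
SLANG_MAP = {"gk": "tidak", "bgt": "banget", "lo": "kamu"}
STOPWORDS = {"yang", "dan", "di"}


def preprocess(comment: str) -> str:
    """Clean a comment, normalize slang, and drop stopwords."""
    text = comment.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[@#]\w+", " ", text)       # remove mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # keep only letters and whitespace
    tokens = text.split()
    tokens = [SLANG_MAP.get(tok, tok) for tok in tokens]    # slang normalization
    tokens = [tok for tok in tokens if tok not in STOPWORDS]  # stopword removal
    return " ".join(tokens)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(preprocess("Lo gk lucu bgt @user http://x.co !!!"))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this sketch the cleaned text would then be passed to the model's tokenizer. Negation words such as "tidak" are deliberately left out of the illustrative stopword list, since removing them could change the meaning of a comment.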

Published

2025-08-04