Hate Speech Detection in Thai Social Media with Ordinal-Imbalanced Text Classification

dc.contributor.author: Kitsuchart Pasupa
dc.contributor.author: Werasut Karnbanjob
dc.contributor.author: Massakorn Aksornsiri
dc.date.accessioned: 2026-05-08T19:17:32Z
dc.date.issued: 2022-06-22
dc.description.abstract: Cyberbullying has become a serious problem on Thai social media. For example, during the COVID-19 pandemic some Thai users posted hate speech targeting Myanmar workers in Thailand, which could fuel hate crime. Detecting cyberbullying on Thai social media is therefore urgent. The task is a text classification problem. Moreover, hate speech has ordered severity levels, but many prior studies did not account for this ordering in their models. We therefore developed a Thai hate-speech classification method and trained it with various loss functions to detect such hate speech accurately. We evaluated the models on a corpus of ordinal-imbalanced Thai text. The results indicate that the best model in terms of $F_1$-score used a hybrid loss function combining an ordinal regression loss with the Pearson correlation coefficient (commonly used as a similarity function). It yielded an average $F_1$-score of 78.38%, which is significantly higher, by 0.88%, than the score achieved by a conventional loss function, and an average mean squared error of 0.2478, a 5.49% relative improvement. Thus, the proposed hybrid loss function improved the performance of the model.
dc.identifier.doi: 10.1109/jcsse54890.2022.9836312
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/16078
dc.subject: Hate Speech and Cyberbullying Detection
dc.subject: Internet Traffic Analysis and Secure E-voting
dc.subject: Sentiment Analysis and Opinion Mining
dc.title: Hate Speech Detection in Thai Social Media with Ordinal-Imbalanced Text Classification
dc.type: Article
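
The abstract describes a hybrid of an ordinal regression loss and a Pearson correlation term but does not give its formulation. The following is only a minimal sketch of how such a hybrid could look, not the authors' implementation. Everything in it is an assumption made for illustration: the cumulative binary encoding of severity levels, the binary cross-entropy used as the ordinal term, the `1 - r` form of the correlation term, the `weight` parameter, and all function names.

```python
import numpy as np


def ordinal_targets(labels, num_classes):
    """Encode severity level k as K-1 cumulative binary targets.

    Assumed encoding: target[j] = 1 if k > j, so level 2 of 4 -> [1, 1, 0].
    """
    thresholds = np.arange(num_classes - 1)
    return (labels[:, None] > thresholds[None, :]).astype(float)


def ordinal_bce(probs, labels, num_classes, eps=1e-7):
    """Binary cross-entropy over the K-1 threshold outputs (assumed ordinal term)."""
    targets = ordinal_targets(labels, num_classes)
    p = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)))


def pearson_loss(pred_scores, true_scores, eps=1e-8):
    """1 - Pearson r between predicted and true severity across a batch."""
    x = pred_scores - pred_scores.mean()
    y = true_scores - true_scores.mean()
    r = np.sum(x * y) / (np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)) + eps)
    return float(1.0 - r)


def hybrid_loss(probs, labels, num_classes, weight=1.0):
    """Ordinal term plus a weighted correlation term (hypothetical combination)."""
    # Summing the threshold probabilities gives an expected severity level.
    expected_level = probs.sum(axis=1)
    return ordinal_bce(probs, labels, num_classes) + weight * pearson_loss(
        expected_level, labels.astype(float)
    )


if __name__ == "__main__":
    labels = np.array([0, 1, 2, 3])
    # Confident predictions that agree with the labels ...
    good = ordinal_targets(labels, 4) * 0.9 + 0.05
    # ... and the same predictions assigned to the wrong examples.
    bad = good[::-1]
    print("aligned predictions:   ", hybrid_loss(good, labels, 4))
    print("misaligned predictions:", hybrid_loss(bad, labels, 4))
```

In this sketch the correlation term penalizes predictions whose ranking across the batch disagrees with the true severity ordering, which a per-example classification loss does not measure directly. Whether the paper weights or combines the two terms this way is not stated in the record.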
