<i>HateThaiSent</i>: Sentiment-Aided Hate Speech Detection in Thai Language

dc.contributor.author: Krishanu Maity
dc.contributor.author: A. S. Poornash
dc.contributor.author: Shaubhik Bhattacharya
dc.contributor.author: Salisa Phosit
dc.contributor.author: Sawarod Kongsamlit
dc.contributor.author: Sriparna Saha
dc.contributor.author: Kitsuchart Pasupa
dc.date.accessioned: 2026-05-08T19:15:44Z
dc.date.issued: 2024-4-8
dc.description.abstract: Social media platforms are a double-edged sword: on the one hand, they enable the dissemination of information; on the other hand, they provide an avenue for spreading online abuse and harassment, such as hate speech. While significant research efforts are being devoted to detecting online hate speech in the English language, little attention has been paid to the Thai language. In this study, we created a benchmark dataset, called <i>HateThaiSent</i>, which labels each post with both hate speech and sentiment information. To detect hate speech, we created a multitask model that uses a dual-channel deep learning approach based on FastText and BERT embeddings, with an added capsule network. One channel utilizes pretrained FastText embeddings while the other uses embeddings from the BERT language model. We aimed to answer two research questions: (Q1) Does incorporating sentiment information improve the performance of hate speech detection (HD) in the Thai language? (Q2) What is the comparative effectiveness of two different approaches for sentiment-aware HD in the Thai language: feature engineering versus multitasking? Our proposed approach outperformed other baselines and state-of-the-art models on the <i>HateThaiSent</i> dataset, with overall accuracy/macro-<i>F</i>1 values of 89.67%/89.79% and 80.92%/80.97% for the hate speech and sentiment detection tasks, respectively. We concluded that multitasking is more effective than feature engineering in enhancing the performance of the main task (HD).
dc.identifier.doi: 10.1109/tcss.2024.3376958
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/15198
dc.publisher: IEEE Transactions on Computational Social Systems
dc.subject: Hate Speech and Cyberbullying Detection
dc.title: <i>HateThaiSent</i>: Sentiment-Aided Hate Speech Detection in Thai Language
dc.type: Article
