Thai Generalized Alignment Task (TGAT): A Corpus and Comparative Study for Hallucination Detection in Thai

Sivakorn Wangwon; Rapepong Pitijaroonpong; Piyawat Chuangkrud; Chaianun Damrongrat; Sarawoot Kongyoung; Nont Kanungsukkasem

doi:10.1109/icitee66631.2025.11338229

dc.contributor.author	Sivakorn Wangwon
dc.contributor.author	Rapepong Pitijaroonpong
dc.contributor.author	Piyawat Chuangkrud
dc.contributor.author	Chaianun Damrongrat
dc.contributor.author	Sarawoot Kongyoung
dc.contributor.author	Nont Kanungsukkasem
dc.date.accessioned	2026-05-08T19:26:07Z
dc.date.issued	2025-10-20
dc.description.abstract	Large Language Models (LLMs) face critical reliability challenges due to hallucination, the generation of factually inaccurate content. While hallucination detection has advanced for major languages, Thai remains underserved, lacking specialized alignment models and datasets. We introduce the Thai Generalized Alignment Task (TGAT), a seven-subtask corpus tailored for Thai hallucination detection, spanning Fact Verification, Natural Language Inference, Information Retrieval, Question Answering, Summarization, Semantic Textual Similarity, and Paraphrase Identification. We conduct a comparative study across transformer families (encoder-only, encoder-decoder, and decoder-only), evaluated on the test set of each subtask, to analyze architectural compatibility and generalization for Thai hallucination detection. We also perform an ablation study to quantify how the presence of each subtask dataset affects the overall average performance, clarifying the role of dataset composition. Our experiments show that decoder-only models consistently outperform encoder-decoder and encoder-only alternatives under a standardized zero-shot prompting setup, establishing strong baselines and offering guidance for model selection and dataset design in low-resource settings.
dc.identifier.doi	10.1109/icitee66631.2025.11338229
dc.identifier.uri	https://dspace.kmitl.ac.th/handle/123456789/20441
dc.subject	Topic Modeling
dc.subject	Mental Health via Writing
dc.subject	Text Readability and Simplification
dc.title	Thai Generalized Alignment Task (TGAT): A Corpus and Comparative Study for Hallucination Detection in Thai
dc.type	Article
