Thai Generalized Alignment Task (TGAT): A Corpus and Comparative Study for Hallucination Detection in Thai

dc.contributor.author: Sivakorn Wangwon
dc.contributor.author: Rapepong Pitijaroonpong
dc.contributor.author: Piyawat Chuangkrud
dc.contributor.author: Chaianun Damrongrat
dc.contributor.author: Sarawoot Kongyoung
dc.contributor.author: Nont Kanungsukkasem
dc.date.accessioned: 2026-05-08T19:26:07Z
dc.date.issued: 2025-10-20
dc.description.abstract: Large Language Models (LLMs) face critical reliability challenges due to hallucination, the generation of factually inaccurate content. While hallucination detection has advanced for major languages, Thai remains underserved, lacking specialized alignment models and datasets. We introduce the Thai Generalized Alignment Task (TGAT), a seven-subtask corpus tailored for Thai hallucination detection, spanning Fact Verification, Natural Language Inference, Information Retrieval, Question Answering, Summarization, Semantic Textual Similarity, and Paraphrase Identification. We conduct a comparative study across transformer families (encoder-only, encoder-decoder, and decoder-only), evaluating each on the test set of every subtask to analyze architectural compatibility and generalization for Thai hallucination detection. We also perform an ablation study to quantify how the presence of each subtask dataset affects the overall average performance, clarifying the role of dataset composition. Our experiments show that decoder-only models consistently outperform encoder-decoder and encoder-only alternatives under a standardized zero-shot prompting setup, establishing strong baselines and offering guidance for model selection and dataset design in low-resource settings.
dc.identifier.doi: 10.1109/icitee66631.2025.11338229
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/20441
dc.subject: Topic Modeling
dc.subject: Mental Health via Writing
dc.subject: Text Readability and Simplification
dc.title: Thai Generalized Alignment Task (TGAT): A Corpus and Comparative Study for Hallucination Detection in Thai
dc.type: Article
