Thai Generalized Alignment Task (TGAT): A Corpus and Comparative Study for Hallucination Detection in Thai
| dc.contributor.author | Sivakorn Wangwon | |
| dc.contributor.author | Rapepong Pitijaroonpong | |
| dc.contributor.author | Piyawat Chuangkrud | |
| dc.contributor.author | Chaianun Damrongrat | |
| dc.contributor.author | Sarawoot Kongyoung | |
| dc.contributor.author | Nont Kanungsukkasem | |
| dc.date.accessioned | 2026-05-08T19:26:07Z | |
| dc.date.issued | 2025-10-20 | |
| dc.description.abstract | Large Language Models (LLMs) face critical reliability challenges due to hallucination, the generation of factually inaccurate content. While hallucination detection has advanced for major languages, Thai remains underserved, lacking specialized alignment models and datasets. We introduce the Thai Generalized Alignment Task (TGAT), a seven-subtask corpus tailored for Thai hallucination detection, spanning Fact Verification, Natural Language Inference, Information Retrieval, Question Answering, Summarization, Semantic Textual Similarity, and Paraphrase Identification. We conduct a comparative study across transformer families (encoder-only, encoder-decoder, and decoder-only), evaluated on the test set of each subtask, to analyze architectural compatibility and generalization for Thai hallucination detection. We also perform an ablation study to quantify how the presence of each subtask dataset affects the overall average performance, clarifying the role of dataset composition. Our experiments show that decoder-only models consistently outperform encoder-decoder and encoder-only alternatives under a standardized zero-shot prompting setup, establishing strong baselines and offering guidance for model selection and dataset design in low-resource settings. | |
| dc.identifier.doi | 10.1109/icitee66631.2025.11338229 | |
| dc.identifier.uri | https://dspace.kmitl.ac.th/handle/123456789/20441 | |
| dc.subject | Topic Modeling | |
| dc.subject | Mental Health via Writing | |
| dc.subject | Text Readability and Simplification | |
| dc.title | Thai Generalized Alignment Task (TGAT): A Corpus and Comparative Study for Hallucination Detection in Thai | |
| dc.type | Article |