Improving OpenAI’s Whisper Model for Transcribing Homophones in Legal News

dc.contributor.authorLattapon Siriket
dc.contributor.authorKulsawasd Jitkajornwanich
dc.contributor.authorSaichon Jaiyen
dc.contributor.authorSarun Intakosum
dc.date.accessioned2026-05-08T19:20:27Z
dc.date.issued2024-5-1
dc.description.abstractOpenAI’s Whisper model is a tool for transcribing human speech. It is open source and offers diverse functionality, including effective transcription in multiple languages, Thai among them. This paper focuses on improving Whisper’s transcription of Thai homophones in order to reduce the word error rate (WER). We focus on words in the legal news category and identify the factors that lead Whisper to predict sounds incorrectly. We collected homophones from snippets of legal news video clips and compiled them into a homophone dictionary. We then evaluate the words transcribed by Whisper in terms of word error rate and spelling. Based on initial results from the original Whisper model and the compiled homophone dictionary, 48% of a total of 94 words were transcribed incorrectly. Finally, we propose a methodology for improving Whisper’s performance, so that Thai automatic speech recognition with Whisper can be used to its full potential in other applications.
dc.identifier.doi10.1109/iceast61342.2024.10554018
dc.identifier.urihttps://dspace.kmitl.ac.th/handle/123456789/17532
dc.subjectArtificial Intelligence in Law
dc.subjectComparative and International Law Studies
dc.subjectLegal Education and Practice Innovations
dc.titleImproving OpenAI’s Whisper Model for Transcribing Homophones in Legal News
dc.typeArticle
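
The abstract evaluates transcriptions by word error rate and refers to a homophone dictionary. As a rough illustration only, the sketch below computes WER as word-level edit distance divided by the reference length, then applies a naive dictionary lookup to a transcript. It is not the paper's method: the function names, the `corrections` mapping, and the English stand-in homophones ("court"/"caught", "bail"/"bale") are all hypothetical, and the inputs are assumed to be already split into words (Thai text would need a word segmenter first).

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / len(reference).

    Both arguments are lists of words (assumed pre-tokenized).
    """
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


def apply_homophone_dictionary(words, corrections):
    """Replace each word found in `corrections` (hypothetical mapping
    from a mis-transcribed homophone to the intended term)."""
    return [corrections.get(word, word) for word in words]


# Hypothetical example data, using English homophones as stand-ins.
reference = ["the", "court", "granted", "bail"]
hypothesis = ["the", "caught", "granted", "bale"]
corrections = {"caught": "court", "bale": "bail"}

wer_before = word_error_rate(reference, hypothesis)
corrected = apply_homophone_dictionary(hypothesis, corrections)
wer_after = word_error_rate(reference, corrected)
print(wer_before, wer_after)  # 0.5 0.0
```

A context-free lookup like this would also rewrite a homophone that was transcribed correctly for its context, so a practical version would likely need to restrict replacements to the target domain or check the surrounding words.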
