myOCR: Optical Character Recognition for Myanmar language with Post-OCR Error Correction

dc.contributor.author: Thura Aung
dc.contributor.author: Ye Kyaw Thu
dc.contributor.author: Myat Noe Oo
dc.date.accessioned: 2026-05-08T19:20:30Z
dc.date.issued: 2024-11-11
dc.description.abstract: This paper presents myOCR, an Optical Character Recognition (OCR) system for the Myanmar language. It is trained on a synthetic text-image dataset of 25,790 images spanning 14 different font styles. The system combines Convolutional Neural Networks (CNN) for feature extraction, Bidirectional Long Short-Term Memory (BiLSTM) networks for sequence modeling, and Connectionist Temporal Classification (CTC) for decoding, evaluated across training iterations (3,000, 6,000, 9,000) and hidden-state sizes (64, 128, 256). Statistical post-OCR correction methods involve N-grams (N = 3, 4, 5) and edit distances with the Symmetric Delete spelling correction algorithm (SymSpell). For Neural Machine Translation-based correction, BiLSTM and Transformer models are employed, while the mT5-base and mBART-50 models are used for LLM-based correction. The best base (optical) model, trained for 9,000 iterations, achieved a chrF++ score of over 97.90 and a Word Error Rate (WER) of 9.18%; Transformer-based correction improved its chrF++ to 99.31 and reduced the WER to 0.66%.
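For readers unfamiliar with the WER metric reported in the abstract, the sketch below shows one standard way to compute it: word-level Levenshtein distance normalized by reference length. This is an illustrative implementation, not the evaluation code used by the authors.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words.

    Counts the minimum number of word substitutions, insertions, and
    deletions needed to turn the hypothesis into the reference.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edits to transform the first i reference words
    # into the first j hypothesis words (classic dynamic program).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, `wer("a b c", "a x c")` is 1/3 (one substitution over three reference words). A post-OCR correction step lowers WER whenever it repairs more recognition errors than it introduces, which is how the drop from 9.18% to 0.66% should be read.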
dc.identifier.doi: 10.1109/isai-nlp64410.2024.10799448
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/17573
dc.subject: Handwritten Text Recognition Techniques
dc.subject: Speech Recognition and Synthesis
dc.subject: Computer Science and Engineering
dc.title: myOCR: Optical Character Recognition for Myanmar language with Post-OCR Error Correction
dc.type: Article
