Improving a text classifier using text augmentation: road traffic content from Twitter

dc.contributor.author: Thawatchai Raksachat
dc.contributor.author: Rathachai Chawuthai
dc.date.accessioned: 2026-05-08T19:20:51Z
dc.date.issued: 2023-5-9
dc.description.abstract: This study aims to develop a more effective method for classifying Thai-language tweets about road traffic into five categories. Previous studies used CNN and BERT for classification but needed balanced data to perform well. To address this, we propose using BPEmb to augment the data and to calculate cosine similarity. The augmented data is then used to build a balanced dataset for training a combined CNN and bi-LSTM model for tweet classification. Our experiment shows a significant improvement in tweet classification, with a 14.3% increase in F1-score over the baseline method.
dc.identifier.doi: 10.1109/ecti-con58255.2023.10153191
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/17715
dc.subject: Sentiment Analysis and Opinion Mining
dc.subject: Text and Document Classification Technologies
dc.subject: Natural Language Processing Techniques
dc.title: Improving a text classifier using text augmentation: road traffic content from Twitter
dc.type: Article