Improving a text classifier using text augmentation: road traffic content from Twitter

dc.contributor.author	Thawatchai Raksachat
dc.contributor.author	Rathachai Chawuthai
dc.date.accessioned	2026-05-08T19:20:51Z
dc.date.issued	2023-5-9
dc.description.abstract	This study aims to develop a more effective method for categorizing Thai-language tweets related to traffic into five categories. Previous studies have used CNN and BERT for classification, but they required balanced data to perform well. To address this, we propose using BPEmb to augment the data and calculate cosine similarity. The next step is to create a balanced dataset for training a combination of CNN and bi-LSTM models for tweet classification. Our experiment demonstrates a significant improvement in tweet classification, with a 14.3% increase in F1-score compared to the baseline method.
dc.identifier.doi	10.1109/ecti-con58255.2023.10153191
dc.identifier.uri	https://dspace.kmitl.ac.th/handle/123456789/17715
dc.subject	Sentiment Analysis and Opinion Mining
dc.subject	Text and Document Classification Technologies
dc.subject	Natural Language Processing Techniques
dc.title	Improving a text classifier using text augmentation: road traffic content from Twitter
dc.type	Article
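
The abstract describes augmenting minority-class tweets with embeddings and cosine similarity, then balancing the dataset. The record does not say exactly how cosine similarity is applied, so the following is only a minimal sketch under one assumption: a token is replaced by its nearest neighbor in embedding space when the similarity clears a threshold. The table `TOY_VECTORS`, the functions `nearest_neighbor`, `augment`, and `oversample`, and the threshold value are all hypothetical; the toy vectors stand in for real BPEmb subword vectors, and the CNN and bi-LSTM classifier is not shown.

```python
import math

# Hypothetical toy embedding table standing in for BPEmb subword vectors.
TOY_VECTORS = {
    "accident": [0.9, 0.1, 0.0],
    "crash": [0.85, 0.15, 0.05],
    "jam": [0.1, 0.9, 0.0],
    "congestion": [0.15, 0.85, 0.05],
    "road": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbor(token, vectors, threshold=0.9):
    """Return the most similar other token at or above the threshold, else None."""
    if token not in vectors:
        return None
    best, best_sim = None, threshold
    for other, vec in vectors.items():
        if other == token:
            continue
        sim = cosine_similarity(vectors[token], vec)
        if sim >= best_sim:
            best, best_sim = other, sim
    return best


def augment(tokens, vectors, threshold=0.9):
    """Replace each token by its nearest neighbor when one qualifies (assumed strategy)."""
    out = []
    for token in tokens:
        neighbor = nearest_neighbor(token, vectors, threshold)
        out.append(neighbor if neighbor is not None else token)
    return out


def oversample(examples, vectors, threshold=0.9):
    """Add augmented copies until every label has as many examples as the largest class."""
    by_label = {}
    for tokens, label in examples:
        by_label.setdefault(label, []).append(tokens)
    target = max(len(items) for items in by_label.values())
    balanced = list(examples)
    for label, items in by_label.items():
        added = 0
        while len(items) + added < target:
            source = items[added % len(items)]
            balanced.append((augment(source, vectors, threshold), label))
            added += 1
    return balanced
```

For example, `oversample([(["accident"], "A"), (["crash"], "A"), (["jam"], "B")], TOY_VECTORS)` adds one augmented example to class "B" so that both classes have two examples. With real data, the toy table would be replaced by vectors looked up from a pretrained subword embedding model.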
