Learning Extended Term Frequency-Inverse Document Frequency (TF-IDF++) for Depression Screening From Sentences in Thai Blog Post
| dc.contributor.author | Sahussawud Khunruksa | |
| dc.contributor.author | Somkiat Wangsiripitak | |
| dc.date.accessioned | 2026-05-08T19:20:20Z | |
| dc.date.issued | 2023-05-18 | |
| dc.description.abstract | This paper proposes a method for screening depression from individual sentences in Thai blog posts. Three classifiers (a decision tree, linear SVC, and logistic regression) were used to build classification models; each learned from an extended term frequency-inverse document frequency (TF-IDF++) representation, a feature vector that combines term frequency-inverse document frequency (TF-IDF), part-of-speech information, and sentence statistics such as word counts of selected terms. Our experiments showed that the logistic regression model achieves the highest average scores, with a precision of 78.32%, a recall of 78.26%, and an f1-score of 78.27%. The proposed method outperforms the Thai BERT model by 0.75%, 0.77%, and 0.76%, respectively. Our investigation also showed that the Thai BERT model tends to be overconfident, assigning a high probability to its predicted class even when the prediction is incorrect; the error in such cases is noticeably higher than that of wrong predictions made by our proposed logistic regression-based model. | |
| dc.identifier.doi | 10.1109/icbir57571.2023.10147692 | |
| dc.identifier.uri | https://dspace.kmitl.ac.th/handle/123456789/17484 | |
| dc.subject | Topic Modeling | |
| dc.subject | Sentiment Analysis and Opinion Mining | |
| dc.subject | Mental Health via Writing | |
| dc.title | Learning Extended Term Frequency-Inverse Document Frequency (TF-IDF++) for Depression Screening From Sentences in Thai Blog Post | |
| dc.type | Article |
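
The abstract describes TF-IDF++ as a TF-IDF vector extended with part-of-speech information and sentence statistics. The following is a minimal sketch of that idea, not the authors' implementation. It assumes sentences are already tokenized and POS-tagged upstream, uses a smoothed IDF of the form log((1 + n) / (1 + df)) + 1, encodes part-of-speech as per-tag counts, and uses sentence length plus a count of selected terms as the statistics. All function names, the tag set, and the toy data are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Plain TF-IDF over pre-tokenized sentences (smoothed IDF, assumed form)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: number of sentences containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero on empty sentences
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def extra_features(tokens, pos_tags, pos_set, selected_terms):
    """Hypothetical extras: per-tag POS counts and two sentence statistics."""
    pos_counts = Counter(pos_tags)
    pos_part = [pos_counts[tag] for tag in pos_set]
    # Statistics: sentence length and number of tokens in the selected-term set.
    stats = [len(tokens), sum(1 for t in tokens if t in selected_terms)]
    return pos_part + stats


def extended_vectors(docs, tags, pos_set, selected_terms):
    """Concatenate TF-IDF with the extra features for each sentence."""
    vocab, base = tfidf_vectors(docs)
    extended = [
        vec + extra_features(doc, tag_seq, pos_set, selected_terms)
        for vec, doc, tag_seq in zip(base, docs, tags)
    ]
    return vocab, extended


# Toy, made-up input standing in for tokenized and tagged Thai sentences.
docs = [["i", "feel", "sad"], ["i", "feel", "fine", "today"]]
tags = [["PRON", "VERB", "ADJ"], ["PRON", "VERB", "ADJ", "NOUN"]]
pos_set = ["ADJ", "NOUN", "PRON", "VERB"]
selected_terms = {"sad"}

vocab, features = extended_vectors(docs, tags, pos_set, selected_terms)
```

Each row of `features` could then be passed to any of the classifiers named in the abstract (decision tree, linear SVC, or logistic regression). In practice the count-based columns would likely need scaling so they do not dominate the TF-IDF weights.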