Simplified Approach to Constructing Coherent Topics and Subtopics from Text Data: A Case Study Using University Reviews
Abstract
This study introduces a streamlined framework for analyzing hierarchical topic structures in text data, integrating Latent Dirichlet Allocation (LDA), Word2Vec, bigram phrasing, Doc2Vec, and hierarchical clustering. The method ensures both statistical coherence and practical interpretability while avoiding the complexity of traditional hierarchical topic models. We apply the framework to university reviews collected from various educational platforms; such data offers valuable insight into user experiences but is challenging to analyze because it is unstructured. The framework reveals key topics and sentiment variations: positive feedback highlights facilities and cultural experiences, while negative reviews emphasize workload, academic challenges, and financial pressures, identifying areas for improvement. The approach is particularly effective for moderately sized datasets with well-defined scopes, such as university reviews, where the subject matter is clearly understood.
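To make one of the components named in the abstract concrete, the following is a minimal sketch of bigram phrasing, the step that joins frequently co-occurring word pairs into single tokens before topic modeling. It is not the authors' implementation: the abstract does not say how phrases are detected, so this sketch assumes the count-based score commonly used for phrase detection alongside Word2Vec, score(a, b) = (count(a b) - delta) / (count(a) * count(b)). The function name `merge_bigrams`, the parameters `delta` and `threshold`, and the sample reviews are all illustrative assumptions.

```python
from collections import Counter


def merge_bigrams(docs, delta=1, threshold=0.1):
    """Join high-scoring adjacent word pairs into tokens like 'tuition_fees'.

    docs is a list of tokenized documents (lists of strings).
    delta and threshold are illustrative values, not taken from the paper.
    """
    unigrams = Counter(tok for doc in docs for tok in doc)
    bigrams = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))

    def score(pair):
        # Assumed scoring rule: discount rare pairs by delta, then
        # normalize by how often each word occurs on its own.
        a, b = pair
        return (bigrams[pair] - delta) / (unigrams[a] * unigrams[b])

    phrases = {pair for pair in bigrams if score(pair) > threshold}

    merged_docs = []
    for doc in docs:
        out, i = [], 0
        while i < len(doc):
            # Greedy left-to-right merge of detected phrases.
            if i + 1 < len(doc) and (doc[i], doc[i + 1]) in phrases:
                out.append(doc[i] + "_" + doc[i + 1])
                i += 2
            else:
                out.append(doc[i])
                i += 1
        merged_docs.append(out)
    return merged_docs


# Made-up sample reviews, for illustration only.
reviews = [
    "the tuition fees are high".split(),
    "tuition fees keep rising".split(),
    "the campus is nice but tuition fees hurt".split(),
    "the library is quiet".split(),
]
merged = merge_bigrams(reviews)
print(merged[0])
```

In a pipeline of the kind the abstract describes, the merged tokens would then feed the topic model and the embedding models, so that a phrase such as "tuition fees" is treated as one unit rather than two unrelated words.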