Baseline Performance of Pre-trained Models on Movie Genre Classification from Spectrograms
| dc.contributor.author | Porawat Visutsak | |
| dc.contributor.author | Kavin Treeraphapkajondet | |
| dc.contributor.author | Visaroot Sakphet | |
| dc.contributor.author | Wachirawit Nitinuntatip | |
| dc.contributor.author | Pawwinkan Satthong | |
| dc.contributor.author | Tanajak Tongbai | |
| dc.contributor.author | Duongduen Ongrungruaeng | |
| dc.contributor.author | Atiwitch Juntra | |
| dc.contributor.author | Watcharaporn Aiamlamai | |
| dc.contributor.author | Issares Sungwanna | |
| dc.contributor.author | Prapaporn Phetrak | |
| dc.contributor.author | Ponrudee Netisopakul | |
| dc.contributor.author | Keun Ho Ryu | |
| dc.date.accessioned | 2026-05-08T19:24:46Z | |
| dc.date.issued | 2025-04-30 | |
| dc.description.abstract | This study investigates the use of deep learning for classifying movie genres based on audio spectrograms. We construct a dataset of movie trailers, transform their audio into spectrograms, and label them by genre. We then use MATLAB's pre-trained convolutional neural networks (CNNs) for classification, comparing the performance of nine architectures: MobileNet-v2, ResNet-18, DenseNet-201, Places365-GoogLeNet, VGG-16, VGG-19, Inception-ResNet-v2, Inception-v3, and NASNet-Mobile. We evaluate all models on their ability to classify movie trailers into five genres: action, romance, drama, comedy, and thriller. Our results, based on accuracy and F1-score across genres, indicate that VGG-16 achieves the highest overall performance, with an accuracy of 86.27%, an F1-score of 86.69%, a recall of 86.87%, and a precision of 87.28%. This research demonstrates the potential of pre-trained CNNs, particularly VGG-16, for efficient and effective audio-based genre classification of movie trailers. | |
| dc.identifier.doi | 10.37936/ecti-cit.2025192.259990 | |
| dc.identifier.uri | https://dspace.kmitl.ac.th/handle/123456789/19756 | |
| dc.publisher | ECTI Transactions on Computer and Information Technology (ECTI-CIT) | |
| dc.subject | Generative Adversarial Networks and Image Synthesis | |
| dc.subject | Media Influence and Health | |
| dc.subject | Cinema and Media Studies | |
| dc.title | Baseline Performance of Pre-trained Models on Movie Genre Classification from Spectrograms | |
| dc.type | Article |
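
The abstract reports accuracy, precision, recall, and F1-score across five genres. As an illustration of how such figures are commonly derived for a multi-class classifier, the following minimal Python sketch computes accuracy and macro-averaged precision, recall, and F1 from a confusion matrix. The record does not state which averaging scheme the authors used, so macro averaging is an assumption here, and the function name `macro_metrics` and the example counts are hypothetical, not taken from the paper.

```python
from typing import Dict, List

# The five genres named in the abstract.
GENRES = ["action", "romance", "drama", "comedy", "thriller"]


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; NOT results from the paper.
    example = [
        [18, 0, 1, 0, 1],
        [0, 17, 2, 1, 0],
        [1, 2, 16, 0, 1],
        [0, 1, 0, 19, 0],
        [2, 0, 1, 0, 17],
    ]
    for name, value in macro_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Under macro averaging, the F1-score is the mean of the per-genre F1 values, so it need not equal the harmonic mean of the averaged precision and recall.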