Code Smell Classification Using Graph Convolutional Network with Imbalanced Data and Model Integration

dc.contributor.author: Phawinee Suphawimon
dc.contributor.author: Tuchsanai Ploysuwan
dc.date.accessioned: 2026-05-08T19:26:05Z
dc.date.issued: 2025-11-12
dc.description.abstract: Code smells are critical design anomalies that significantly impact software maintainability and quality. This paper presents a comprehensive framework for code smell classification that uses Graph Convolutional Networks (GCNs) integrated with traditional machine learning techniques. We systematically evaluated graph construction approaches, model integration methodologies, and data balancing strategies using nine real-world Python repositories labeled with PyExamine. Our methodology combines BERT embeddings with graph structural representations, implementing layer integration (Method I) and feature concatenation (Method II). Results show that per-line graph construction outperforms global approaches, and that balancing with SMOTE achieves 96.14% accuracy compared to 86.70% on imbalanced data. Including non-smelly code improves performance from 71% to 95%, demonstrating the importance of negative examples. Our ablation study shows that explicit feature engineering achieves only 67% accuracy compared to 95% for end-to-end learning. The integrated GCN with Transformer using Method II achieved 95% accuracy and 89% F1-score, comparable to CodeT5 (97% accuracy, 85% F1-score), while providing better interpretability.
dc.identifier.doi: 10.1109/isai-nlp66160.2025.11320518
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/20405
dc.subject: Software Engineering Research
dc.subject: Advanced Malware Detection Techniques
dc.subject: Software Testing and Debugging Techniques
dc.title: Code Smell Classification Using Graph Convolutional Network with Imbalanced Data and Model Integration
dc.type: Article
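
The abstract names SMOTE as the data balancing strategy with the highest reported accuracy. As a rough illustration of the idea only, the sketch below generates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. It is not the authors' pipeline: the function name `smote_oversample`, the parameters `k` and `seed`, and the toy 2-D vectors (standing in for embeddings of smelly code) are all assumptions made for this example.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points built by SMOTE-style interpolation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base sample (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors standing in for the minority (smelly) class.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(minority, n_new=6)
```

Because each synthetic point lies between two existing minority samples, the minority class grows without exact duplicates. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be preferred over a hand-written version.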