Code Smell Classification Using Graph Convolutional Network with Imbalanced Data and Model Integration
| dc.contributor.author | Phawinee Suphawimon | |
| dc.contributor.author | Tuchsanai Ploysuwan | |
| dc.date.accessioned | 2026-05-08T19:26:05Z | |
| dc.date.issued | 2025-11-12 | |
| dc.description.abstract | Code smells are critical design anomalies that significantly impact software maintainability and quality. This paper presents a comprehensive framework using Graph Convolutional Networks (GCNs) integrated with traditional machine learning techniques. We systematically evaluated graph construction approaches, model integration methodologies, and data balancing strategies using nine real-world Python repositories labeled with PyExamine. Our methodology combines BERT embeddings with graph structural representations, implementing layer integration (Method I) and feature concatenation (Method II). Results show that per-line graph construction outperforms global approaches, with SMOTE achieving 96.14% accuracy compared to 86.70% for imbalanced data. Including non-smelly code improves performance from 71% to 95%, demonstrating the importance of negative examples. Our ablation study shows that explicit feature engineering achieves only 67% accuracy compared to 95% for end-to-end learning. The integrated GCN with Transformer using Method II achieved 95% accuracy and 89% F1-score, nearly matching CodeT5 (97% accuracy, 85% F1-score) while providing better interpretability. | |
| dc.identifier.doi | 10.1109/isai-nlp66160.2025.11320518 | |
| dc.identifier.uri | https://dspace.kmitl.ac.th/handle/123456789/20405 | |
| dc.subject | Software Engineering Research | |
| dc.subject | Advanced Malware Detection Techniques | |
| dc.subject | Software Testing and Debugging Techniques | |
| dc.title | Code Smell Classification Using Graph Convolutional Network with Imbalanced Data and Model Integration | |
| dc.type | Article |
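
The abstract credits SMOTE balancing with the accuracy gain over imbalanced data but does not describe the implementation. The sketch below illustrates only the core idea behind SMOTE, which is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function name `smote_like_oversample`, the toy feature vectors, and the parameter values are all assumptions made for illustration and are not taken from the paper. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn library would normally be used.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Illustrative sketch only; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + lam * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy 2-D feature vectors (hypothetical stand-ins for learned code embeddings).
non_smelly = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.2, 0.2]]
smelly = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate enough synthetic "smelly" samples to match the majority class size.
extra = smote_like_oversample(smelly, n_new=len(non_smelly) - len(smelly))
balanced_smelly = smelly + extra
```

Because each synthetic point lies on a line segment between two real minority samples, the oversampled class stays inside the region the original minority samples already occupy. Oversampling of this kind is usually applied to the training split only, so that evaluation data contains no synthetic samples.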