Vision Transformer with Fractal Dimension Transformation: Effects of Resolution and Patch Size

dc.contributor.author: Woramat Ngamkham
dc.contributor.author: Kuntpong Woraratpanya
dc.contributor.author: Yoshimitsu Kuroki
dc.date.accessioned: 2026-05-08T19:26:07Z
dc.date.issued: 2025-10-20
dc.description.abstract: Vision Transformer (ViT) achieves strong performance in computer vision but requires substantial computational resources, particularly with high-resolution data. A key challenge lies in the quadratic complexity of self-attention with respect to the number of image patches, which is jointly determined by input size and patch size. Conventional resizing is a common strategy to reduce resolution and thus the number of patches, but it risks discarding structural details that may be important for prediction. To address this issue, this study investigates how input size, patch size, and dimensionality reduction influence ViT training time and prediction accuracy. Using the NIH Chest X-ray dataset, we compared two preprocessing methods: conventional resizing and a Fractal Dimension (FD)-based transformation. Results show that the FD-based method consistently reduced training time across all settings, demonstrating its effectiveness in lowering computational costs. In terms of accuracy, conventional resizing generally performed slightly better overall; however, the differences were not uniform, as smaller patches improved AUROC mainly at higher resolutions but not consistently at lower ones. These findings highlight a tradeoff between efficiency and accuracy, positioning FD-based representations as a practical complement to conventional resizing when computational resources are limited.
dc.identifier.doi: 10.1109/icitee66631.2025.11338254
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/20442
dc.subject: Cell Image Analysis Techniques
dc.subject: Advanced Neural Network Applications
dc.subject: Medical Image Segmentation Techniques
dc.title: Vision Transformer with Fractal Dimension Transformation: Effects of Resolution and Patch Size
dc.type: Article
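
The abstract notes that self-attention cost grows quadratically with the number of image patches, which depends on both input size and patch size. The sketch below illustrates that relationship only. It assumes square images, square non-overlapping patches, and no class token; the function names and the sizes 224, 448, 16, and 32 are illustrative assumptions, not settings taken from the study.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def attention_pairs(image_size: int, patch_size: int) -> int:
    """Patch-to-patch pairs scored by one self-attention head (N squared)."""
    n = num_patches(image_size, patch_size)
    return n * n


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not from the paper).
    for size in (224, 448):
        for patch in (16, 32):
            n = num_patches(size, patch)
            pairs = attention_pairs(size, patch)
            print(f"input={size} patch={patch} patches={n} pairs={pairs}")
```

Under these assumptions, doubling the input side length or halving the patch side length each multiplies the patch count by 4 and the number of attention pairs by 16, which is why reducing the input representation before tokenization can shorten training time.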
