Hierarchical Convolutional Neural Networks using CCP-3 Block Architecture for Apparel Image Classification

Natthamon Chamnong; Jeeraporn Werapun; Anantaporn Hanskunatai

doi:10.14569/ijacsa.2023.0140624

Date

2023-01-01

Authors

Natthamon Chamnong

Jeeraporn Werapun

Anantaporn Hanskunatai

Abstract

In fashion applications, deep learning has been applied to automatically recognize and classify apparel images within the massive volume of visual data that has emerged on social networks. Classifying apparel correctly and quickly is challenging because of the variety of apparel features and the complexity of the classification task. Recently, hierarchical convolutional neural networks (H–CNN) with the VGGNet architecture were proposed to classify the Fashion-MNIST dataset. However, VGGNet has many layers and requires many filters (in the convolution layers) and many neurons (in the fully connected layers), leading to high computational complexity and long training time. Therefore, this paper proposes to classify apparel images with the H–CNN in combination with a new shallow CCP-3-Block architecture, where each building block consists of two convolutional layers (CC) and one pooling layer (P). The CCP-3-Block reduces the number of layers in the network, the number of filters in the convolution layers, and the number of neurons in the fully connected layers, while adding a new connection between the convolution layer and the pooling layer plus batch normalization before the activation, so that the networks can learn independently and train quickly. Moreover, dropout was applied to the feature maps and the fully connected layers to reduce overfitting, and the adaptive moment estimation (Adam) optimizer was used to counter decaying gradients, which can improve network performance. The experimental results showed that the improved H–CNN model with our CCP-3-Block outperformed the recent H–CNN model with VGGNet in terms of lower loss, higher accuracy, and faster training.
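
To make the block structure concrete, the following is a minimal sketch of how feature-map shapes and convolution parameter counts might evolve through three stacked conv-conv-pool (CCP) blocks. It is an illustration under stated assumptions, not the authors' configuration: the abstract does not give kernel sizes, filter counts, or padding, so the 3x3 kernels, "same" padding, 2x2 pooling, and the filter counts (32, 64, 128) are hypothetical, as is the reading of "CCP-3" as three such blocks. Only the 28x28 grayscale input size of Fashion-MNIST is a known fact. Batch normalization, dropout, the extra connection, and the fully connected layers are left out of the count.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    height: int
    width: int
    channels: int


def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one 2D convolution layer."""
    return kernel * kernel * c_in * c_out + c_out


def ccp_block(shape, filters, kernel=3, pool=2):
    """Apply conv -> conv -> pool; return (new shape, conv parameter count)."""
    params = conv_params(shape.channels, filters, kernel)
    params += conv_params(filters, filters, kernel)
    # Assumed 'same' padding keeps height and width through both convolutions;
    # pooling then divides them by the pool size (rounding down).
    new_shape = Shape(shape.height // pool, shape.width // pool, filters)
    return new_shape, params


def ccp3(input_shape, filters=(32, 64, 128)):
    """Stack one CCP block per entry in `filters` (hypothetical counts)."""
    shape, total = input_shape, 0
    for f in filters:
        shape, block_params = ccp_block(shape, f)
        total += block_params
    return shape, total


# Fashion-MNIST images are 28x28 with a single grayscale channel.
final_shape, n_params = ccp3(Shape(28, 28, 1))
print(final_shape, n_params)
```

Under these assumptions the spatial size shrinks from 28x28 to 14x14, 7x7, and finally 3x3, which shows why a shallow stack of such blocks leaves only a small feature map for the fully connected layers to process.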