Configurable Hardware Architecture of Multidimensional Convolution Coprocessor

Abstract

We propose a configurable coprocessor for convolutional neural networks (CNNs) that suits a variety of CNN models. It supports 2D standard convolution, 2D depthwise separable convolution, 3D convolution, and fully connected layers. The proposed processing cluster consists of 72 half-precision floating-point processing units (PUs) and is intended to assist the main processor in embedded systems. Experimental results on an Artix-7 FPGA show that the design achieves 12.16 GOPs per cluster. Moreover, the architecture is designed to scale to systems that require higher performance.
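
To give a rough sense of what the reported throughput means for the four supported layer types, the sketch below counts multiply-accumulate (MAC) operations per layer and converts them to an idealized latency. Only the figures 72 PUs and 12.16 GOPs per cluster come from the abstract. The function names, the layer shapes, the convention of two operations per MAC, the assumption of full utilization, and the assumption that throughput scales linearly with the number of clusters are all illustrative and are not taken from the paper.

```python
# Illustrative sketch only: helper names and layer shapes are hypothetical.

CLUSTER_GOPS = 12.16  # reported throughput per cluster (from the abstract)
NUM_PUS = 72          # processing units per cluster (from the abstract)
OPS_PER_MAC = 2       # assumption: one multiply plus one add per MAC


def conv2d_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a 2D standard convolution with a k x k kernel."""
    return h_out * w_out * c_out * c_in * k * k


def depthwise_separable_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a k x k depthwise pass followed by a 1 x 1 pointwise pass."""
    depthwise = h_out * w_out * c_in * k * k
    pointwise = h_out * w_out * c_in * c_out
    return depthwise + pointwise


def conv3d_macs(d_out, h_out, w_out, c_in, c_out, k):
    """MACs for a 3D convolution with a k x k x k kernel."""
    return d_out * h_out * w_out * c_out * c_in * k ** 3


def fc_macs(n_in, n_out):
    """MACs for a fully connected layer."""
    return n_in * n_out


def ideal_latency_ms(macs, clusters=1):
    """Lower-bound latency in milliseconds.

    Assumes 100% utilization and linear scaling across clusters,
    neither of which is claimed by the source.
    """
    ops = macs * OPS_PER_MAC
    return ops / (CLUSTER_GOPS * 1e9 * clusters) * 1e3


if __name__ == "__main__":
    # Example shape (hypothetical): 56 x 56 output, 64 -> 128 channels, 3 x 3 kernel.
    std = conv2d_macs(56, 56, 64, 128, 3)
    sep = depthwise_separable_macs(56, 56, 64, 128, 3)
    print("standard conv MACs:", std)
    print("depthwise separable MACs:", sep)
    print("ideal latency, standard (ms):", round(ideal_latency_ms(std), 2))
    print("ideal latency, separable (ms):", round(ideal_latency_ms(sep), 2))
    print("average GOPs per PU:", round(CLUSTER_GOPS / NUM_PUS, 4))
```

Under these assumptions, the depthwise separable form of the example layer needs roughly an eighth of the MACs of the standard form, which illustrates why supporting both convolution types on the same hardware is useful in embedded settings.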
