Curbing the roofline: a scalable and flexible architecture for CNNs on FPGA

Authors

Paolo Meloni, Gianfranco Deriu, Francesco Conti, Igor Loi, Luigi Raffo, Luca Benini

Publication date

2016/5/16

Book

Proceedings of the ACM International Conference on Computing Frontiers

Pages

376-383

Description

Convolutional Neural Networks (CNNs) have achieved outstanding results in several complex visual recognition tasks, such as classification and scene parsing. CNNs are composed of multiple filtering layers that perform 2D convolutions over input images. The intrinsic parallelism of this computation kernel makes it well suited to acceleration on parallel hardware. In this paper we propose a highly flexible and scalable architectural template for accelerating CNNs on FPGA devices, based on the cooperation between a set of software cores and a parallel convolution engine that communicate via a tightly coupled L1 shared scratchpad. Our accelerator structure, tested on a Xilinx Zynq XC-Z7045 device, delivers peak performance of up to 80 GMAC/s, corresponding to 100 MMAC/s for each DSP slice in the programmable fabric. Thanks to the flexible architecture, convolution operations can be scheduled …

Total citations

Cited by 27

Citations per year: 2016: 2, 2017: 3, 2018: 4, 2019: 8, 2020: 4, 2021: 2, 2022: 1, 2023: 2, 2024: 1
