Authors
Deepika Selvaraj, Arunachalam Venkatesan, David Novo
Publication date
2023/1/4
Book
International Conference on Computer, Communication, and Signal Processing
Pages
94-108
Publisher
Springer Nature Switzerland
Description
The performance and efficiency of a CNN-based inference engine depend on its computational and dataflow-control complexity. Instead of processing a 2-dimensional (2D) feature array, using a 3D array of features/weights would improve dataflow movement and memory computation. The optimum 3D feature array size of 8 × 8 × 32 was chosen based on on-chip memory requirement, data reuse, and PE utilization. Using this 8 × 8 × 32 feature array, seven combinations of dataflow scheduling strategies were analyzed by varying row-, column-, and depth-wise parameters on the workload model in a MATLAB environment. From the analysis, strategy-V (depth-wise parallel and row/column-wise sequential) is found to be the best with a 4 × 8 processor array. Compared to the state-of-the-art processor strategy, strategy-V achieves the data transfer rate (off-chip to on …
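To make the 8 × 8 × 32 tiling concrete, the following is a minimal Python sketch of how a feature map might be split into 3D tiles of that size. It is not the authors' MATLAB workload model: the function name `tile_stats`, the 1-byte-per-value assumption, and the "fill ratio" measure (useful values divided by padded tile capacity) are all hypothetical choices made for illustration only.

```python
import math

def tile_stats(height, width, depth, tile=(8, 8, 32), bytes_per_value=1):
    """Illustrative tiling arithmetic (assumed model, not from the paper).

    Returns (number of tiles, bytes per tile buffer, fill ratio), where the
    fill ratio is the share of tile capacity holding real feature values
    when the map dimensions are rounded up to whole tiles.
    """
    th, tw, td = tile
    # Round each dimension up to a whole number of tiles.
    tiles = (math.ceil(height / th)
             * math.ceil(width / tw)
             * math.ceil(depth / td))
    # On-chip buffer needed to hold one tile (assumes 1 byte per value).
    tile_bytes = th * tw * td * bytes_per_value
    # Fraction of the padded tile volume that carries real data.
    fill_ratio = (height * width * depth) / (tiles * th * tw * td)
    return tiles, tile_bytes, fill_ratio

if __name__ == "__main__":
    # A 56 x 56 x 64 feature map divides evenly into 8 x 8 x 32 tiles.
    print(tile_stats(56, 56, 64))
    # A 28 x 28 x 48 map does not, so part of the tile capacity is padding.
    print(tile_stats(28, 28, 48))
```

Under these assumptions, a map whose dimensions are multiples of the tile size fills every tile completely, while other shapes leave part of each edge tile unused; this is one simple way to see why tile size interacts with memory requirement and utilization.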