Authors
Yunxuan Yu, Chen Wu, Tiandong Zhao, Kun Wang, Lei He
Publication date
2020/1
Journal
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume
28
Issue
1
Pages
35-47
Publisher
IEEE
Description
Field-programmable gate arrays (FPGAs) provide rich parallel computing resources with high energy efficiency, making them ideal for accelerating deep convolutional neural networks (CNNs). In recent years, automatic compilers have been developed to generate network-specific FPGA accelerators. However, as complicated tasks increasingly adopt cascades of deep CNN algorithms, runtime reconfiguration of the FPGA becomes unavoidable when network-specific accelerators are employed, and such reconfiguration can be difficult for edge devices. Moreover, a network-specific accelerator requires regenerating the RTL code and repeating physical implementation whenever the network is updated, which is not easy for CNN end users. In this article, we propose a domain-specific FPGA overlay processor, named OPU, to accelerate CNN networks. It offers software-like programmability for CNN end users, as CNN …
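The overlay idea named in the abstract can be illustrated with a minimal sketch: a fixed execution engine dispatches a stream of domain-specific instructions, so updating the network only means emitting a new instruction stream rather than regenerating RTL or reconfiguring the FPGA. The instruction names (`Instr`, `CONV`, `POOL`), the `run_overlay` dispatch loop, and the box-filter stand-in below are hypothetical and are not the actual OPU instruction set; they only convey the general concept under those assumptions.

```python
# Toy model of an overlay processor (illustrative only, not the OPU ISA).
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class Instr:
    op: str          # hypothetical opcodes: "CONV", "POOL"
    kernel: int = 3  # kernel size for CONV
    stride: int = 1  # subsampling stride for POOL


def run_overlay(program: List[Instr], feature_map: np.ndarray) -> np.ndarray:
    """Fixed 'hardware' loop: fetch and dispatch domain-specific instructions."""
    x = feature_map
    for ins in program:
        if ins.op == "CONV":
            # Stand-in for a convolution engine: a uniform box filter.
            k = ins.kernel
            kernel = np.ones((k, k)) / (k * k)
            h, w = x.shape[0] - k + 1, x.shape[1] - k + 1
            out = np.zeros((h, w))
            for i in range(h):
                for j in range(w):
                    out[i, j] = np.sum(x[i:i + k, j:j + k] * kernel)
            x = out
        elif ins.op == "POOL":
            # Stand-in for a pooling engine: strided subsampling.
            x = x[::ins.stride, ::ins.stride]
    return x


# Changing the CNN = recompiling the instruction stream; the engine is unchanged.
program = [Instr("CONV", kernel=3), Instr("POOL", stride=2), Instr("CONV", kernel=3)]
print(run_overlay(program, np.random.rand(32, 32)).shape)
```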
Total citations
[Citations-by-year chart, 2019–2024]