Authors
Adrien Prost-Boucle, Alban Bourge, Frédéric Pétrot
Publication date
2018/12/12
Journal
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Volume
11
Issue
3
Pages
1-24
Publisher
ACM
Description
Although performing inference with artificial neural networks (ANNs) was until quite recently considered essentially compute-intensive, the emergence of deep neural networks, coupled with the evolution of integration technology, has turned inference into a memory-bound problem. Given this observation, many recent works have focused on minimizing memory accesses, either by enforcing and exploiting sparsity in the weights or by representing activations and weights with only a few bits, so that ANN inference can run on embedded devices. In this work, we detail an architecture dedicated to inference using ternary {−1, 0, 1} weights and activations. The architecture is configurable at design time to offer a choice of throughput vs. power trade-offs. It is also generic in the sense that it uses information drawn from the target technologies (memory geometries and cost, number of available cuts, etc …
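To illustrate the ternary scheme the abstract describes, here is a minimal Python sketch, not taken from the paper: the threshold-based quantizer, the 0.05 threshold, and the function names `ternarize` and `ternary_dot` are assumptions for illustration. It shows how real-valued weights can be mapped to {−1, 0, 1} and why a dot product over ternary operands reduces to additions and subtractions, the property that makes such inference cheap in hardware.

```python
import numpy as np

def ternarize(w, threshold=0.05):
    """Map real-valued weights to {-1, 0, +1}.

    Hypothetical threshold rule for illustration: values whose
    magnitude is below `threshold` become 0 (yielding sparsity),
    the rest keep only their sign. The paper's exact quantizer
    may differ.
    """
    t = np.zeros_like(w, dtype=np.int8)
    t[w > threshold] = 1
    t[w < -threshold] = -1
    return t

def ternary_dot(x_t, w_t):
    """Dot product of ternary activations and weights.

    Every operand is -1, 0 or +1, so each product is a sign flip
    or a skip: the multiply-accumulate degenerates into pure
    additions and subtractions.
    """
    acc = 0
    for x, w in zip(x_t, w_t):
        if w == 1:
            acc += x
        elif w == -1:
            acc -= x
        # w == 0: this weight contributes nothing
    return acc

# Usage: quantize random weights/activations, evaluate one neuron.
rng = np.random.default_rng(0)
w_t = ternarize(rng.normal(scale=0.1, size=8))
x_t = ternarize(rng.normal(scale=0.1, size=8))
print(w_t, x_t, ternary_dot(x_t, w_t))
```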
Total citations
2018: 1, 2019: 3, 2020: 9, 2021: 5, 2022: 3, 2023: 5, 2024: 3 (citations per year)