Authors
Ji Lin, Yongming Rao, Jiwen Lu, Jie Zhou
Publication date
2017
Journal
Advances in neural information processing systems
Volume
30
Description
In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes the deep neural network dynamically at runtime. Unlike existing neural pruning methods, which produce a fixed pruned model for deployment, our method preserves the full ability of the original network and conducts pruning adaptively according to the input image and current feature maps. The pruning is performed in a bottom-up, layer-by-layer manner, which we model as a Markov decision process and train with reinforcement learning. The agent judges the importance of each convolutional kernel and conducts channel-wise pruning conditioned on different samples, pruning the network more when the image is easier for the task. Since the ability of the network is fully preserved, the balance point is easily adjustable according to the available resources. Our method can be applied to off-the-shelf network structures and reaches a better tradeoff between speed and accuracy, especially at large pruning rates.
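The core idea, channel-wise pruning decided per input at runtime, can be sketched as follows. This is a minimal illustration, not the paper's method: where RNP uses an RL agent trained as part of a Markov decision process to choose which channels to keep at each layer, this sketch substitutes a simple activation-magnitude heuristic as a stand-in for the agent's decision, and the `keep_ratio` knob plays the role of the adjustable speed/accuracy balance point.

```python
import numpy as np

def runtime_channel_prune(feature_map, keep_ratio):
    """Runtime channel-wise pruning sketch (hypothetical heuristic).

    Scores each channel of a (C, H, W) feature map by its mean activation
    magnitude and zeroes out the lowest-scoring channels, keeping
    ceil(keep_ratio * C) of them. In RNP this keep/prune decision is made
    per sample and per layer by a learned RL agent; the magnitude score
    here merely stands in for that agent.
    """
    c = feature_map.shape[0]
    k = max(1, int(np.ceil(keep_ratio * c)))
    scores = np.abs(feature_map).mean(axis=(1, 2))   # per-channel importance
    keep = np.argsort(scores)[-k:]                   # indices of top-k channels
    mask = np.zeros(c, dtype=bool)
    mask[keep] = True
    pruned = feature_map * mask[:, None, None]       # zero out pruned channels
    return pruned, mask

# "Easier" inputs would get a smaller keep_ratio, trading accuracy for speed.
x = np.random.randn(8, 4, 4)
pruned, mask = runtime_channel_prune(x, keep_ratio=0.5)
```

In a real deployment the zeroed channels would simply not be computed, which is where the speedup comes from; masking after the fact, as above, only demonstrates the selection logic.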
Total citations
2018: 24, 2019: 72, 2020: 102, 2021: 110, 2022: 120, 2023: 89, 2024: 51
Scholar articles
J Lin, Y Rao, J Lu, J Zhou - Advances in neural information processing systems, 2017