Authors
Ji Lin, Yongming Rao, Jiwen Lu, Jie Zhou
Publication date
2017
Journal
Advances in neural information processing systems
Volume
30
Description
In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes the deep neural network dynamically at runtime. Unlike existing neural pruning methods, which produce a fixed pruned model for deployment, our method preserves the full ability of the original network and conducts pruning adaptively according to the input image and current feature maps. The pruning is performed in a bottom-up, layer-by-layer manner, which we model as a Markov decision process and train with reinforcement learning. The agent judges the importance of each convolutional kernel and conducts channel-wise pruning conditioned on different samples, pruning the network more when the image is easier for the task. Since the ability of the network is fully preserved, the balance point is easily adjustable according to the available resources. Our method can be applied to off-the-shelf network structures and reaches a better tradeoff between speed and accuracy, especially at large pruning rates.
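The core idea, channel-wise pruning decided per input at runtime, can be sketched as follows. This is a minimal illustration, not the paper's method: where RNP uses an RL agent trained as part of a Markov decision process to choose which channels to keep at each layer, this sketch substitutes a simple activation-magnitude heuristic as a stand-in for the agent's decision, and the `keep_ratio` knob plays the role of the adjustable speed/accuracy balance point.

```python
import numpy as np

def runtime_channel_prune(feature_map, keep_ratio):
    """Runtime channel-wise pruning sketch (hypothetical heuristic).

    Scores each channel of a (C, H, W) feature map by its mean activation
    magnitude and zeroes out the lowest-scoring channels, keeping
    ceil(keep_ratio * C) of them. In RNP this keep/prune decision is made
    per sample and per layer by a learned RL agent; the magnitude score
    here merely stands in for that agent.
    """
    c = feature_map.shape[0]
    k = max(1, int(np.ceil(keep_ratio * c)))
    scores = np.abs(feature_map).mean(axis=(1, 2))   # per-channel importance
    keep = np.argsort(scores)[-k:]                   # indices of top-k channels
    mask = np.zeros(c, dtype=bool)
    mask[keep] = True
    pruned = feature_map * mask[:, None, None]       # zero out pruned channels
    return pruned, mask

# "Easier" inputs would get a smaller keep_ratio, trading accuracy for speed.
x = np.random.randn(8, 4, 4)
pruned, mask = runtime_channel_prune(x, keep_ratio=0.5)
```

In a real deployment the zeroed channels would simply not be computed, which is where the speedup comes from; masking after the fact, as above, only demonstrates the selection logic.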
Total citations
2018: 24, 2019: 72, 2020: 102, 2021: 110, 2022: 120, 2023: 89, 2024: 51
Scholar articles
J Lin, Y Rao, J Lu, J Zhou - Advances in neural information processing systems, 2017