-
Publication No.: US20220164671A1
Publication Date: 2022-05-26
Application No.: US17533082
Filing Date: 2021-11-22
Applicant: MONTAGE TECHNOLOGY CO., LTD.
Inventor: Zhen DONG , Yuanfei NIE , Huan FENG
IPC: G06N3/08
Abstract: A method for compressing a neural network includes: obtaining a neural network including a plurality of parameters to be compressed; dividing the parameters into J blocks; compressing a jth block with Kj compression ratios to generate Kj operation branches; obtaining Kj weighting factors; replacing the jth block with the Kj operation branches weighted by the Kj weighting factors to generate a replacement neural network; performing forward propagation on the replacement neural network, a weighted sum operation being performed on the Kj operation results generated by the Kj operation branches with the Kj weighting factors and the result of the weighted sum operation being used as an output of the jth block; performing backward propagation on the replacement neural network, updated values of the Kj weighting factors being calculated based on a model loss; and determining the operation branch corresponding to the maximum value of the updated values of the Kj weighting factors as a compressed jth block.
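The weighted-branch search described in the abstract can be sketched as follows. This is a minimal NumPy sketch, not the patented implementation: the function names and the softmax normalization of the weighting factors are assumptions added for illustration.

```python
import numpy as np

def mix_branches(branch_outputs, weighting_factors):
    """Forward pass for one replaced block: softmax-normalize the Kj
    weighting factors (assumed normalization) and return the weighted
    sum of the Kj operation-branch outputs."""
    w = np.exp(weighting_factors - np.max(weighting_factors))
    w = w / w.sum()
    # Weighted sum of the Kj operation results is the block's output.
    mixed = sum(wk * out for wk, out in zip(w, branch_outputs))
    return mixed, w

def select_branch(updated_factors):
    """After backward propagation updates the factors, keep the branch
    whose updated weighting factor is largest."""
    return int(np.argmax(updated_factors))
```

With two branches and equal factors, the mixed output is the average of the branch outputs; after training, `select_branch` picks the dominant branch as the compressed block.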
-
Publication No.: US20220164665A1
Publication Date: 2022-05-26
Application No.: US17530486
Filing Date: 2021-11-19
Applicant: MONTAGE TECHNOLOGY CO., LTD.
Inventor: Zhen DONG , Yuanfei NIE , Huan FENG
Abstract: A method for compressing a neural network includes: obtaining a neural network including J operation layers; compressing a jth operation layer with Kj compression ratios to generate Kj operation branches; obtaining Kj weighting factors; replacing the jth operation layer with the Kj operation branches weighted by the Kj weighting factors to generate a replacement neural network; performing forward propagation to the replacement neural network, a weighted sum operation being performed on Kj operation results generated by the Kj operation branches with the Kj weighting factors and a result of the weighted sum operation being used as an output of the jth operation layer; performing backward propagation to the replacement neural network, updated values of the Kj weighting factors being calculated based on a model loss; and determining an operation branch corresponding to the maximum value of the updated values of the Kj weighting factors as a compressed jth operation layer.
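One way to generate the Kj operation branches for a layer is to compress its weights at each candidate ratio. The sketch below uses magnitude pruning as the compression method; this specific choice, and the helper's name, are illustrative assumptions (the patent covers compression with Kj ratios generally).

```python
import numpy as np

def make_branches(weights, ratios):
    """Build one candidate branch per compression ratio by zeroing the
    smallest-magnitude weights (assumed compression scheme)."""
    branches = []
    for r in ratios:
        n_keep = max(1, int(round(weights.size * (1.0 - r))))
        # Threshold at the n_keep-th largest absolute value.
        thresh = np.sort(np.abs(weights).ravel())[::-1][n_keep - 1]
        branches.append(np.where(np.abs(weights) >= thresh, weights, 0.0))
    return branches
```

Each branch then runs in parallel during forward propagation, and the weighted sum of their outputs replaces the original layer's output.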
-
Publication No.: US20210287092A1
Publication Date: 2021-09-16
Application No.: US17107973
Filing Date: 2020-12-01
Applicant: MONTAGE TECHNOLOGY CO., LTD.
Inventor: Yuanfei NIE , Zhen DONG , Huan FENG
Abstract: The present application discloses a method and a device for pruning one or more convolutional layers in a neural network. The method includes: obtaining one target convolutional layer from the one or more convolutional layers in the neural network, the target convolutional layer including C filters, each filter including K convolution kernels, and each convolution kernel including M rows and N columns of weight values, where C, K, M and N are positive integers greater than or equal to one; determining a number P of weight values to be pruned for each convolution kernel of the target convolutional layer based on the number of weight values M×N in the convolution kernel and a target compression ratio, where P is a positive integer smaller than M×N; and setting the P weight values with the smallest absolute values in each convolution kernel of the target convolutional layer to zero to form a pruned convolutional layer.
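The per-kernel pruning rule in the abstract can be sketched directly: for a layer of shape (C, K, M, N), compute P from M×N and the target compression ratio, then zero the P smallest-magnitude weights in every M×N kernel. This is a minimal NumPy sketch under those assumptions, not the patented device's implementation.

```python
import numpy as np

def prune_conv_layer(layer, target_ratio):
    """layer: weights of shape (C, K, M, N).
    Zero the P smallest-absolute weights in each M x N kernel,
    with P = round(M * N * target_ratio)."""
    C, K, M, N = layer.shape
    P = int(round(M * N * target_ratio))
    if not 0 < P < M * N:
        raise ValueError("P must satisfy 0 < P < M*N")
    pruned = layer.copy()
    flat = pruned.reshape(C * K, M * N)  # one row per kernel
    # Indices of the P smallest-magnitude weights in each kernel.
    idx = np.argsort(np.abs(flat), axis=1)[:, :P]
    flat[np.arange(flat.shape[0])[:, None], idx] = 0.0
    return pruned
```

For a 2×2 kernel and a 0.5 target ratio, P = 2, so the two smallest-magnitude weights of each kernel are zeroed while the two largest survive.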
-