METHOD AND DEVICE FOR COMPRESSING NEURAL NETWORK

    Publication No.: US20220164671A1

    Publication Date: 2022-05-26

    Application No.: US17533082

    Application Date: 2021-11-22

    Abstract: A method for compressing a neural network includes: obtaining a neural network including a plurality of parameters to be compressed; dividing the parameters into J blocks; compressing a jth block with Kj compression ratios to generate Kj operation branches; obtaining Kj weighting factors; replacing the jth block with the Kj operation branches weighted by the Kj weighting factors to generate a replacement neural network; performing forward propagation on the replacement neural network, a weighted sum operation being performed on the Kj operation results generated by the Kj operation branches with the Kj weighting factors, and the result of the weighted sum operation being used as the output of the jth block; performing backward propagation on the replacement neural network, updated values of the Kj weighting factors being calculated based on a model loss; and determining the operation branch corresponding to the maximum of the updated values of the Kj weighting factors as the compressed jth block.
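The branch-replacement and selection steps described in the abstract can be sketched in a few lines of numpy. This is a minimal illustration under stated assumptions, not the patented implementation: the branch functions, the softmax normalization of the Kj weighting factors, and the function names are all hypothetical, introduced only for demonstration.

```python
import numpy as np

def forward_replacement(x, branches, alphas):
    """Forward pass of a replacement block: a weighted sum of the Kj
    operation-branch outputs, with weights derived from the Kj weighting
    factors (softmax normalization is an assumption, not from the source)."""
    w = np.exp(alphas) / np.exp(alphas).sum()
    return sum(wk * b(x) for wk, b in zip(w, branches))

def pick_branch(branches, alphas):
    """After training, keep only the branch whose weighting factor is
    largest; it becomes the compressed jth block."""
    return branches[int(np.argmax(alphas))]
```

Selecting the argmax branch collapses the multi-branch replacement structure back into a single compressed block, so the deployed network carries no weighting-factor overhead.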

    METHOD AND DEVICE FOR COMPRESSING NEURAL NETWORK

    Publication No.: US20220164665A1

    Publication Date: 2022-05-26

    Application No.: US17530486

    Application Date: 2021-11-19

    Abstract: A method for compressing a neural network includes: obtaining a neural network including J operation layers; compressing a jth operation layer with Kj compression ratios to generate Kj operation branches; obtaining Kj weighting factors; replacing the jth operation layer with the Kj operation branches weighted by the Kj weighting factors to generate a replacement neural network; performing forward propagation on the replacement neural network, a weighted sum operation being performed on the Kj operation results generated by the Kj operation branches with the Kj weighting factors, and the result of the weighted sum operation being used as the output of the jth operation layer; performing backward propagation on the replacement neural network, updated values of the Kj weighting factors being calculated based on a model loss; and determining the operation branch corresponding to the maximum of the updated values of the Kj weighting factors as the compressed jth operation layer.
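The backward-propagation step, in which the Kj weighting factors are updated from the model loss, can be sketched as follows. This is a hedged illustration only: the squared-error loss, the softmax normalization, the learning rate, and the finite-difference gradient (standing in for a framework's automatic differentiation) are all assumptions not taken from the abstract.

```python
import numpy as np

def update_factors(x, target, branches, alphas, lr=0.1):
    """One gradient step on the Kj weighting factors, driven by the model
    loss (assumed squared error); finite differences stand in for backward
    propagation through an autograd framework."""
    def loss(a):
        w = np.exp(a) / np.exp(a).sum()  # assumed softmax-normalized weights
        y = sum(wk * b(x) for wk, b in zip(w, branches))
        return (y - target) ** 2
    eps = 1e-6
    base = loss(alphas)
    grad = np.array([(loss(alphas + eps * np.eye(len(alphas))[k]) - base) / eps
                     for k in range(len(alphas))])
    return alphas - lr * grad  # updated values of the weighting factors
```

Repeating this step drives the factor of the branch that best reduces the loss upward, so the later argmax selection picks that branch as the compressed layer.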

    METHOD AND DEVICE FOR PRUNING CONVOLUTIONAL LAYER IN NEURAL NETWORK

    Publication No.: US20210287092A1

    Publication Date: 2021-09-16

    Application No.: US17107973

    Application Date: 2020-12-01

    Abstract: The present application discloses a method and a device for pruning one or more convolutional layers in a neural network. The method includes: obtaining a target convolutional layer from the one or more convolutional layers in the neural network, the target convolutional layer including C filters, each filter including K convolution kernels, and each convolution kernel including M rows and N columns of weight values, where C, K, M and N are positive integers; determining a number P of weight values to be pruned for each convolution kernel of the target convolutional layer based on the number of weight values M×N in the kernel and a target compression ratio, where P is a positive integer smaller than M×N; and setting the P weight values with the smallest absolute values in each convolution kernel of the target convolutional layer to zero to form a pruned convolutional layer.
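The per-kernel magnitude pruning described in the abstract can be sketched with numpy. This is a minimal sketch under assumptions: the function name is hypothetical, and deriving P as the floor of M×N times the target ratio is one plausible reading of "based on the number of weight values M×N and a target compression ratio", not the patent's exact rule.

```python
import numpy as np

def prune_conv_layer(weights, target_ratio):
    """Given conv-layer weights of shape (C, K, M, N), zero the P smallest-
    magnitude weights in each MxN kernel.  P = floor(M*N * target_ratio)
    is an assumed derivation of P from the target compression ratio."""
    C, K, M, N = weights.shape
    P = int(M * N * target_ratio)          # weights to prune per kernel
    flat = weights.copy().reshape(C * K, M * N)
    # indices of the P smallest-|w| entries in each kernel
    idx = np.argsort(np.abs(flat), axis=1)[:, :P]
    np.put_along_axis(flat, idx, 0.0, axis=1)
    return flat.reshape(C, K, M, N)
```

Because pruning is decided independently per kernel, every kernel in the layer ends up with exactly P zeroed weights, which gives the layer a uniform compression ratio.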
