Lossless compression of neural network weights
Abstract:
A system and a method provide compression and decompression of the weights of a layer of a neural network. For compression, the values of the weights are pruned, and the weights of the layer are configured as a tensor having a tensor size of H×W×C, in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor. The tensor is formatted into at least one block of values. Each block is encoded independently of the other blocks of the tensor using at least one lossless compression mode. For decompression, each block is decoded independently of the other blocks using at least one decompression mode corresponding to the at least one compression mode used to compress the block, and the decoded blocks are deformatted into a tensor having the size of H×W×C.
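The pipeline described in the abstract (prune, format into blocks, encode each block independently, decode, deformat) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not name the compression modes, the block size, or the pruning rule, so everything specific here is an assumption made for illustration. The sketch assumes signed 8-bit weights supplied as a flat list in row-major H×W×C order, magnitude-threshold pruning, fixed-size byte blocks, and two hypothetical modes (raw copy and zlib), with a one-byte mode tag in front of each block.

```python
import zlib

# Hypothetical mode identifiers; the abstract does not specify the modes.
MODE_RAW, MODE_ZLIB = 0, 1


def prune(weights, threshold):
    """Zero out weights whose magnitude is below the threshold (assumed rule)."""
    return [0 if abs(w) < threshold else w for w in weights]


def format_blocks(weights, block_size):
    """Pack signed 8-bit weights into bytes and split them into blocks."""
    data = bytes(w & 0xFF for w in weights)
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def encode_block(block):
    """Encode one block on its own, keeping whichever mode is smaller."""
    packed = zlib.compress(block, 9)
    if len(packed) < len(block):
        return bytes([MODE_ZLIB]) + packed
    return bytes([MODE_RAW]) + block


def decode_block(encoded):
    """Decode one block using the mode recorded in its first byte."""
    mode, payload = encoded[0], encoded[1:]
    if mode == MODE_ZLIB:
        return zlib.decompress(payload)
    if mode == MODE_RAW:
        return payload
    raise ValueError(f"unknown mode {mode}")


def deformat(blocks, shape):
    """Join decoded blocks and restore signed values for an H x W x C tensor."""
    h, w, c = shape
    data = b"".join(blocks)
    if len(data) != h * w * c:
        raise ValueError("decoded size does not match H*W*C")
    return [b - 256 if b > 127 else b for b in data]


def compress_layer(weights, threshold=2, block_size=64):
    """Prune, format into blocks, and encode every block independently."""
    pruned = prune(weights, threshold)
    return [encode_block(b) for b in format_blocks(pruned, block_size)]


def decompress_layer(encoded_blocks, shape):
    """Decode every block independently and deformat into the tensor."""
    return deformat([decode_block(e) for e in encoded_blocks], shape)


if __name__ == "__main__":
    shape = (4, 4, 8)  # H, W, C
    weights = [((i * 7) % 11) - 5 for i in range(4 * 4 * 8)]
    encoded = compress_layer(weights)
    restored = decompress_layer(encoded, shape)
    # The coding step is lossless with respect to the pruned weights.
    print(restored == prune(weights, 2))
```

Because each encoded block carries its own mode tag and shares no state with its neighbours, any single block can be decoded without touching the others, which is the property the abstract emphasizes. Pruning itself discards information; only the block coding that follows it is lossless.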