Variable-sized symbol entropy-based data compression
Abstract:
Methods, devices and systems for data compression and decompression are disclosed. A collection of data is obtained. The collection of data is sampled to establish, for a plurality of different symbol sizes, relative frequencies of symbols of the respective sizes in the collection of data. A code is generated to contain variable-length codewords by entropy encoding sampled symbols in the collection of data based on a metric which reflects the relative frequencies of the sampled symbols as well as their sizes. Symbols in the collection of data are compressed into compressed representations using the generated code, wherein the compressed representation of a symbol comprises a codeword which represents the symbol as well as metadata for decompressing the compressed representation.
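The flow described in the abstract (sample symbols of several sizes, weight them by frequency and size, entropy-encode, then emit codeword plus metadata) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the disclosure: the candidate sizes of 1, 2 and 4 bytes, the metric `count * size`, the use of Huffman coding as the entropy coder, the greedy longest-match parse, and the use of the symbol size as the per-symbol metadata. The function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def sample_symbols(data: bytes, sizes=(1, 2, 4)) -> Counter:
    """Count symbols of each candidate size (in bytes) at size-aligned offsets."""
    freq = Counter()
    for size in sizes:
        for i in range(0, len(data) - size + 1, size):
            freq[data[i:i + size]] += 1
    return freq


def build_code(freq: Counter) -> dict:
    """Huffman code over all sampled symbols.

    Assumed metric: weight = count * size, so a frequent large symbol
    receives a shorter codeword than an equally frequent small one.
    """
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(n * len(sym), next(tie), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: a single symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next(tie), merged))
    return heap[0][2]


def compress(data: bytes, code: dict) -> list:
    """Greedy longest-match parse; each unit is (size metadata, codeword)."""
    sizes = sorted({len(s) for s in code}, reverse=True)
    out, i = [], 0
    while i < len(data):
        for size in sizes:
            sym = data[i:i + size]
            if len(sym) == size and sym in code:
                out.append((size, code[sym]))
                i += size
                break
        else:
            raise ValueError(f"no codeword covers offset {i}")
    return out


def decompress(units: list, code: dict) -> bytes:
    """Invert compress() by looking up each (size, codeword) pair."""
    decode = {(len(sym), cw): sym for sym, cw in code.items()}
    return b"".join(decode[(size, cw)] for size, cw in units)


if __name__ == "__main__":
    data = b"abcdabcd" * 16 + b"xy"
    code = build_code(sample_symbols(data))
    units = compress(data, code)
    bits = sum(len(cw) for _, cw in units)
    print(f"{len(data) * 8} input bits, {bits} codeword bits (metadata excluded)")
    print(decompress(units, code) == data)
```

In this sketch the size tag is redundant, because one prefix code covers all symbol sizes. It is kept only to show a compressed representation that carries a codeword together with decompression metadata; a real design might instead use the metadata to select among separate per-size code tables.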