System and method for compression and decompression of text data
Abstract:
The present disclosure relates to system(s) and method(s) for compression and decompression of Unicode characters. The system is configured to maintain a set of character tables and a cluster table in a memory. Each character table is configured to store a set of Unicode characters corresponding to one character class from a set of character classes, wherein each Unicode character in the character table is assigned a shortened bit representation. Furthermore, the cluster table may be configured to maintain a set of cluster types and a cluster identifier corresponding to each cluster type. The system is configured to compress text data in Unicode format using the set of character tables and the cluster table by identifying the clusters in each word and replacing each cluster with its cluster identifier followed by the shortened bit representations of the characters in that cluster.
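The scheme described in the abstract can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the disclosure: the two character classes (`C` for consonants, `V` for vowel signs), the four Devanagari characters in each table, the fixed-width codes, the two cluster types `C` and `CV`, their one-bit identifiers, and the function names `compress_word` and `decompress_word`.

```python
# Hypothetical character tables: one table per character class.
# The classes, characters, and code widths are illustrative assumptions.
CHAR_TABLES = {
    "C": "\u0915\u0916\u0917\u0918",  # consonants KA, KHA, GA, GHA
    "V": "\u093e\u093f\u0940\u0941",  # vowel signs AA, I, II, U
}

# Hypothetical cluster table: cluster type -> cluster identifier.
# A cluster type is written as the sequence of class names it contains.
CLUSTER_TABLE = {"C": "0", "CV": "1"}


def _width(cls):
    """Bits needed for a fixed-width index into one character table."""
    return max(1, (len(CHAR_TABLES[cls]) - 1).bit_length())


# Shortened bit representation of each character within its own table.
ENCODE = {
    cls: {ch: format(i, f"0{_width(cls)}b") for i, ch in enumerate(chars)}
    for cls, chars in CHAR_TABLES.items()
}
CLASS_OF = {ch: cls for cls, chars in CHAR_TABLES.items() for ch in chars}
ID_TO_TYPE = {ident: ctype for ctype, ident in CLUSTER_TABLE.items()}
TYPES_LONGEST_FIRST = sorted(CLUSTER_TABLE, key=len, reverse=True)


def compress_word(word):
    """Replace each cluster with its identifier plus per-character codes."""
    # Raises KeyError for characters that are in none of the tables.
    classes = "".join(CLASS_OF[ch] for ch in word)
    bits, pos = [], 0
    while pos < len(word):
        for ctype in TYPES_LONGEST_FIRST:  # greedy: longest cluster first
            if classes.startswith(ctype, pos):
                bits.append(CLUSTER_TABLE[ctype])
                chunk = word[pos:pos + len(ctype)]
                bits.extend(ENCODE[cls][ch] for cls, ch in zip(ctype, chunk))
                pos += len(ctype)
                break
        else:
            raise ValueError(f"no cluster type matches at position {pos}")
    return "".join(bits)


def decompress_word(bits):
    """Invert compress_word by reading an identifier, then its characters."""
    out, pos = [], 0
    while pos < len(bits):
        ctype = ID_TO_TYPE[bits[pos]]  # identifiers are one bit in this sketch
        pos += 1
        for cls in ctype:
            w = _width(cls)
            out.append(CHAR_TABLES[cls][int(bits[pos:pos + w], 2)])
            pos += w
    return "".join(out)


word = "\u0915\u093f\u0917"  # KA + vowel sign I, then GA
packed = compress_word(word)
restored = decompress_word(packed)
```

In this toy configuration the three-character word is split into a `CV` cluster and a `C` cluster and packs into 8 bits, because each character needs only a 2-bit index into the table that its cluster type already implies. A real system would need tables covering the full script, a fallback for characters outside the tables, and a way to mark word boundaries, none of which is specified in the abstract.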