Invention Grant
- Patent Title: Columnar database compression
- Application No.: US14942772
- Application Date: 2015-11-16
- Publication No.: US10169361B2
- Publication Date: 2019-01-01
- Inventor: Sami Abed, Pedro M Barbas, Austin Clifford, Konrad Emanowicz
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Edward J. Wixted, III
- Main IPC: G06F17/30
- IPC: G06F17/30; G06N99/00

Abstract:
Disclosed is a computer-implemented method of compressing data in a columnar database comprising at least one column partitioned into a plurality of partitions including at least one empty partition and a plurality of filled partitions each comprising data entries associated with a set of parameters having parameter values relevant to the recurrence frequency of the data entry in the partition, the data entries being compressed in accordance with a compression dictionary based on the respective recurrence frequencies of the data entries in the filled partition. The computer-implemented method comprises receiving forecasted parameter values for the set of parameters for an expected set of data entries to be stored in an empty partition of the column; predicting a recurrence frequency of the data entries in the expected set using the forecasted parameter values by evaluating the respective compression dictionaries of the filled partitions with a machine learning algorithm; generating a predictive compression dictionary for the expected set of data entries based on the predicted recurrence frequency of the data entries in the expected set; receiving the expected set of data entries; and compressing at least part of the received expected set of data entries using the predictive compression dictionary. A computer program product and a computer system for implementing such a method are also disclosed.
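
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name a specific machine learning algorithm, so a distance-weighted nearest-neighbour average over the filled partitions stands in for it here. The function names, the two-element parameter vectors, and the sample entries are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import dist


def predict_frequencies(filled, forecast, k=2):
    """Predict relative entry frequencies for an empty partition.

    filled:   list of (parameter_vector, Counter of entry counts), one per
              filled partition (hypothetical representation).
    forecast: forecasted parameter vector for the expected data entries.
    Assumption: a k-nearest-neighbour average replaces the unspecified
    machine learning algorithm.
    """
    nearest = sorted(filled, key=lambda p: dist(p[0], forecast))[:k]
    predicted = Counter()
    for params, freqs in nearest:
        weight = 1.0 / (1.0 + dist(params, forecast))  # closer partitions count more
        total = sum(freqs.values())
        for entry, count in freqs.items():
            predicted[entry] += weight * count / total
    return predicted


def build_dictionary(predicted):
    """Assign the smallest codes to the entries predicted to recur most often."""
    ranked = sorted(predicted, key=lambda e: (-predicted[e], e))
    return {entry: code for code, entry in enumerate(ranked)}


def compress(entries, dictionary):
    """Encode entries found in the dictionary; keep the rest uncompressed."""
    return [("code", dictionary[e]) if e in dictionary else ("raw", e)
            for e in entries]


# Hypothetical filled partitions: (parameter values, observed entry counts).
filled = [
    ((1.0, 0.0), Counter({"umbrella": 8, "sunscreen": 2})),
    ((0.0, 1.0), Counter({"sunscreen": 9, "umbrella": 1})),
    ((0.9, 0.1), Counter({"umbrella": 7, "raincoat": 3})),
]
forecast = (1.0, 0.0)  # forecasted parameter values for the empty partition

dictionary = build_dictionary(predict_frequencies(filled, forecast))
compressed = compress(["umbrella", "raincoat", "hat"], dictionary)
```

Entries that the predictive dictionary does not cover, such as "hat" above, are left uncompressed, which mirrors the abstract's wording that at least part of the received set is compressed with the predictive dictionary.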
Public/Granted literature
- US20170139947A1 COLUMNAR DATABASE COMPRESSION Public/Granted day: 2017-05-18