Partition-based index management in Hadoop-like data stores
Abstract:
A method for processing a dataset in a partitioned distributed storage system, in which data is stored in a base table and an index is stored in an index table, may include receiving base table metadata and index table metadata from the partitioned distributed storage system, where each set of metadata includes partition information for its respective table. The method may further include partitioning the dataset into a set of base-delta files according to the base table metadata, and generating a set of index-delta files corresponding to the base-delta files according to the index table metadata. The method may additionally include updating the partitioned distributed storage system with the set of base-delta files and the set of index-delta files, where a first update of the base table is synchronous with a second update of the index table.
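The partitioning step described above can be illustrated with a minimal in-memory sketch. It is not the claimed implementation: it assumes that the partition information in each table's metadata is a sorted list of split keys, that rows are (row key, column dictionary) pairs, and that an index entry is an (indexed value, row key) pair. The names `partition_for`, `build_deltas`, `base_splits`, `index_splits`, and `indexed_column` are hypothetical, and lists stand in for the delta files.

```python
from bisect import bisect_right
from collections import defaultdict


def partition_for(key, split_points):
    """Return the number of the partition whose key range contains `key`.

    Assumption: `split_points` is a sorted list of keys, and partition i
    holds keys k with split_points[i-1] <= k < split_points[i].
    """
    return bisect_right(split_points, key)


def build_deltas(rows, base_splits, index_splits, indexed_column):
    """Group incoming rows into per-partition base and index deltas.

    rows: iterable of (row_key, columns) pairs, where columns is a dict.
    Returns two dicts mapping partition number to a sorted list of entries.
    """
    base_deltas = defaultdict(list)
    index_deltas = defaultdict(list)
    for row_key, columns in rows:
        # Base entry goes to the base-table partition that owns the row key.
        base_deltas[partition_for(row_key, base_splits)].append((row_key, columns))
        # Matching index entry goes to the index-table partition that owns
        # the indexed value (hypothetical layout: value first, then row key).
        value = columns[indexed_column]
        index_deltas[partition_for(value, index_splits)].append((value, row_key))
    # Sort each delta so it could be written out as an ordered file.
    base_sorted = {p: sorted(v, key=lambda e: e[0]) for p, v in base_deltas.items()}
    index_sorted = {p: sorted(v) for p, v in index_deltas.items()}
    return base_sorted, index_sorted
```

Because both sets of deltas are produced in one pass over the same rows, every base entry has a corresponding index entry before either table is touched, which is what allows the two updates to be applied together rather than one after the other. How the storage system loads the files is outside the scope of this sketch.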