Providing global metadata in a cluster computing environment
Abstract:
First and second data partitions that include first and second portions of data, respectively, from a first of a plurality of data streams are received. At a first storage location of a distributed storage system, a first set of metadata for the first of the plurality of data streams is stored. First and second digests are created for the first and second data partitions, respectively, wherein each of the digests includes a data structure that points to the first storage location. The data partitions, including the digests, are transmitted to one or more nodes of a cluster computing environment, wherein the one or more nodes are capable of accessing the first storage location via the data structure that points to the first storage location, and wherein accessing the first storage location provides processing information. The data partitions are processed using the processing information.
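The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `Digest`, `Partition`, `InMemoryStore`, `publish_stream`, and `process_on_node` are hypothetical, the in-memory dictionary merely stands in for a distributed storage system, and the metadata fields `field` and `scale` are invented to give the "processing information" something concrete to do.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Digest:
    """Small header carried by a partition; holds only a pointer to shared metadata."""
    metadata_key: str  # hypothetical key identifying the storage location


@dataclass
class Partition:
    stream_id: str
    rows: List[Dict[str, Any]]
    digest: Digest


class InMemoryStore:
    """Stand-in for a distributed storage system (assumed key-value access)."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, value: Dict[str, Any]) -> None:
        self._data[key] = value

    def get(self, key: str) -> Dict[str, Any]:
        return self._data[key]


def publish_stream(store: InMemoryStore, stream_id: str,
                   metadata: Dict[str, Any],
                   chunks: List[List[Dict[str, Any]]]) -> List[Partition]:
    """Store the stream's metadata once, then tag every partition with a digest."""
    key = f"metadata/{stream_id}"
    store.put(key, metadata)
    digest = Digest(metadata_key=key)
    return [Partition(stream_id, rows, digest) for rows in chunks]


def process_on_node(store: InMemoryStore, partition: Partition) -> List[Any]:
    """Simulate a cluster node: follow the digest pointer, then process the rows."""
    info = store.get(partition.digest.metadata_key)
    field, scale = info["field"], info["scale"]  # illustrative processing information
    return [row[field] * scale for row in partition.rows]


store = InMemoryStore()
partitions = publish_stream(
    store,
    "stream-1",
    {"field": "value", "scale": 10},
    [[{"value": 1}, {"value": 2}], [{"value": 3}]],
)
results = [process_on_node(store, p) for p in partitions]
```

In this sketch the metadata is written once per data stream, and each partition carries only the pointer, so a node that receives any partition can look up the same shared metadata without it being copied into every partition.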