Abstract:
Methods, apparatuses, and storage devices associated with cache- and/or socket-sensitive breadth-first iterative traversal of a graph by parallel threads are described. A vertices visited array (VIS) may be employed to track which graph vertices have been visited. VIS may be partitioned into VIS sub-arrays, taking into consideration the size of the last-level cache (LLC), to reduce the likelihood of evictions. Potential boundary vertices arrays (PBV) may be employed to store, for the vertices being visited in the current iteration, the potential boundary vertices for the next iteration. The number of PBV generated for each thread may take into consideration the number of sockets over which the processor cores employed are distributed. The threads may be load balanced; further, data locality awareness may be considered to reduce inter-socket communication, and/or lock-free and atomic-free update operations may be employed.
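The interplay of VIS sub-arrays and PBV can be illustrated with a minimal, single-threaded sketch. It is not the claimed implementation: the function name `bfs_levels`, the parameter `vis_part_size` (standing in for a sub-array size chosen to fit the LLC), and the use of one PBV bin per VIS sub-array are assumptions made for illustration, and thread scheduling, socket placement, and load balancing are omitted.

```python
def bfs_levels(adj, source, vis_part_size):
    """Level-synchronous BFS; returns the depth of each vertex (-1 if unreachable).

    adj is an adjacency list; vis_part_size is a hypothetical number of
    vertices per VIS sub-array.
    """
    n = len(adj)
    num_parts = (n + vis_part_size - 1) // vis_part_size
    vis = bytearray(n)            # VIS: 1 once a vertex has been visited
    depth = [-1] * n
    vis[source] = 1
    depth[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        # Phase 1: collect potential boundary vertices into PBV bins,
        # one bin per VIS sub-array, without reading or writing VIS.
        pbv = [[] for _ in range(num_parts)]
        for u in frontier:
            for v in adj[u]:
                pbv[v // vis_part_size].append(v)
        # Phase 2: drain one bin at a time, so each pass touches only
        # a single VIS sub-array.
        next_frontier = []
        for part in pbv:
            for v in part:
                if not vis[v]:
                    vis[v] = 1
                    depth[v] = level
                    next_frontier.append(v)
        frontier = next_frontier
    return depth
```

In a parallel variant, each thread would fill its own PBV bins in phase 1, and each VIS sub-array would be drained by a single thread in phase 2, which is one way to avoid locks and atomic updates on VIS.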
Abstract:
Embodiments of techniques and systems for parallel processing of B+ trees are described. A parallel B+ tree processing module with partitioning and redistribution may include a set of threads executing a batch of B+ tree operations on a B+ tree in parallel. The batch of operations may be partitioned among the threads. Next, a search may be performed to determine which leaf nodes in the B+ tree are to be affected by which operations. Then, the threads may redistribute operations among themselves such that no two threads operate on the same leaf node. The threads may then perform B+ tree operations on the leaf nodes of the B+ tree in parallel. Subsequent modifications to nodes in the B+ tree may similarly be redistributed and performed in parallel as the threads work up the tree.
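The partition, search, and redistribute steps can be sketched sequentially as follows. This is an illustration under stated assumptions, not the described module: the leaf level is modeled as a sorted list of lower-bound keys (`leaf_lower_bounds`), the function name `redistribute` is hypothetical, and the ownership rule (leaf index modulo thread count) is an arbitrary choice made only to guarantee that each leaf has exactly one owner.

```python
from bisect import bisect_right


def redistribute(ops, leaf_lower_bounds, num_threads):
    """Return one dict per thread mapping leaf index -> operations on that leaf.

    ops is a list of (kind, key) pairs; leaf_lower_bounds is a sorted list
    holding the smallest key each leaf can contain (a stand-in for a real
    tree search).
    """
    # Step 1: partition the batch into contiguous chunks, one per thread.
    chunk = max(1, (len(ops) + num_threads - 1) // num_threads)
    chunks = [ops[i * chunk:(i + 1) * chunk] for i in range(num_threads)]

    # Step 2: each thread searches for the leaf affected by each of its operations.
    found = []
    for thread_ops in chunks:
        located = []
        for kind, key in thread_ops:
            leaf = max(0, bisect_right(leaf_lower_bounds, key) - 1)
            located.append((leaf, (kind, key)))
        found.append(located)

    # Step 3: redistribute so that every leaf is handled by exactly one thread.
    # Assumed ownership rule: leaf index modulo the number of threads.
    per_thread = [dict() for _ in range(num_threads)]
    for located in found:
        for leaf, op in located:
            owner = leaf % num_threads
            per_thread[owner].setdefault(leaf, []).append(op)
    return per_thread
```

Because chunks are visited in batch order, the operations queued for each leaf keep their original relative order, which matters when an insert and a delete in the same batch target the same key.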