-
Publication number: WO2017019794A1
Publication date: 2017-02-02
Application number: PCT/US2016/044309
Filing date: 2016-07-27
Applicant: SAS INSTITUTE INC.
Inventor: BOWMAN, Brian Payton , KRUEGER, Steven E. , KNIGHT, Richard Todd , HO, Chih-Wei
CPC classification number: G06N5/04 , G06F3/0607 , G06F3/0643 , G06F3/0644 , G06F3/067 , G06N5/02
Abstract: An apparatus includes a processor component caused to: retrieve metadata of organization of data within a data set, and map data of organization of data blocks within a data file; receive indications of which node devices are available to perform a processing task with a data set portion; and in response to the data set including partitioned data, compare the quantities of available node devices and of the node devices last involved in storing the data set. In response to a match, for each map data map entry: retrieve a hashed identifier for a data sub-block, and a size for each of the data sub-blocks within the corresponding data block; divide the hashed identifier by the quantity of available node devices; compare the modulo value to a designation assigned to each of the available node devices; and provide a pointer to the available node device assigned the matching designation.
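The modulo-based assignment described in the abstract can be illustrated with a minimal sketch. It covers only the case where the quantity of available node devices matches the quantity last involved in storing the data set. The function name `assign_sub_blocks`, the use of integer designations 0..N-1, the representation of a pointer as a byte offset, and the assumption that sub-blocks are laid out contiguously are all hypothetical choices made for illustration and are not taken from the patent.

```python
def assign_sub_blocks(entries, node_count):
    """Map each data sub-block to a node by hashed identifier modulo node count.

    entries: list of (hashed_id, size) tuples, one per data sub-block
             (hypothetical stand-in for the map entries in the abstract).
    node_count: quantity of available node devices; each node is assumed to
                carry an integer designation in range(node_count).
    Returns a dict: designation -> list of (offset, size) pointers.
    """
    assignments = {designation: [] for designation in range(node_count)}
    offset = 0  # assumed: sub-blocks are stored back to back
    for hashed_id, size in entries:
        # The modulo value selects the node with the matching designation.
        designation = hashed_id % node_count
        assignments[designation].append((offset, size))
        offset += size
    return assignments


if __name__ == "__main__":
    example = [(10, 100), (7, 50), (3, 25)]
    for node, pointers in assign_sub_blocks(example, 3).items():
        print(node, pointers)
```

Because the designation depends only on the hashed identifier and the node count, every sub-block with the same identifier lands on the same node as long as the node count is unchanged, which is presumably why the abstract conditions this path on the quantities matching.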
-
Publication number: WO2021101798A
Publication date: 2021-05-27
Application number: PCT/US2020/060379
Filing date: 2020-11-13
Applicant: SAS INSTITUTE INC. [US]/[US]
Inventor: BOWMAN, Brian Payton , KEENER, Gordon Lyle , KNIGHT, Richard Todd
IPC: G06F12/02 , G06F16/22 , G06F16/182 , G06F16/13
Abstract: An apparatus includes a processor to: instantiate collection threads, data buffers of a queue, and aggregation threads; within each collection thread, assemble a row group from a subset of the multiple rows, reorganize the data values from row-wise to columnar organization, and store the row group within a data buffer of the queue; operate the buffer queue as a FIFO buffer; within each aggregation thread, retrieve multiple row groups from multiple data buffers of the queue, assemble a data set part from the multiple row groups, and transmit the data set part to storage device(s) via a network; and in response to each instance of retrieval of a row group from a data buffer of the buffer queue for use within an aggregation thread, analyze a level of availability of at least storage space within the node device to determine whether to dynamically adjust the quantity of data buffers of the buffer queue.
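The collection/aggregation pipeline in the abstract can be sketched roughly as follows, using Python's standard `threading` and `queue` modules. This is an illustrative sketch under stated assumptions, not the patented implementation: the names `run_pipeline` and `to_columnar`, the parameter defaults, and the sentinel-based shutdown are hypothetical; transmission to storage over a network is replaced by appending to an in-memory list; and the dynamic adjustment of the quantity of data buffers is omitted (the queue has a fixed size).

```python
import queue
import threading


def to_columnar(rows):
    """Reorganize a list of row tuples into a list of columns."""
    return [list(column) for column in zip(*rows)]


def run_pipeline(rows, group_size=2, groups_per_part=2,
                 n_collectors=2, n_aggregators=1, queue_slots=4):
    buffer_queue = queue.Queue(maxsize=queue_slots)  # bounded FIFO buffer queue
    parts = []                    # stand-in for parts sent to storage devices
    parts_lock = threading.Lock()
    sentinel = None               # assumed shutdown marker, not from the abstract

    # Split the rows into subsets; each subset becomes one row group.
    chunks = [rows[i:i + group_size] for i in range(0, len(rows), group_size)]

    def collect(worker_index):
        # Each collection thread handles every n_collectors-th chunk.
        for chunk in chunks[worker_index::n_collectors]:
            buffer_queue.put(to_columnar(chunk))

    def aggregate():
        part = []
        while True:
            row_group = buffer_queue.get()
            if row_group is sentinel:
                break
            part.append(row_group)
            if len(part) == groups_per_part:
                with parts_lock:
                    parts.append(part)
                part = []
        if part:  # flush a final, partially filled data set part
            with parts_lock:
                parts.append(part)

    collectors = [threading.Thread(target=collect, args=(i,))
                  for i in range(n_collectors)]
    aggregators = [threading.Thread(target=aggregate)
                   for _ in range(n_aggregators)]
    for thread in collectors + aggregators:
        thread.start()
    for thread in collectors:
        thread.join()
    for _ in aggregators:
        buffer_queue.put(sentinel)  # one shutdown marker per aggregation thread
    for thread in aggregators:
        thread.join()
    return parts


if __name__ == "__main__":
    data = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
    for part in run_pipeline(data):
        print(part)
```

The bounded queue makes collection threads block when all buffers are full, which is the back-pressure point where the abstract's availability analysis would decide whether to add or remove buffers. The order in which row groups reach a part depends on thread scheduling.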
-
Publication number: EP4062289A1
Publication date: 2022-09-28
Application number: EP20891182.6
Filing date: 2020-11-13
Applicant: SAS Institute Inc.
Inventor: BOWMAN, Brian Payton , KEENER, Gordon Lyle , KNIGHT, Richard Todd
IPC: G06F12/02 , G06F16/22 , G06F16/182 , G06F16/13
-