-
Publication number: US11263175B2
Publication date: 2022-03-01
Application number: US17039584
Application date: 2020-09-30
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman, Gordon Lyle Keener, Richard Todd Knight
Abstract: An apparatus includes a processor to: within each reading thread, retrieve a data set part and corresponding part metadata from storage device(s), analyze row group metadata for each row group within the data set part to identify candidate row group(s) meeting specified criteria, and store the candidate row group(s) and corresponding row group metadata within a data buffer of a queue; operate the queue as a FIFO buffer; within each provision thread, retrieve one of multiple row groups and corresponding metadata from within the data buffer, use information in the metadata to identify rows meeting the criteria, and provide those rows to the requesting device or an application; and in response to each instance of storage of a data set part within a data buffer of the queue, analyze the availability of storage space and/or of processing resources to determine whether to dynamically adjust the quantity of reading threads.
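The two-stage pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the in-memory `PARTS` data, the `min`/`max` row group metadata, the range criterion, and the names `make_row_group` and `select_rows` are all assumptions introduced here, and the dynamic adjustment of the reading thread count is omitted.

```python
import queue
import threading


def make_row_group(rows):
    # Hypothetical row group: the rows plus min/max metadata describing them.
    return {"rows": rows, "meta": {"min": min(rows), "max": max(rows)}}


# Hypothetical stand-in for storage devices: each data set part holds row groups.
PARTS = [
    [make_row_group([1, 2, 3]), make_row_group([10, 11, 12])],
    [make_row_group([20, 25, 30]), make_row_group([4, 5, 6])],
]


def select_rows(parts, low, high, n_readers=2, n_providers=2):
    """Return the sorted rows r with low <= r <= high."""
    fifo = queue.Queue(maxsize=4)  # bounded FIFO buffer between the two stages
    pending_parts = queue.Queue()
    for part in parts:
        pending_parts.put(part)
    results = []
    results_lock = threading.Lock()

    def reader():
        # Reading thread: fetch a part, keep only candidate row groups.
        while True:
            try:
                part = pending_parts.get_nowait()
            except queue.Empty:
                return
            for group in part:
                meta = group["meta"]
                # Metadata-only test: does the group's range overlap [low, high]?
                if meta["max"] >= low and meta["min"] <= high:
                    fifo.put(group)

    def provider():
        # Provision thread: take a row group and pick out the matching rows.
        while True:
            group = fifo.get()
            if group is None:  # sentinel: no more row groups will arrive
                return
            matched = [r for r in group["rows"] if low <= r <= high]
            with results_lock:
                results.extend(matched)

    readers = [threading.Thread(target=reader) for _ in range(n_readers)]
    providers = [threading.Thread(target=provider) for _ in range(n_providers)]
    for t in readers + providers:
        t.start()
    for t in readers:
        t.join()
    for _ in providers:
        fifo.put(None)  # one sentinel per provision thread
    for t in providers:
        t.join()
    return sorted(results)
```

For example, `select_rows(PARTS, 5, 11)` skips the row groups whose metadata rules them out and filters rows only in the two remaining groups.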
-
Publication number: US09811524B2
Publication date: 2017-11-07
Application number: US15220182
Application date: 2016-07-26
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman, Steven E. Krueger, Richard Todd Knight, Chih-Wei Ho
CPC classification number: G06F17/30097, G06F3/0604, G06F3/061, G06F3/064, G06F3/0643, G06F3/0644, G06F3/067, G06F9/5072, G06F9/5077, G06F12/0292, G06F17/302, G06F17/30312, G06F17/30584, G06F2212/1016, G06F2212/1056, G06F2212/154, G06F2212/262, G06F2212/263
Abstract: An apparatus comprising a processor component to: provide, to a control device, an indication of availability to perform a processing task with one or more data set portions as a node device; perform a processing task specified by the control device with the one or more data set portions; and request a pointer to a location at which to store the one or more data set portions as a data block within a data file. In response to the data set including partitioned data, for each data set portion, include a data sub-block size of the data set portion and a hashed identifier derived from a partition label of a partition in the request; receive, from the control device, the requested pointer to the location; and store each data set portion as a data sub-block within the data block starting at the location within the data file.
-
Publication number: US20210026806A1
Publication date: 2021-01-28
Application number: US17039584
Application date: 2020-09-30
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman, Gordon Lyle Keener, Richard Todd Knight
Abstract: An apparatus includes a processor to: within each reading thread, retrieve a data set part and corresponding part metadata from storage device(s), analyze row group metadata for each row group within the data set part to identify candidate row group(s) meeting specified criteria, and store the candidate row group(s) and corresponding row group metadata within a data buffer of a queue; operate the queue as a FIFO buffer; within each provision thread, retrieve one of multiple row groups and corresponding metadata from within the data buffer, use information in the metadata to identify rows meeting the criteria, and provide those rows to the requesting device or an application; and in response to each instance of storage of a data set part within a data buffer of the queue, analyze the availability of storage space and/or of processing resources to determine whether to dynamically adjust the quantity of reading threads.
-
Publication number: US09619148B2
Publication date: 2017-04-11
Application number: US15220034
Application date: 2016-07-26
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman, Steven E. Krueger, Richard Todd Knight, Chih-Wei Ho
CPC classification number: G06F17/30097, G06F3/0604, G06F3/061, G06F3/064, G06F3/0643, G06F3/0644, G06F3/067, G06F9/5072, G06F9/5077, G06F12/0292, G06F17/302, G06F17/30312, G06F17/30584, G06F2212/1016, G06F2212/1056, G06F2212/154, G06F2212/262, G06F2212/263
Abstract: An apparatus includes a processor component caused to: retrieve metadata of organization of data within a data set, and map data of organization of data blocks within a data file; receive indications of which node devices are available to perform a processing task with a data set portion; and in response to the data set including partitioned data, compare the quantities of available node devices and of the node devices last involved in storing the data set. In response to a match, for each map data map entry: retrieve a hashed identifier for a data sub-block, and a size for each of the data sub-blocks within the corresponding data block; divide the hashed identifier by the quantity of available node devices; compare the modulo value to a designation assigned to each of the available node devices; and provide a pointer to the available node device assigned the matching designation.
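The modulo-based distribution step in this abstract can be illustrated with a short sketch. The data layout is an assumption made for the example: map entries are plain dictionaries, node designations are the integers 0 to n-1, data sub-blocks are stored contiguously from offset 0, and a "pointer" is an `(offset, size)` pair. The name `assign_sub_blocks` is hypothetical.

```python
def assign_sub_blocks(map_entries, node_designations):
    """Map each node name to the (offset, size) pointers it should receive.

    map_entries: one dict per data block, each with a "sub_blocks" list of
                 {"hashed_id": int, "size": int} sub-entries (assumed layout).
    node_designations: dict of node name -> designation in 0..n-1 (assumed).
    """
    node_count = len(node_designations)
    node_by_designation = {d: name for name, d in node_designations.items()}
    assignments = {name: [] for name in node_designations}
    offset = 0  # assumes sub-blocks are laid out back to back in the data file
    for entry in map_entries:
        for sub in entry["sub_blocks"]:
            # Divide the hashed identifier by the node count and keep the
            # remainder; the node with the matching designation gets the pointer.
            designation = sub["hashed_id"] % node_count
            node = node_by_designation[designation]
            assignments[node].append((offset, sub["size"]))
            offset += sub["size"]
    return assignments


example_nodes = {"node-a": 0, "node-b": 1, "node-c": 2}
example_map = [
    {"sub_blocks": [{"hashed_id": 7, "size": 100}, {"hashed_id": 9, "size": 50}]},
    {"sub_blocks": [{"hashed_id": 7, "size": 30}]},
]
example_result = assign_sub_blocks(example_map, example_nodes)
```

Because sub-blocks derived from the same partition label share a hashed identifier, both sub-blocks with identifier 7 land on the same node in this example, which is the point of hashing the partition label.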
-
Publication number: US10185721B2
Publication date: 2019-01-22
Application number: US15804570
Application date: 2017-11-06
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman, Steven E. Krueger, Richard Todd Knight, Chih-Wei Ho
Abstract: An apparatus includes a processor component caused to: retrieve metadata of organization of data within a data set, and map data of organization of data blocks within a data file; receive indications of which node devices are available to perform a processing task with a data set portion; and in response to the data set including partitioned data, compare the quantities of available node devices and of the node devices last involved in storing the data set. In response to a match, for each map data map entry: retrieve a hashed identifier for a data sub-block, and a size for each of the data sub-blocks within the corresponding data block; divide the hashed identifier by the quantity of available node devices; compare the modulo value to a designation assigned to each of the available node devices; and provide a pointer to the available node device assigned the matching designation.
-
Publication number: US09703789B2
Publication date: 2017-07-11
Application number: US15220192
Application date: 2016-07-26
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman, Steven E. Krueger, Richard Todd Knight, Chih-Wei Ho
CPC classification number: G06F17/30097, G06F3/0604, G06F3/061, G06F3/064, G06F3/0643, G06F3/0644, G06F3/067, G06F9/5072, G06F9/5077, G06F12/0292, G06F17/302, G06F17/30312, G06F17/30584, G06F2212/1016, G06F2212/1056, G06F2212/154, G06F2212/262, G06F2212/263
Abstract: An apparatus comprising a processor component to: receive metadata of data organization within a data set; receive indications of which node devices will be storing the data set as multiple data blocks within a data file; and receive, from each node device, a pointer request to a location within the data file for storing a data set portion as a data block. In response to the data set including partitioned data, for each request for a pointer: determine the location within the data file; generate a map data map entry for the data block; generate therein a sub-block count of data sub-blocks within the data block; generate therein a sub-entry for each data sub-block including size and a hashed identifier derived from a partition label; and provide a pointer to the node device. In response to successful storage of all data blocks, store the map data in the data file.
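The map-building behavior described in this abstract might look something like the sketch below. Everything concrete here is an assumption for illustration: the `MapBuilder` class, the dictionary layout of a map entry, the use of SHA-256 to derive a hashed identifier from a partition label, and the sequential allocation of offsets within the data file. The abstract does not specify any of these details.

```python
import hashlib


def hashed_identifier(partition_label):
    # Hypothetical hash choice: first 8 bytes of SHA-256 as an unsigned integer.
    digest = hashlib.sha256(partition_label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class MapBuilder:
    """Control-device side: hands out pointers and records map entries."""

    def __init__(self):
        self.next_offset = 0  # assumed: data blocks are allocated sequentially
        self.entries = []

    def request_pointer(self, sub_blocks):
        """Handle one pointer request from a node device.

        sub_blocks: list of (hashed_id, size) pairs, one per data sub-block.
        Returns the offset at which the node should store its data block.
        """
        pointer = self.next_offset
        self.entries.append({
            "offset": pointer,
            "sub_block_count": len(sub_blocks),
            "sub_entries": [
                {"hashed_id": hashed_id, "size": size}
                for hashed_id, size in sub_blocks
            ],
        })
        self.next_offset += sum(size for _, size in sub_blocks)
        return pointer


builder = MapBuilder()
first_pointer = builder.request_pointer(
    [(hashed_identifier("east"), 100), (hashed_identifier("west"), 50)]
)
second_pointer = builder.request_pointer([(hashed_identifier("east"), 30)])
```

Once every node reports successful storage, `builder.entries` would be the map data written into the data file; that final step is left out of the sketch.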