-
Publication No.: US11263175B2
Publication Date: 2022-03-01
Application No.: US17039584
Application Date: 2020-09-30
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Gordon Lyle Keener , Richard Todd Knight
Abstract: An apparatus includes a processor to: within each reading thread, retrieve a data set part and corresponding part metadata from storage device(s), analyze row group metadata for each row group within the data set part to identify candidate row group(s) meeting specified criteria, and store the candidate row group(s) and corresponding row group metadata within a data buffer of a queue; operate the queue as a FIFO buffer; within each provision thread, retrieve one of multiple row groups and corresponding metadata from within the data buffer, use information in the metadata to identify rows meeting the criteria, and provide those rows to the requesting device or an application; and in response to each instance of storage of a data set part within a data buffer of the queue, analyze the availability of storage space and/or of processing resources to determine whether to dynamically adjust the quantity of reading threads.
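The reading/provision pipeline in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the row group layout, the min/max metadata, and the names `make_row_group`, `read_part`, `provide_rows`, and `scan` are all assumptions made for the example, and the dynamic adjustment of the number of reading threads is omitted.

```python
import queue
import threading

# Hypothetical layout: a data set part is a list of row groups, and each row
# group carries min/max metadata for a single numeric column.
def make_row_group(values):
    return {"meta": {"min": min(values), "max": max(values)}, "rows": list(values)}

def read_part(part, lo, hi, buf):
    """Reading thread: enqueue only row groups whose [min, max] overlaps [lo, hi]."""
    for rg in part:
        if rg["meta"]["max"] >= lo and rg["meta"]["min"] <= hi:
            buf.put(rg)

def provide_rows(buf, lo, hi, out):
    """Provision thread: dequeue row groups in FIFO order and keep matching rows."""
    while True:
        rg = buf.get()
        if rg is None:  # sentinel: all reading threads have finished
            break
        out.extend(v for v in rg["rows"] if lo <= v <= hi)

def scan(parts, lo, hi):
    buf = queue.Queue(maxsize=4)  # bounded FIFO queue standing in for the data buffers
    out = []
    provider = threading.Thread(target=provide_rows, args=(buf, lo, hi, out))
    provider.start()
    readers = [threading.Thread(target=read_part, args=(p, lo, hi, buf)) for p in parts]
    for t in readers:
        t.start()
    for t in readers:
        t.join()
    buf.put(None)
    provider.join()
    return sorted(out)  # sorted because reader interleaving is not deterministic
```

The metadata check lets a reading thread skip a whole row group without touching its rows; only the provision thread inspects individual rows.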
-
Publication No.: US10983957B2
Publication Date: 2021-04-20
Application No.: US17037652
Application Date: 2020-09-29
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman
IPC: G06F12/00 , G06F16/13 , G06F21/60 , G06F16/22 , G06F16/27 , G06F3/06 , G06F9/50 , G06F12/02 , G06F16/182 , G06F13/00 , G06F13/28
Abstract: An apparatus includes a processor to: instantiate collection threads, data buffers of a queue, and aggregation threads; within each collection thread, assemble a row group from a subset of the multiple rows, reorganize the data values row-wise to columnar organization, and store the row group within a data buffer of the queue; operate the buffer queue as a FIFO buffer; within each aggregation thread, retrieve multiple row groups from multiple data buffers of the queue, assemble a data set part from the multiple row groups, transmit, to storage device(s) via a network, the data set part; and in response to each instance of retrieval of a row group from a data buffer of the buffer queue for use within an aggregation thread, analyze a level of availability of at least storage space within the node device to determine whether to dynamically adjust the quantity of data buffers of the buffer queue.
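The row-wise to columnar reorganization performed by each collection thread can be illustrated with a short sketch. The dictionary-based row representation and the function names `to_columnar` and `row_groups` are assumptions for illustration only; threading, the buffer queue, and transmission to storage are left out.

```python
def to_columnar(rows):
    """Turn a list of row dicts into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}

def row_groups(rows, group_size):
    """Split the rows into consecutive subsets and reorganize each one column-wise."""
    for start in range(0, len(rows), group_size):
        yield to_columnar(rows[start:start + group_size])
```

For example, `list(row_groups(rows, 2))` over three rows yields two row groups, the second holding the single leftover row.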
-
Publication No.: US10789207B2
Publication Date: 2020-09-29
Application No.: US16233644
Application Date: 2018-12-27
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Jeff Ira Cleveland, III
IPC: G06F12/00 , G06F16/13 , G06F3/06 , G06F21/60 , G06F16/22 , G06F16/27 , G06F9/50 , G06F12/02 , G06F16/182 , G06F13/00 , G06F13/28
Abstract: An apparatus includes a processor component to: transmit node device identifiers to multiple node devices to define an ordering thereamong; following block exchanges redistributing the subsets among a reduced number of node devices, receive sizes of blocks or sub-blocks of data within each subset from the reduced number of node devices; based on the received sizes, generate map data organized to define an ordering among the blocks stemming from the ordering among the multiple node devices; determine whether the total size of the map data and metadata, together, exceeds a minimum size for data transmissions to storage device(s); and in response to the total size exceeding the minimum size, form the map data and metadata into segment(s) that each fit the minimum size and a maximum size, and transmit the segment(s) at least partially in parallel with other segments of the blocks transmitted by the reduced number of node devices.
-
Publication No.: US20210026805A1
Publication Date: 2021-01-28
Application No.: US17039314
Application Date: 2020-09-30
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Gordon Lyle Keener
Abstract: An apparatus includes a processor to: instantiate data buffers of a queue, reading threads, and provision threads; within each reading thread, use an identifier provided in a data buffer of the queue to retrieve the corresponding data set part and part metadata from storage device(s), and store both within the data buffer; operate the queue as a FIFO buffer; within each provision thread, retrieve a row group from among multiple row groups and corresponding metadata from within the data buffer, use information in the metadata to decompress at least one column, and provide the data values of the row group to the requesting device or an application routine; and in response to each instance of storage of a data set part within a data buffer of the queue, analyze the availability of storage space and/or of processing resources to determine whether to dynamically adjust the quantity of reading threads.
-
Publication No.: US10185721B2
Publication Date: 2019-01-22
Application No.: US15804570
Application Date: 2017-11-06
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Steven E. Krueger , Richard Todd Knight , Chih-Wei Ho
Abstract: An apparatus includes a processor component caused to: retrieve metadata of organization of data within a data set, and map data of organization of data blocks within a data file; receive indications of which node devices are available to perform a processing task with a data set portion; and in response to the data set including partitioned data, compare the quantities of available node devices and of the node devices last involved in storing the data set. In response to a match, for each map data map entry: retrieve a hashed identifier for a data sub-block, and a size for each of the data sub-blocks within the corresponding data block; divide the hashed identifier by the quantity of available node devices; compare the modulo value to a designation assigned to each of the available node devices; and provide a pointer to the available node device assigned the matching designation.
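The modulo-based assignment described in this abstract can be sketched as below. It assumes, purely for illustration, that node designations are the integers 0 to n-1, that sub-blocks are laid out contiguously in the data file, and that a pointer is an (offset, size) pair; `assign_sub_blocks` is a hypothetical name.

```python
def assign_sub_blocks(map_entries, node_count):
    """Route each data sub-block to the node whose designation equals
    hashed_id modulo node_count.

    map_entries: one list per data block, each holding (hashed_id, size) pairs.
    Returns {designation: [(offset, size), ...]}.
    """
    pointers = {designation: [] for designation in range(node_count)}
    offset = 0
    for entry in map_entries:
        for hashed_id, size in entry:
            designation = hashed_id % node_count
            pointers[designation].append((offset, size))
            offset += size  # assumed contiguous layout within the data file
    return pointers
```

Because the assignment depends only on the hashed identifier, all sub-blocks that share a partition label land on the same node device.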
-
Publication No.: US09977807B1
Publication Date: 2018-05-22
Application No.: US15838211
Application Date: 2017-12-11
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Gordon Lyle Keener , Steven E. Krueger
CPC classification number: G06F17/30327 , G06F7/02 , G06F7/08 , G06F7/20 , G06F9/5027 , G06F9/5072 , G06F17/30289 , G06F17/30321 , G06F17/3033 , G06F17/30345 , G06F17/30371 , G06F17/30424 , G06F17/30575 , G06F17/30725 , G06F17/30949 , G06F17/30961
Abstract: An apparatus including a processor to: receive search criteria including a data value; in response to receiving the search criteria, generate a hash value from the data value of the search criteria, and for each data cell of a super cell, compare the hash value to hash values within a hash values vector in the corresponding cell index to determine whether the data cell includes at least one data record meeting the search criteria, and in response to determining that the data cell includes at least one such data record, search the data records to identify one or more data records meeting the search criteria; and in response to identifying at least one data record within at least one data cell of the super cell meeting the search criteria, provide results data indicative of the super cell including at least one such data record.
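The hash-vector lookup in this abstract can be illustrated with the following sketch. The CRC-32 hash is only a stand-in for whatever hash function is actually used, and the record layout and the names `hash_value`, `build_cell_index`, and `search_super_cell` are assumptions for the example.

```python
import zlib

def hash_value(value):
    # Stand-in hash for illustration; the real hash function is not specified here.
    return zlib.crc32(str(value).encode("utf-8"))

def build_cell_index(cell, field):
    """Hash values vector for one data cell: the distinct hashes of one field."""
    return sorted({hash_value(record[field]) for record in cell})

def search_super_cell(super_cell, indexes, field, value):
    """Scan only the data cells whose index contains the hash of the search value."""
    target = hash_value(value)
    hits = []
    for cell, index in zip(super_cell, indexes):
        if target not in index:
            continue  # no record in this cell can match, so skip the full scan
        # A hash match may be a collision, so confirm against the records themselves.
        hits.extend(record for record in cell if record[field] == value)
    return hits
```

The index comparison is a cheap pre-filter; the record scan that follows is what guarantees correct results when two values share a hash.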
-
Publication No.: US09946719B2
Publication Date: 2018-04-17
Application No.: US15694674
Application Date: 2017-09-01
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Mark Kuebler Gass, III
CPC classification number: G06F17/30097 , G06F3/0604 , G06F3/061 , G06F3/064 , G06F3/0643 , G06F3/0644 , G06F3/067 , G06F9/5072 , G06F9/5077 , G06F12/0292 , G06F17/302 , G06F17/30312 , G06F17/30584 , G06F21/602 , G06F2212/1016 , G06F2212/1056 , G06F2212/154 , G06F2212/262 , G06F2212/263
Abstract: An apparatus includes a processor component of a first node device caused to receive data block encryption data and an indication of size of an encrypted data block distributed to the first node device for decryption, and in response to the data set being of encrypted data: receive an indication of the quantity of sub-blocks within the encrypted data block, and a hashed identifier for each data sub-block; use the data block encryption data to decrypt the encrypted data block to regenerate data set portions from the data sub-blocks; analyze the hashed identifier of each data sub-block to determine whether all data set portions are distributed to the first node device for processing; and in response to a determination that at least one data set portion is to be distributed to a second node device for processing, transmit the at least one data set portion to the second node device.
-
Publication No.: US20180011866A1
Publication Date: 2018-01-11
Application No.: US15694662
Application Date: 2017-09-01
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Mark Kuebler Gass, III
CPC classification number: G06F17/30097 , G06F3/0604 , G06F3/061 , G06F3/064 , G06F3/0643 , G06F3/0644 , G06F3/067 , G06F9/5072 , G06F9/5077 , G06F12/0292 , G06F17/302 , G06F17/30312 , G06F17/30584 , G06F21/602 , G06F2212/1016 , G06F2212/1056 , G06F2212/154 , G06F2212/262 , G06F2212/263
Abstract: An apparatus may include a processor component caused to: generate map entries in map data descriptive of encrypted data blocks within a data file; use first map block encryption data to encrypt a first map extension of the map data; transmit the encrypted first map extension for storage within the data file; store the first map block encryption data within the second map extension; use second map block encryption data to encrypt a second map extension of the map data after storage of the first map block encryption data therein; transmit the encrypted second map extension for storage within the data file; store the second map block encryption data within the map base; use third map block encryption data to encrypt a map base of the map data after storage of the second map block encryption data therein; and transmit the encrypted map base for storage within the data file.
-
Publication No.: US09703789B2
Publication Date: 2017-07-11
Application No.: US15220192
Application Date: 2016-07-26
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Steven E. Krueger , Richard Todd Knight , Chih-Wei Ho
CPC classification number: G06F17/30097 , G06F3/0604 , G06F3/061 , G06F3/064 , G06F3/0643 , G06F3/0644 , G06F3/067 , G06F9/5072 , G06F9/5077 , G06F12/0292 , G06F17/302 , G06F17/30312 , G06F17/30584 , G06F2212/1016 , G06F2212/1056 , G06F2212/154 , G06F2212/262 , G06F2212/263
Abstract: An apparatus comprising a processor component to: receive metadata of data organization within a data set; receive indications of which node devices will be storing the data set as multiple data blocks within a data file; and receive, from each node device, a pointer request to a location within the data file for storing a data set portion as a data block. In response to the data set including partitioned data, for each request for a pointer: determine the location within the data file; generate a map data map entry for the data block; generate therein a sub-block count of data sub-blocks within the data block; generate therein a sub-entry for each data sub-block including size and a hashed identifier derived from a partition label; and provide a pointer to the node device. In response to successful storage of all data blocks, store the map data in the data file.
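The pointer-request handling for partitioned data might look roughly like the sketch below. The class name `MapBuilder`, the dictionary layout of a map entry, and the use of CRC-32 to derive the hashed identifier from a partition label are all illustrative assumptions, not details taken from the patent.

```python
import zlib

class MapBuilder:
    """Hands out file offsets to node devices and records one map entry per data block."""

    def __init__(self):
        self.next_offset = 0
        self.entries = []

    def request_pointer(self, sub_blocks):
        """sub_blocks: list of (partition_label, size_in_bytes) for one data block.

        Returns the offset within the data file at which the block should be stored.
        """
        pointer = self.next_offset
        sub_entries = [
            # CRC-32 is a stand-in for the hash applied to the partition label.
            {"size": size, "hashed_id": zlib.crc32(label.encode("utf-8"))}
            for label, size in sub_blocks
        ]
        self.entries.append(
            {"sub_block_count": len(sub_entries), "sub_entries": sub_entries}
        )
        self.next_offset += sum(size for _, size in sub_blocks)
        return pointer
```

Once every node device reports successful storage, `entries` would be serialized and written to the data file as the map data; that step is not shown.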
-
Publication No.: US20190197021A1
Publication Date: 2019-06-27
Application No.: US16233573
Application Date: 2018-12-27
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Jeff Ira Cleveland, III
CPC classification number: G06F16/137 , G06F3/0604 , G06F3/061 , G06F3/064 , G06F3/0643 , G06F3/0644 , G06F3/067 , G06F9/5072 , G06F9/5077 , G06F12/0292 , G06F16/1827 , G06F16/22 , G06F16/278 , G06F21/602 , G06F2212/1016 , G06F2212/1056 , G06F2212/154 , G06F2212/262 , G06F2212/263 , H05K999/99
Abstract: An apparatus includes a processor component to: transmit node device identifiers to multiple node devices to define an ordering thereamong and among subsets of multiple blocks of data distributed thereamong; receive sizes of the subsets from the multiple node devices; derive block exchanges among the multiple node devices based on the sizes and a minimum size imposed on data transmissions to storage device(s); and transmit a block exchange vector that describes the block exchanges to the multiple node devices, wherein: the subsets remain distributed among a reduced number of the multiple node devices following the block exchanges; at least all but one of the node devices of the reduced number store an amount of the blocks of data exceeding the minimum size; and the block exchanges are all lower-order to higher-order node device transfers, or all higher-order to lower-order node device transfers.
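One simple way to derive exchanges with these properties is a single greedy pass from lower-order to higher-order nodes. This is only a sketch under stated assumptions: the greedy rule, the function name `derive_block_exchanges`, the tuple form of the exchange vector, and treating "meets the minimum" as `>=` are choices made for the example, and the patent may derive exchanges differently.

```python
def derive_block_exchanges(sizes, min_size):
    """Greedy lower-order to higher-order pass over the node ordering.

    sizes[i] is the amount of data held by node i. A node holding less than
    min_size forwards everything to the next node. Returns (exchanges, held),
    where each exchange is (from_node, to_node, amount).
    """
    held = list(sizes)
    exchanges = []
    for node in range(len(held) - 1):
        if 0 < held[node] < min_size:
            exchanges.append((node, node + 1, held[node]))
            held[node + 1] += held[node]
            held[node] = 0
    return exchanges, held
```

With sizes `[3, 4, 10, 2, 1]` and a minimum of 5, nodes 0 and 3 forward their data, leaving `[0, 7, 10, 0, 3]`: every node that still holds data meets the minimum except the last one, and every transfer runs from a lower-order node to a higher-order one.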