Compact NUMA-aware Locks
    51.
    发明申请

    公开(公告)号:US20200097335A1

    公开(公告)日:2020-03-26

    申请号:US16573863

    申请日:2019-09-17

    Abstract: A computer comprising multiple processors and non-uniform memory implements multiple threads that perform a lock operation using a shared lock structure that includes a pointer to a tail of a first-in-first-out (FIFO) queue of threads waiting to acquire the lock. To acquire the lock, a thread allocates and appends a data structure to the FIFO queue. The lock is released by selecting and notifying a waiting thread to which control is transferred, with the thread selected executing on the same processor socket as the thread controlling the lock. A secondary queue of threads is managed for threads deferred during the selection process and maintained within the data structures of the waiting threads such that no memory is required within the lock structure. If no threads executing on the same processor socket are waiting for the lock, entries in the secondary queue are transferred to the FIFO queue preserving FIFO order.

    Ticket Locks with Enhanced Waiting
    52.
    发明申请

    公开(公告)号:US20200097287A1

    公开(公告)日:2020-03-26

    申请号:US16572532

    申请日:2019-09-16

    Abstract: A computer comprising one or more processors and memory may implement multiple threads that perform a lock operation using a data structure comprising an allocation field and a grant field. Upon entry to a lock operation, a thread allocates a ticket by atomically copying a ticket value contained in the allocation field and incrementing the allocation field. The thread compares the allocated ticket to the grant field. If they are unequal, the thread determines a number of waiting threads. If the number is above the threshold, the thread enters a long term wait operation comprising determining a location for long term wait value and waiting on changes to that value. If the number is below the threshold or the long term wait operation is complete, the thread waits for the grant value to equal the ticket to indicate that the lock is allocated.

    System and method for promoting reader groups for lock cohorting

    公开(公告)号:US10585719B2

    公开(公告)日:2020-03-10

    申请号:US16056094

    申请日:2018-08-06

    Abstract: NUMA-aware reader-writer locks may leverage lock cohorting techniques that introduce a synthetic level into the lock hierarchy (e.g., one whose nodes do not correspond to the system topology). The synthetic level may include a global reader lock and a global writer lock. A writer thread may acquire a node-level writer lock, then the global writer lock, and then the top-level lock, after which it may access a critical section protected by the lock. The writer may release the lock (if an upper bound on consecutive writers has been met), or may pass the lock to another writer (on the same node or a different node, according to a fairness policy). A reader may acquire the global reader lock (whether or not node-level reader locks are present), and then the top-level lock. However, readers may only hold these locks long enough to increment reader counts associated with them.

    Systems and Methods for Performing Concurrency Restriction and Throttling over Contended Locks

    公开(公告)号:US20200012538A1

    公开(公告)日:2020-01-09

    申请号:US16570952

    申请日:2019-09-13

    Inventor: David Dice

    Abstract: A concurrency-restricting lock may divide a set of threads waiting to acquire the lock into an active circulating set (ACS) that contends for the lock, and a passive set (PS) that awaits an opportunity to contend for the lock. The lock, which may include multiple constituent lock types, lists, or queues, may be unfair over the short term, but improve throughput of the underlying multithreaded application. Culling and long-term fairness policies may be applied to the lock to move excess threads from the ACS to the PS or promote threads from the PS to the ACS. These policies may constraint the size or distribution of threads in the ACS (which may be NUMA-aware). A waiting policy may avoid aggressive promotion from the PS to the ACS, and a short-term fairness policy may move a thread from the tail of a list or queue to its head.

    Systems and methods for safely subscribing to locks using hardware extensions

    公开(公告)号:US10521277B2

    公开(公告)日:2019-12-31

    申请号:US14736123

    申请日:2015-06-10

    Abstract: Transactional Lock Elision allows hardware transactions to execute unmodified critical sections protected by the same lock concurrently, by subscribing to the lock and verifying that it is available before committing the transaction. A “lazy subscription” optimization, which delays lock subscription, can potentially cause behavior that cannot occur when the critical sections are executed under the lock. Hardware extensions may provide mechanisms to ensure that lazy subscriptions are safe (e.g., that they result in correct behavior). Prior to executing a critical section transactionally, its lock and subscription code may be identified (e.g., by writing their locations to special registers). Prior to committing the transaction, the thread executing the critical section may verify that the correct lock was correctly subscribed to. If not, or if locations identified by the special registers have been modified, the transaction may be aborted. Nested critical sections associated with different lock types may invoke different subscription code.

    Permuted memory access mapping
    56.
    发明授权

    公开(公告)号:US10176109B2

    公开(公告)日:2019-01-08

    申请号:US15493035

    申请日:2017-04-20

    Abstract: When performing non-sequential accesses to large data sets, hot spots may be avoided by permuting the memory locations being accesses to more evenly spread those accesses across the memory and across multiple memory channels. A permutation step may be used when accessing data, such as to improve the distribution of those memory accesses within the system. Instead of accessing one memory address, that address may be permuted so that another memory address is accessed. Non-sequential accesses to an array may be modified such that each index to the array is permuted to another index in the array. Collisions between pre- and post-translation addresses may be prevented and one-to-one mappings may be used. Permutation mechanisms may be implemented in software, hardware, or a combination of both, with or without the knowledge of the process performing the memory accesses.

    System and method for promoting reader groups for lock cohorting

    公开(公告)号:US10042679B2

    公开(公告)日:2018-08-07

    申请号:US15012505

    申请日:2016-02-01

    Abstract: NUMA-aware reader-writer locks may leverage lock cohorting techniques that introduce a synthetic level into the lock hierarchy (e.g., one whose nodes do not correspond to the system topology). The synthetic level may include a global reader lock and a global writer lock. A writer thread may acquire a node-level writer lock, then the global writer lock, and then the top-level lock, after which it may access a critical section protected by the lock. The writer may release the lock (if an upper bound on consecutive writers has been met), or may pass the lock to another writer (on the same node or a different node, according to a fairness policy). A reader may acquire the global reader lock (whether or not node-level reader locks are present), and then the top-level lock. However, readers may only hold these locks long enough to increment reader counts associated with them.

    Hardware extensions for memory reclamation for concurrent data structures

    公开(公告)号:US09785548B2

    公开(公告)日:2017-10-10

    申请号:US14946625

    申请日:2015-11-19

    Abstract: A hardware-assisted mechanism may improve the performance of memory reclamation operations that employ hazard pointers. The mechanism includes hazard lookaside buffers (HLBs), each implemented in hardware and locally accessible to one or more processor cores, and two new instructions. A special store instruction may write entries to local HLBs for pointers that have been or will be dereferenced but were not yet written to a shared hazard table (which requires memory barriers). Each entry may include a hazard pointer and a table address. A special test instruction may signal each HLB to determine whether it contains a particular pointer and, if so, to return a response. If the pointer does not reside in any HLB, the memory reclamation operation may search the hazard table for the pointer. If the pointer is found in an HLB or in the hazard table, the pointed-to memory location or memory block is not reclaimed.

    Transactional execution of native methods

    公开(公告)号:US09740597B2

    公开(公告)日:2017-08-22

    申请号:US14641828

    申请日:2015-03-09

    Abstract: Approaches for more efficiently executing calls to native code from within a managed execution environment are described. The techniques involve attempting to execute a native call, such as a call to a C function from within Java code, using a single hardware transaction. Not only is the native code executed in a hardware transaction, but also various transitional operations needed for transitioning between managed execution mode and native execution mode. If the hardware transaction is successful, at least some of the operations that would normally be performed during transitions between modes may be omitted or simplified. If the hardware transaction is unsuccessful, the native calls may be performed as they normally would, outside of hardware transactions.

    Hardware Extensions for Memory Reclamation for Concurrent Data Structures

    公开(公告)号:US20170147487A1

    公开(公告)日:2017-05-25

    申请号:US14946625

    申请日:2015-11-19

    Abstract: A hardware-assisted mechanism may improve the performance of memory reclamation operations that employ hazard pointers. The mechanism includes hazard lookaside buffers (HLBs), each implemented in hardware and locally accessible to one or more processor cores, and two new instructions. A special store instruction may write entries to local HLBs for pointers that have been or will be dereferenced but were not yet written to a shared hazard table (which requires memory barriers). Each entry may include a hazard pointer and a table address. A special test instruction may signal each HLB to determine whether it contains a particular pointer and, if so, to return a response. If the pointer does not reside in any HLB, the memory reclamation operation may search the hazard table for the pointer. If the pointer is found in an HLB or in the hazard table, the pointed-to memory location or memory block is not reclaimed.

Patent Agency Ranking