Generic concurrency restriction
    11.
    Granted patent

    Publication number: US10565024B2

    Publication date: 2020-02-18

    Application number: US15298090

    Filing date: 2016-10-19

    Abstract: Generic Concurrency Restriction (GCR) may divide a set of threads waiting to acquire a lock into two sets: an active set currently able to contend for the lock, and a passive set waiting for an opportunity to join the active set and contend for the lock. The number of threads in the active set may be limited to a predefined maximum or even a single thread. Generic Concurrency Restriction may be implemented as a wrapper around an existing lock implementation. Generic Concurrency Restriction may, in some embodiments, be unfair (e.g., to some threads) over the short term, but may improve the overall throughput of the underlying multithreaded application via passivation of a portion of the waiting threads.
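
The active/passive split described in this abstract can be illustrated with a small wrapper around an existing lock. The following is a minimal sketch, not the patented implementation: the class name `ConcurrencyRestrictedLock`, the `max_active` parameter, and the use of a counting semaphore to hold back the passive set are all assumptions made for illustration.

```python
import threading


class ConcurrencyRestrictedLock:
    """Wraps an existing lock and lets at most `max_active` threads contend for it.

    Threads beyond that limit block on a semaphore (the passive set) until an
    active thread releases the lock and frees a slot. All names are hypothetical.
    """

    def __init__(self, inner_lock=None, max_active=1):
        self._inner = inner_lock if inner_lock is not None else threading.Lock()
        # Each semaphore permit is one slot in the active set.
        self._active_slots = threading.Semaphore(max_active)

    def acquire(self):
        self._active_slots.acquire()  # wait passively until a slot opens
        self._inner.acquire()         # now contend for the underlying lock

    def release(self):
        self._inner.release()
        self._active_slots.release()  # let one passive thread become active

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False


def run_gcr_demo(num_threads=8, increments=1000):
    """Increment a shared counter from several threads through the wrapper."""
    lock = ConcurrencyRestrictedLock(max_active=2)
    counter = [0]

    def worker():
        for _ in range(increments):
            with lock:
                counter[0] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

Because the semaphore makes no ordering promise, a thread may wait in the passive set longer than a later arrival, which mirrors the short-term unfairness the abstract mentions.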

    Hardware Transactional Memory-Assisted Flat Combining
    12.
    Patent application
    Status: under examination (published)

    Publication number: US20160335117A1

    Publication date: 2016-11-17

    Application number: US15154686

    Filing date: 2016-05-13

    CPC classification number: G06F9/467

    Abstract: An HTM-assisted Combining Framework (HCF) may enable multiple (combiner and non-combiner) threads to access a shared data structure concurrently using hardware transactional memory (HTM). As long as a combiner executes in a hardware transaction and ensures that the lock associated with the data structure is available, it may execute concurrently with other threads operating on the data structure. HCF may include attempting to apply operations to a concurrent data structure utilizing HTM and if the HTM attempt fails, utilizing flat combining within HTM transactions. Publication lists may be used to announce operations to be applied to a concurrent data structure. A combiner thread may select a subset of the operations in the publication list and attempt to apply the selected operations using HTM. If the thread fails in these HTM attempts, it may acquire a lock associated with the data structure and apply the selected operations without HTM.
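
The publication list and combiner roles can be sketched in plain Python. Python has no access to hardware transactional memory, so this sketch covers only the lock-based fallback path (ordinary flat combining); the HTM attempts described in the abstract are omitted. The names `Request`, `FlatCombiningCounter`, and `execute` are hypothetical, and the combiner here takes every pending operation rather than a chosen subset.

```python
import threading


class Request:
    """One announced operation in the publication list."""

    def __init__(self, op, arg):
        self.op = op
        self.arg = arg
        self.result = None
        self.done = threading.Event()


class FlatCombiningCounter:
    """A shared counter whose operations are applied by whichever thread combines."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()      # lock associated with the data structure
        self._pub_lock = threading.Lock()  # protects the publication list itself
        self._publication = []

    def _apply(self, req):
        if req.op == "add":
            self._value += req.arg
        req.result = self._value           # both "add" and "get" report the value
        req.done.set()

    def _combine(self):
        # Take every pending request, then apply the batch while holding _lock.
        with self._pub_lock:
            batch, self._publication = self._publication, []
        for req in batch:
            self._apply(req)

    def execute(self, op, arg=0):
        if op not in ("add", "get"):
            raise ValueError(f"unknown operation: {op}")
        req = Request(op, arg)
        with self._pub_lock:
            self._publication.append(req)  # announce the operation
        while not req.done.is_set():
            if self._lock.acquire(blocking=False):
                try:
                    self._combine()        # this thread is the combiner
                finally:
                    self._lock.release()
            else:
                req.done.wait(timeout=0.001)  # another thread is combining
        return req.result


def run_combining_demo(num_threads=4, adds_per_thread=250):
    counter = FlatCombiningCounter()

    def worker():
        for _ in range(adds_per_thread):
            counter.execute("add", 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.execute("get")
```

A request is either drained by the current combiner before it releases the lock or stays in the list for the next combiner, so every caller eventually sees its own `done` event set.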

    Systems and methods for adaptive integration of hardware and software lock elision techniques
    13.
    Granted patent
    Status: in force

    Publication number: US09183043B2

    Publication date: 2015-11-10

    Application number: US14254758

    Filing date: 2014-04-16

    Abstract: Particular techniques for improving the scalability of concurrent programs (e.g., lock-based applications) may be effective in some environments and for some workloads, but not others. The systems described herein may automatically choose appropriate ones of these techniques to apply when executing lock-based applications at runtime, based on observations of the application in the current environment and with the current workload. In one example, two techniques for improving lock scalability (e.g., transactional lock elision using hardware transactional memory, and optimistic software techniques) may be integrated together. A lightweight runtime library built for this purpose may adapt its approach to managing concurrency by dynamically selecting one or more of these techniques (at different times) during execution of a given application. In this Adaptive Lock Elision approach, the techniques may be selected (based on pluggable policies) at runtime to achieve good performance on different platforms and for different workloads.
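
The idea of a pluggable policy that picks a technique at runtime can be sketched as follows. This is an illustrative assumption, not the library the abstract describes: hardware lock elision is unavailable in Python, so the two choices here are an optimistic, version-validated read and a plain lock acquisition. `FailureRatePolicy`, `AdaptiveCell`, and their parameters are hypothetical names.

```python
import threading
from collections import deque


class FailureRatePolicy:
    """Hypothetical policy: stay optimistic while recent attempts mostly succeed."""

    def __init__(self, max_failure_rate=0.5, window=50):
        self._max_failure_rate = max_failure_rate
        self._outcomes = deque(maxlen=window)  # True = optimistic attempt succeeded
        self._guard = threading.Lock()

    def record(self, succeeded):
        with self._guard:
            self._outcomes.append(bool(succeeded))

    def choose(self):
        with self._guard:
            if not self._outcomes:
                return "optimistic"
            failure_rate = self._outcomes.count(False) / len(self._outcomes)
            if failure_rate <= self._max_failure_rate:
                return "optimistic"
            # Forget the oldest outcome so the optimistic path is retried later.
            self._outcomes.popleft()
            return "lock"


class AdaptiveCell:
    """A single value readable either optimistically or under its lock."""

    def __init__(self, value, policy):
        self._value = value
        self._version = 0  # even = stable, odd = write in progress
        self._lock = threading.Lock()
        self._policy = policy

    def write(self, value):
        with self._lock:
            self._version += 1  # mark write in progress
            self._value = value
            self._version += 1  # mark stable again

    def _try_optimistic_read(self):
        before = self._version
        value = self._value
        after = self._version
        if before % 2 == 0 and before == after:
            return True, value
        return False, None

    def read(self):
        if self._policy.choose() == "optimistic":
            ok, value = self._try_optimistic_read()
            self._policy.record(ok)
            if ok:
                return value
        with self._lock:  # fall back to the lock
            return self._value
```

Swapping in a different policy object changes when the optimistic path is used without touching `AdaptiveCell`, which is the point of keeping the policy pluggable.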

    Systems and Methods for Adaptive Integration of Hardware and Software Lock Elision Techniques
    14.
    Patent application
    Status: in force

    Publication number: US20150026688A1

    Publication date: 2015-01-22

    Application number: US14254758

    Filing date: 2014-04-16

    Abstract: Particular techniques for improving the scalability of concurrent programs (e.g., lock-based applications) may be effective in some environments and for some workloads, but not others. The systems described herein may automatically choose appropriate ones of these techniques to apply when executing lock-based applications at runtime, based on observations of the application in the current environment and with the current workload. In one example, two techniques for improving lock scalability (e.g., transactional lock elision using hardware transactional memory, and optimistic software techniques) may be integrated together. A lightweight runtime library built for this purpose may adapt its approach to managing concurrency by dynamically selecting one or more of these techniques (at different times) during execution of a given application. In this Adaptive Lock Elision approach, the techniques may be selected (based on pluggable policies) at runtime to achieve good performance on different platforms and for different workloads.

    Hardware transactional memory-assisted flat combining

    Publication number: US12217083B2

    Publication date: 2025-02-04

    Application number: US17308502

    Filing date: 2021-05-05

    Abstract: An HTM-assisted Combining Framework (HCF) may enable multiple (combiner and non-combiner) threads to access a shared data structure concurrently using hardware transactional memory (HTM). As long as a combiner executes in a hardware transaction and ensures that the lock associated with the data structure is available, it may execute concurrently with other threads operating on the data structure. HCF may include attempting to apply operations to a concurrent data structure utilizing HTM and if the HTM attempt fails, utilizing flat combining within HTM transactions. Publication lists may be used to announce operations to be applied to a concurrent data structure. A combiner thread may select a subset of the operations in the publication list and attempt to apply the selected operations using HTM. If the thread fails in these HTM attempts, it may acquire a lock associated with the data structure and apply the selected operations without HTM.

    Compact synchronization in managed runtimes

    Publication number: US12045670B2

    Publication date: 2024-07-23

    Application number: US17245820

    Filing date: 2021-04-30

    CPC classification number: G06F9/526 G06F9/30087 G06F9/5016 G06F9/541 G06F9/542

    Abstract: A computer including multiple processors and memory implements a managed runtime providing a synchronization application programming interface (API) for threads that perform synchronized accesses to shared objects. A standardized header of objects includes a memory word storing an object identifier. To lock the object for synchronized access, the memory word may be converted to store the tail of a linked list of first-in-first-out synchronization structures for threads waiting to acquire the lock, with the object identifier relocated to the list structure. The list structure may further include a stack of threads waiting on events related to the object, with the synchronization API additionally providing wait, notify and related synchronization operations. Upon determining that no threads hold or desire to hold the lock for the object and that no threads are waiting on events related to the object, the memory word may be restored to contain the object identifier.

    Inference Performance Using Divide-and-Conquer Techniques

    Publication number: US20240062104A1

    Publication date: 2024-02-22

    Application number: US18176342

    Filing date: 2023-02-28

    Inventor: Alex Kogan

    CPC classification number: G06N20/00

    Abstract: Systems, computer instructions encoded on non-transitory computer-accessible storage media and computer-implemented methods are disclosed for improving the inference performance of machine learning systems using a divide-and-conquer technique. An application configured to perform inferences using a trained machine learning model may be evaluated to identify opportunities to execute the portions of the application in parallel. The application may then be divided into multiple independently executable tasks according to the identified opportunities. Weighting values for individual ones of the tasks may be assigned according to expected computational intensity values of the respective tasks. Then, computational resources may be distributed among the tasks according to the respective weighting values and the application executed using the distributed computational resources.
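
The final step of this abstract, distributing computational resources among tasks according to their weights, can be sketched with a simple proportional split. The function `distribute_workers`, the choice of worker threads as the resource, the one-worker minimum per task, and the largest-remainder rounding are all assumptions made for illustration; the abstract does not specify a rounding scheme.

```python
def distribute_workers(weights, total_workers):
    """Split `total_workers` among tasks in proportion to their weights.

    `weights` maps a task name to a positive weight (its expected computational
    intensity). Every task gets at least one worker; the rest are shared out
    proportionally, with leftover workers going to the largest fractional parts.
    """
    names = list(weights)
    if total_workers < len(names):
        raise ValueError("need at least one worker per task")
    if any(weights[name] <= 0 for name in names):
        raise ValueError("weights must be positive")

    spare = total_workers - len(names)  # workers left after the guaranteed one each
    total_weight = sum(weights.values())
    exact = {name: spare * weights[name] / total_weight for name in names}

    shares = {name: 1 + int(exact[name]) for name in names}
    leftover = total_workers - sum(shares.values())

    # Hand remaining workers to the tasks whose exact share was rounded down most.
    by_remainder = sorted(names, key=lambda n: exact[n] - int(exact[n]), reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares
```

For example, `distribute_workers({"tokenize": 1, "embed": 2, "rank": 5}, 10)` returns `{"tokenize": 2, "embed": 3, "rank": 5}`; the task names are made up for the example.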

    Systems and Methods for Safely Subscribing to Locks Using Hardware Extensions

    Publication number: US20240028424A1

    Publication date: 2024-01-25

    Application number: US18478820

    Filing date: 2023-09-29

    CPC classification number: G06F9/526 G06F9/467 G06F9/3851 G06F9/30087

    Abstract: Transactional Lock Elision allows hardware transactions to execute unmodified critical sections protected by the same lock concurrently, by subscribing to the lock and verifying that it is available before committing the transaction. A “lazy subscription” optimization, which delays lock subscription, can potentially cause behavior that cannot occur when the critical sections are executed under the lock. Hardware extensions may provide mechanisms to ensure that lazy subscriptions are safe (e.g., that they result in correct behavior). Prior to executing a critical section transactionally, its lock and subscription code may be identified (e.g., by writing their locations to special registers). Prior to committing the transaction, the thread executing the critical section may verify that the correct lock was correctly subscribed to. If not, or if locations identified by the special registers have been modified, the transaction may be aborted. Nested critical sections associated with different lock types may invoke different subscription code.

    SCALABLE RANGE LOCKS
    20.
    Published application

    Publication number: US20230252081A1

    Publication date: 2023-08-10

    Application number: US18183891

    Filing date: 2023-03-14

    CPC classification number: G06F16/9024 G06F11/3006 G06F16/1774

    Abstract: A computer comprising one or more processors and memory may implement multiple threads performing mutually exclusive lock acquisition operations on disjoint ranges of a shared resource each using atomic compare and swap (CAS) operations. A linked list of currently locked ranges is maintained and, upon entry to a lock acquisition operation, a thread waits for all locked ranges overlapping the desired range to be released then inserts a descriptor for the desired range into the linked list using a single CAS operation. To release a locked range, a thread executes a single fetch and add (FAA) operation. The operation may be extended to support simultaneous exclusive and non-exclusive access by allowing overlapping ranges to be locked for non-exclusive access and by performing an additional validation after locking to provide conflict resolution should a conflict be detected.
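
The locking semantics described here (disjoint ranges proceed together, overlapping ranges wait) can be shown with a much simpler structure than the one in the abstract. The sketch below deliberately swaps the lock-free linked list, CAS insertion, and FAA release for a mutex and condition variable, since Python exposes no CAS primitive; it illustrates the exclusive-mode behavior only. `RangeLock` and its method names are hypothetical, and ranges are assumed to be half-open `[start, end)`.

```python
import threading


class RangeLock:
    """Exclusive locks over half-open ranges [start, end) of a shared resource."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = []  # currently locked (start, end) ranges

    def _overlaps(self, start, end):
        return any(s < end and start < e for s, e in self._held)

    def try_acquire(self, start, end):
        """Lock the range if nothing overlapping is held; never blocks."""
        if start >= end:
            raise ValueError("empty or inverted range")
        with self._cond:
            if self._overlaps(start, end):
                return False
            self._held.append((start, end))
            return True

    def acquire(self, start, end):
        """Wait until every overlapping range is released, then lock the range."""
        if start >= end:
            raise ValueError("empty or inverted range")
        with self._cond:
            while self._overlaps(start, end):
                self._cond.wait()
            self._held.append((start, end))

    def release(self, start, end):
        """Unlock a range previously locked with the same bounds."""
        with self._cond:
            self._held.remove((start, end))
            self._cond.notify_all()  # wake waiters so they can recheck for overlap
```

Two threads locking `(0, 10)` and `(10, 20)` never wait on each other, while a request for `(5, 15)` waits until both are released.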
