Abstract:
A sharded, permissioned, distributed ledger may reduce the amount of work and communication required of each participant, possibly avoiding scalability bottlenecks that may be inherent in previous distributed ledger implementations and possibly allowing additional resources to translate into increased throughput. A sharded, permissioned, distributed ledger may be made up of multiple shards, each of which may itself be a distributed ledger and which may operate in parallel. Participation within a sharded, permissioned, distributed ledger may be allowed only with permission of an authority. A sharded, permissioned, distributed ledger may include a plurality of nodes, each including a dispatcher configured to receive transaction requests from clients and to forward received requests to verifiers configured to append transactions to individual ones of the shards.
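The dispatcher and verifier roles can be illustrated with a minimal, single-process sketch. Everything named here (`Shard`, `Verifier`, `Dispatcher`, `shard_for`) is hypothetical; the abstract does not say how a transaction is mapped to a shard or how entries are linked, so hashing a transaction key and hash-chaining the entries are assumptions made only for illustration, and a set of permitted client names stands in for the authority.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One shard: an append-only list of (transaction, digest) pairs."""
    entries: list = field(default_factory=list)

    def append(self, tx: str) -> str:
        # Assumption: each entry is chained to the previous one by a hash.
        prev = self.entries[-1][1] if self.entries else ""
        digest = hashlib.sha256((prev + tx).encode()).hexdigest()
        self.entries.append((tx, digest))
        return digest


class Verifier:
    """Appends transactions to the one shard it is responsible for."""

    def __init__(self, shard: Shard):
        self.shard = shard

    def handle(self, tx: str) -> str:
        return self.shard.append(tx)


class Dispatcher:
    """Receives client requests and forwards each one to a verifier."""

    def __init__(self, verifiers, permitted_clients):
        self.verifiers = list(verifiers)
        # Stand-in for the authority that grants permission to participate.
        self.permitted = set(permitted_clients)

    def shard_for(self, key: str) -> int:
        # Assumption: the target shard is chosen by hashing a transaction key.
        h = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(h[:8], "big") % len(self.verifiers)

    def submit(self, client: str, key: str, tx: str) -> int:
        if client not in self.permitted:
            raise PermissionError(f"{client} is not permitted")
        index = self.shard_for(key)
        self.verifiers[index].handle(tx)
        return index
```

Because requests with different keys land on different shards, verifiers for distinct shards would not need to coordinate on every transaction, which is the source of the reduced per-participant work the abstract describes.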
Abstract:
The systems and methods described herein may implement scalable statistics counters that are adaptive to the amount of contention for the counters. The counters may be accessible within transactions. Methods for determining whether or when to increment the counters in response to initiation of an increment operation and/or methods for updating the counters may be selected dependent on current, recent, or historical amounts of contention. Various contention management policies or retry conditions may be applied to select between multiple methods. One counter may include a precise counter portion that is incremented under low contention and a probabilistic counter portion that is updated under high contention. Amounts by which probabilistic counters are incremented may be contention-dependent. Another counter may include a node identifier portion that encourages consecutive increments by threads on a single node only when under contention. Another counter may be inflated in response to contention for the counter.
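The counter with a precise portion and a probabilistic portion might be sketched as follows. This is an illustration under stated assumptions, not the described implementation: the class name `AdaptiveCounter` is hypothetical, the `contention` argument is a made-up estimate supplied by the caller (a real implementation would more likely infer it from failed atomic updates), and the single-threaded code omits the atomic operations a shared counter would need.

```python
import random


class AdaptiveCounter:
    """Counts events precisely when uncontended, probabilistically otherwise."""

    def __init__(self, rng=None):
        self.precise = 0          # incremented under low contention
        self.probabilistic = 0    # updated only occasionally under contention
        self.rng = rng or random.Random()

    def increment(self, contention: int = 1) -> None:
        """Record one event.

        `contention` is a hypothetical estimate (>= 1) of how contended the
        counter currently is.
        """
        if contention <= 1:
            self.precise += 1
            return
        # Under contention, write with probability 1/contention and add
        # `contention` when doing so. The expected contribution per event
        # stays 1, while the number of writes to shared state drops.
        if self.rng.random() < 1.0 / contention:
            self.probabilistic += contention

    def read(self) -> int:
        """Return the estimated number of recorded events."""
        return self.precise + self.probabilistic
```

With `contention=8`, roughly one increment in eight touches the counter, and each such update adds 8, so the increment amount depends on contention as the abstract suggests.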
Abstract:
Particular techniques for improving the scalability of concurrent programs (e.g., lock-based applications) may be effective in some environments and for some workloads, but not others. The systems described herein may automatically choose appropriate ones of these techniques to apply when executing lock-based applications at runtime, based on observations of the application in the current environment and with the current workload. In one example, two techniques for improving lock scalability (e.g., transactional lock elision using hardware transactional memory, and optimistic software techniques) may be integrated together. A lightweight runtime library built for this purpose may adapt its approach to managing concurrency by dynamically selecting one or more of these techniques (at different times) during execution of a given application. In this Adaptive Lock Elision approach, the techniques may be selected (based on pluggable policies) at runtime to achieve good performance on different platforms and for different workloads.
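The idea of a pluggable policy that chooses among techniques at runtime can be sketched in a few lines. Python has no hardware transactional memory, so each elision technique is represented by a caller-supplied function that returns `True` if it completed the critical section; `AdaptivePolicy`, `ElidableLock`, the `min_success` threshold, and the `warmup` count are all hypothetical names and parameters chosen for illustration.

```python
import threading
from collections import defaultdict


class AdaptivePolicy:
    """Hypothetical policy: keep using a mode while it succeeds often enough."""

    def __init__(self, min_success=0.5, warmup=4):
        self.attempts = defaultdict(int)
        self.successes = defaultdict(int)
        self.min_success = min_success
        self.warmup = warmup

    def should_try(self, mode: str) -> bool:
        n = self.attempts[mode]
        if n < self.warmup:
            return True  # not enough observations yet
        return self.successes[mode] / n >= self.min_success

    def record(self, mode: str, ok: bool) -> None:
        self.attempts[mode] += 1
        self.successes[mode] += int(ok)


class ElidableLock:
    """A lock whose critical sections may run in an elided mode instead."""

    def __init__(self, policy):
        self.lock = threading.Lock()
        self.policy = policy

    def run(self, critical_section, elided_attempts):
        """Run the critical section and return the mode that completed it.

        `elided_attempts` maps a mode name (for example "htm" or "swopt") to
        a function that tries the critical section in that mode and returns
        True on success.
        """
        for mode, attempt in elided_attempts.items():
            if not self.policy.should_try(mode):
                continue
            ok = attempt()
            self.policy.record(mode, ok)
            if ok:
                return mode
        # Fall back to actually acquiring the lock.
        with self.lock:
            critical_section()
        return "lock"
```

A different policy object could be substituted without touching `ElidableLock`, which is the sense in which the selection logic is pluggable in this sketch.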
Abstract:
We teach a powerful approach that greatly simplifies the design of non-blocking mechanisms and data structures, in part by largely separating the issues of correctness and progress. At a high level, our methodology includes designing an “obstruction-free” implementation of the desired mechanism or data structure, which may then be combined with a contention management mechanism whose role is to facilitate the conditions under which progress of the obstruction-free implementation is assured. In general, the contention management mechanism is separable semantically from an obstruction-free concurrent shared/sharable object implementation to which it is/may be applied. In some cases, the contention management mechanism may actually be coded separately from the obstruction-free implementation. We elaborate herein on the notions of obstruction-freedom and contention management, and various possibilities for combining the two. In addition, we include descriptions of some exemplary applications to particular concurrent software mechanisms and data structure implementations.
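The separation between the operation and the contention manager can be sketched as below. The operation only knows how to make one attempt, and the manager only decides what to do after a failed attempt. All names (`AtomicCell`, `try_increment`, `ExponentialBackoff`, `run_with_manager`) are hypothetical. Python has no native compare-and-swap, so `AtomicCell` simulates one with an internal lock, and a simple increment stands in for a real obstruction-free operation; exponential backoff is just one possible contention management choice.

```python
import threading
import time


class AtomicCell:
    """Simulates a memory word with compare-and-swap."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()  # only models hardware atomicity

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def try_increment(cell):
    """One attempt; it succeeds if nothing interferes between load and CAS."""
    seen = cell.load()
    return cell.compare_and_swap(seen, seen + 1)


class ExponentialBackoff:
    """Contention manager: wait longer after each consecutive failure."""

    def __init__(self, base=1e-6, cap=1e-3, sleep=time.sleep):
        self.base = base
        self.cap = cap
        self.sleep = sleep

    def on_failure(self, attempt):
        self.sleep(min(self.cap, self.base * 2 ** attempt))


def run_with_manager(attempt_fn, manager):
    """Retry attempt_fn until it succeeds; return the number of failures."""
    failures = 0
    while not attempt_fn():
        manager.on_failure(failures)
        failures += 1
    return failures
```

Correctness here rests entirely on `try_increment`, while progress under contention is the concern of whichever manager is passed to `run_with_manager`, so the manager can be swapped without re-examining the operation.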
Abstract:
A system and method are provided for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. More specifically, the disclosed embodiments provide a system that notifies a waiting thread when a targeted store is directed to monitored memory locations. During operation, the system receives a targeted store which is directed to a specific cache in a shared-memory multiprocessor system. In response, the system examines a destination address for the targeted store to determine whether the targeted store is directed to a monitored memory location which is being monitored for a thread associated with the specific cache. If so, the system informs the thread about the targeted store.
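The notification step can be modeled in software as a rough analogy. In this sketch a `Cache` object (a hypothetical name) keeps a table of monitored addresses, and a `threading.Event` stands in for whatever hardware mechanism would wake the waiting thread; none of this reflects an actual hardware interface.

```python
import threading


class Cache:
    """Software model of one processor's cache with monitored addresses."""

    def __init__(self):
        self.lines = {}      # address -> stored value
        self.monitors = {}   # address -> Event of the waiting thread

    def monitor(self, address):
        """Register interest in an address; returns an Event to wait on."""
        event = threading.Event()
        self.monitors[address] = event
        return event

    def receive_targeted_store(self, address, value):
        """Accept a store pushed to this cache by another processor.

        Returns True if a waiting thread was informed about the store.
        """
        self.lines[address] = value
        event = self.monitors.get(address)
        if event is None:
            return False
        event.set()  # inform the thread monitoring this location
        return True
```

A thread associated with the cache would call `monitor(address)` and then `wait()` on the returned event; a targeted store to any other address leaves it waiting.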
Abstract:
The present embodiments provide a system for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor in the shared-memory multiprocessor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. The system includes an interface, such as an application programming interface (API), a system call interface, or an instruction-set architecture (ISA), that provides access to a number of mechanisms for supporting targeted stores. These mechanisms include a thread-location mechanism that determines a location near where a thread is executing in the shared-memory multiprocessor, and a targeted-store mechanism that targets a store to a location (e.g., cache memory) in the shared-memory multiprocessor.
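How the two mechanisms might compose at the API level is shown in the sketch below. The class `Multiprocessor` and its methods `schedule`, `locate_thread`, `targeted_store`, and `store_to_thread` are hypothetical names, and plain dictionaries stand in for cache memories; the abstract does not specify any of these signatures.

```python
class Multiprocessor:
    """Toy model exposing thread-location and targeted-store mechanisms."""

    def __init__(self, num_caches):
        self.caches = [dict() for _ in range(num_caches)]
        self.thread_location = {}  # thread id -> index of a nearby cache

    def schedule(self, thread_id, cache_id):
        """Record where a thread is running (done by the OS in practice)."""
        self.thread_location[thread_id] = cache_id

    def locate_thread(self, thread_id):
        """Thread-location mechanism: a cache near the executing thread."""
        return self.thread_location[thread_id]

    def targeted_store(self, cache_id, address, value):
        """Targeted-store mechanism: push a value into a chosen cache."""
        self.caches[cache_id][address] = value

    def store_to_thread(self, thread_id, address, value):
        """Compose the two: store near wherever the thread is executing."""
        cache_id = self.locate_thread(thread_id)
        self.targeted_store(cache_id, address, value)
        return cache_id
```

A producer that knows only the consumer's thread identifier could call `store_to_thread` and have the data placed in the consumer's cache directly.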
Abstract:
Transactional Lock Elision allows hardware transactions to execute unmodified critical sections protected by the same lock concurrently, by subscribing to the lock and verifying that it is available before committing the transaction. A “lazy subscription” optimization, which delays lock subscription, can potentially cause behavior that cannot occur when the critical sections are executed under the lock. Hardware extensions may provide mechanisms to ensure that lazy subscriptions are safe (e.g., that they result in correct behavior). Prior to executing a critical section transactionally, its lock and subscription code may be identified (e.g., by writing their locations to special registers). Prior to committing the transaction, the thread executing the critical section may verify that the correct lock was correctly subscribed to. If not, or if locations identified by the special registers have been modified, the transaction may be aborted. Nested critical sections associated with different lock types may invoke different subscription code.
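The commit-time check can be simulated in software to show its shape. In this sketch a `Transaction` object buffers writes, a field named `registered_lock` stands in for the special register that identifies the lock, and `commit` refuses to publish the writes unless that same lock was subscribed to and is still free. All names are hypothetical, the simulation is single-threaded, and it does not model the subscription-code register or the abort on modification of the registered locations.

```python
class Aborted(Exception):
    """Raised when the simulated transaction cannot commit."""


class SimLock:
    """A lock reduced to the one bit the transaction needs to observe."""

    def __init__(self):
        self.held = False


class Transaction:
    def __init__(self, lock):
        self.registered_lock = lock  # stands in for the special register
        self.subscribed_to = None
        self.write_buffer = {}

    def write(self, key, value):
        # Writes stay private until commit.
        self.write_buffer[key] = value

    def subscribe(self, lock):
        """Read the lock inside the transaction and require it to be free."""
        if lock.held:
            raise Aborted("lock is held")
        self.subscribed_to = lock

    def commit(self, memory):
        # Verify that the correct lock was subscribed to before committing.
        if self.subscribed_to is not self.registered_lock:
            raise Aborted("registered lock was not subscribed to")
        if self.registered_lock.held:
            raise Aborted("lock was acquired before commit")
        memory.update(self.write_buffer)


def run_elided(lock, memory, body):
    """Run body transactionally, subscribing lazily just before commit."""
    tx = Transaction(lock)
    body(tx)
    tx.subscribe(lock)  # lazy subscription: after the critical section
    tx.commit(memory)
```

If the subscription were skipped or directed at a different lock, `commit` would raise `Aborted` and the buffered writes would never reach `memory`, which mirrors the safety condition the abstract describes.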