Abstract:
A technique for emulation of a data storage system. The invention allows the level of services to be provided by a data storage system to be specified in terms of the level of services provided by another storage system. In one aspect, a performance characterization of a data storage device to be emulated is obtained (e.g., by experimental techniques). A specification of a workload is also obtained that includes a specification of a plurality of data stores for the workload. The data stores are assigned to an emulation data storage device according to the performance characterization and according to the specification of the workload such that sufficient resources of the emulation data storage device are allocated to the workload to meet the performance characterization of the data storage device to be emulated. The emulation data storage device is then operated under the workload. Quality-of-service (QoS) control may be performed so as to provide a degree of performance isolation among the workloads.
Abstract:
Provided is a method for determining a recovery schedule. The method includes accepting as input a recovery graph. The recovery graph presents one or more strategies for data recovery. In addition, at least one objective is provided and accepted. The recovery graph is formalized as an optimization problem for the provided objective. When formalized as an optimization problem, at least one solution technique is applied to determine at least one recovery schedule.
Abstract:
A transactional shared memory system has a plurality of discrete application nodes; a plurality of discrete memory nodes; a network interconnecting the application nodes and the memory nodes; and a controller for directing transactions in a distributed system utilizing the shared memory. The memory nodes collectively provide an address space of shared memory that is provided to the application nodes via the network. The controller has instructions to transfer a batched transaction instruction set from an application node to at least one memory node. This instruction set includes one or more write, compare, and read instruction subsets, and/or combinations thereof. At least one subset has a valid non-null memory node identifier and memory address range. The memory node identifier may be indicated by the memory address range. The controller controls the memory node, responsive to receipt of the batched transaction instruction set, to safeguard the associated memory address range during execution of the transaction instruction set. The batched transaction instruction set is collectively executed atomically. A notification instruction set may also be used to establish a notification, triggered upon a subsequent write event upon at least a portion of a specified address range.
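The batched compare, read, and write subsets described above can be illustrated with a minimal single-node sketch. The names `Batch`, `MemoryNode`, and `execute` are hypothetical and not taken from the abstract. A plain lock stands in for safeguarding the address range, and the rule that a failed compare aborts the whole batch is an assumption made for illustration.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Hypothetical batched instruction set aimed at one memory node."""
    compares: list = field(default_factory=list)  # (address, expected bytes)
    reads: list = field(default_factory=list)     # (address, length)
    writes: list = field(default_factory=list)    # (address, new bytes)


class MemoryNode:
    """Hypothetical memory node backed by a byte array."""

    def __init__(self, size):
        self._mem = bytearray(size)
        # Assumption: one lock guards the whole node instead of an address range.
        self._lock = threading.Lock()

    def execute(self, batch):
        """Run the whole batch atomically; return (committed, read results)."""
        with self._lock:
            # Assumption: any failed compare aborts the batch with no effect.
            for addr, expected in batch.compares:
                if bytes(self._mem[addr:addr + len(expected)]) != expected:
                    return False, []
            results = [bytes(self._mem[a:a + n]) for a, n in batch.reads]
            for addr, data in batch.writes:
                self._mem[addr:addr + len(data)] = data
            return True, results
```

A caller might build a compare-and-swap from one batch: compare the old value, read it, and write the new one. If the compare fails, nothing is written.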
Abstract:
A method of reading data comprises sending read messages to storage devices holding the stripe and receiving at least a quorum of reply messages. The reply message from the storage device holding the data block includes the data block. The quorum meets a quorum condition of a number such that any two selections of the number of stripe blocks intersect in the minimum number of the stripe blocks needed to decode the stripe. A method of writing data comprises sending query messages to storage devices holding the stripe, receiving a query reply message from each of at least a first quorum of the storage devices, sending modify messages to the storage devices, and receiving a write reply message from each of at least a second quorum of the storage devices. The first and second quorums each meet the quorum condition.
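The quorum condition can be made concrete with a short sketch. It assumes a stripe of `n` blocks of which any `m` suffice to decode; these symbols and the function names are illustrative and not from the abstract. Two selections of `q` blocks out of `n` overlap in at least `2q - n` blocks, so the condition reduces to `2q - n >= m`.

```python
from itertools import combinations


def min_quorum(n, m):
    """Smallest q such that any two q-subsets of n blocks share at least m blocks."""
    return (n + m + 1) // 2  # integer form of ceil((n + m) / 2)


def meets_quorum_condition(n, m, q):
    """Check 2q - n >= m, the worst-case overlap of two q-subsets of n blocks."""
    return m <= q <= n and 2 * q - n >= m


def worst_case_overlap(n, q):
    """Brute-force the smallest intersection over all pairs of q-subsets."""
    subsets = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in subsets for b in subsets)
```

For example, with `n = 5` and `m = 3`, a quorum of 4 satisfies the condition and a quorum of 3 does not; `worst_case_overlap` confirms the closed form by enumeration on small stripes.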
Abstract:
A method of recovering a stripe of erasure coded data begins with sending query messages to storage devices. The method continues with receiving query reply messages from at least a first quorum of the storage devices. The query reply messages include a minimum number of the stripe blocks needed to decode the stripe. Following this, the stripe of erasure coded data is encoded. Next, a write message, which includes a timestamp and the stripe block destined for that storage device, is sent to each of the storage devices. The method concludes with receiving a write reply message from at least a second quorum of the storage devices indicating that the stripe block was successfully stored. The first and second quorums each meet a quorum condition of a number such that any two selections of the number of the stripe blocks intersect in the minimum number of the stripe blocks.
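The recovery steps above can be sketched as a single function. Everything here is hypothetical: the `Device` class, the `recover_stripe` signature, and the use of plain replication (any one block decodes the stripe) as a stand-in for a real erasure code. The sketch does not model how replies carrying different timestamps are reconciled, which the abstract does not describe.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical in-memory storage device holding one stripe block."""
    block: bytes = b""
    timestamp: int = 0
    up: bool = True

    def query(self):
        # Reply with (timestamp, block), or None if the device is unreachable.
        return (self.timestamp, self.block) if self.up else None

    def write(self, timestamp, block):
        if not self.up:
            return False
        self.timestamp, self.block = timestamp, block
        return True


def decode_replica(blocks):
    """Stand-in decoder: with replication, any single block is the data."""
    return next(iter(blocks.values()))


def encode_replica(data, n):
    """Stand-in encoder: with replication, every device gets a full copy."""
    return [data] * n


def recover_stripe(devices, m, quorum, new_timestamp, decode, encode):
    """Query, decode, re-encode, and write back; True if both quorums are met."""
    replies = {}
    for i, dev in enumerate(devices):
        reply = dev.query()
        if reply is not None:
            replies[i] = reply[1]
    if len(replies) < quorum or len(replies) < m:
        return False  # first quorum not reached
    data = decode(replies)
    blocks = encode(data, len(devices))
    acks = sum(dev.write(new_timestamp, blocks[i]) for i, dev in enumerate(devices))
    return acks >= quorum  # second quorum
```

With three replicas and a quorum of two, recovery succeeds while one device is down and fails once two are down.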
Abstract:
Data structure and timestamp management techniques for redundant storage. A plurality of storage devices are interconnected by a communication medium. At least two of the storage devices are designated devices for storing a block of data. Each designated device stores a version of the data and a first timestamp that is indicative of when the version of data was last updated. A second timestamp is indicative of a pending update to the block of data. When the update to the block of data is completed at one of the designated devices, the device discards the second timestamp. A storage device acting as coordinator instructs the device to discard the second timestamp. The designated storage devices store a plurality of blocks of data and corresponding timestamps according to a data structure. At least some of the entries in the data structure correspond to a range of data blocks that share a common timestamp. Entries in the data structure are arranged such that the ranges do not overlap.
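One way to picture a data structure whose entries cover non-overlapping ranges of blocks with a shared timestamp is the sketch below. The class name `RangeTimestamps`, the half-open range convention, and the split-on-update policy are assumptions made for illustration; the abstract does not specify them.

```python
import bisect


class RangeTimestamps:
    """Maps half-open block ranges [start, end) to one shared timestamp."""

    def __init__(self):
        self._entries = []  # sorted, non-overlapping (start, end, timestamp)

    def set(self, start, end, timestamp):
        """Assign a timestamp to [start, end), trimming any overlapped entries."""
        kept = []
        for s, e, ts in self._entries:
            if e <= start or s >= end:
                kept.append((s, e, ts))          # no overlap: keep unchanged
            else:
                if s < start:
                    kept.append((s, start, ts))  # keep the part left of the update
                if e > end:
                    kept.append((end, e, ts))    # keep the part right of the update
        kept.append((start, end, timestamp))
        kept.sort()
        self._entries = kept

    def get(self, block):
        """Return the timestamp covering a block, or None if no entry covers it."""
        probe = (block, float("inf"), float("inf"))
        i = bisect.bisect_right(self._entries, probe) - 1
        if i >= 0:
            s, e, ts = self._entries[i]
            if s <= block < e:
                return ts
        return None

    def entries(self):
        return list(self._entries)
```

Updating blocks 3 and 4 inside an existing range 0 to 9 splits that range into three entries, so the ranges stay disjoint and a lookup is a single binary search.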
Abstract:
An embodiment of a method of restoring data begins with a step of restoring point-in-time data from a local copy. The method concludes with a step of restoring at least a portion of an incremental difference between the point-in-time data and a desired state of the data from a remote mirror.
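The two restore steps can be sketched with dictionaries that map block numbers to contents. The function name `restore` and the `changed_blocks` argument are hypothetical; how the incremental difference is tracked is an assumption, since the abstract does not say.

```python
def restore(local_snapshot, remote_mirror, changed_blocks):
    """Restore from a local point-in-time copy, then apply the remote difference."""
    # Step 1: start from the local point-in-time copy.
    data = dict(local_snapshot)
    # Step 2: fetch only the blocks that differ from the remote mirror.
    for block in changed_blocks:
        data[block] = remote_mirror[block]
    return data
```

Only the blocks listed in `changed_blocks` are read from the remote mirror; everything else comes from the local copy.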
Abstract:
A computer storage system includes a controller and a storage device array. The storage device array includes a first sub-array and a fast storage device sub-array. The first sub-array includes one or more log-structured storage devices storing data. The fast storage device sub-array includes one or more fast storage devices storing a copy of the data stored in the first sub-array.
Abstract:
A method predicts performance of a system that includes a plurality of interconnected components defining at least one data flow path. The method references a workload specification for the system. The method models the system using one or more component models. Each component model represents one or more selected components. Each component model is arranged in the same relationship to the data flow path as the components it represents. Each component model is (a) a constraint upon the workload specification input to that component model, (b) a transformer of the workload specification input to that component model so as to result in one or more output workload specifications that are input workload specifications to subsequent component models along the data flow path, or (c) both a constraint and a transformer. At least one of the component models is a constraint. At least some of the component models along the data flow path operate on the workload specification. In one preferred form, operating on the workload specification involves arranging the component models in a hierarchy corresponding to the data flow path; using the specified workload specification as input to the topmost component model in the hierarchy; and applying one or more of the component models to its input workload specification, starting with the topmost component model and then component models at progressively lower levels in the hierarchy. The output workload specification at one level is the input workload specification at the next lower level. If the component model comprises a constraint, the method evaluates whether the input workload specification satisfies or violates the constraint. If the component model comprises a workload specification transform, the method modifies the input workload specification so as to produce one or more output workload specifications.
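The top-down evaluation described above can be sketched for a single data flow path. The `ComponentModel` class, the dictionary-based workload, and the `evaluate` function are hypothetical, and each transform here produces exactly one output workload, which is a simplification of the "one or more" outputs in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Workload = dict  # assumption: a workload specification is a plain dict of metrics


@dataclass
class ComponentModel:
    """Hypothetical component model: a constraint, a transform, or both."""
    name: str
    constraint: Optional[Callable[[Workload], bool]] = None
    transform: Optional[Callable[[Workload], Workload]] = None


def evaluate(models, workload):
    """Apply models from topmost to lowest; return (violated names, final workload)."""
    violations = []
    for model in models:
        # A constraint is checked against the workload entering this level.
        if model.constraint is not None and not model.constraint(workload):
            violations.append(model.name)
        # A transform produces the workload seen by the next lower level.
        if model.transform is not None:
            workload = model.transform(workload)
    return violations, workload
```

As a usage illustration with made-up numbers, a cache model that halves the request rate followed by a disk model limited to 150 requests per second would flag the disk as violated for an input of 400 requests per second, since 200 still reach it.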