Abstract:
A system and method for compression of partially ordered data sets are provided. A first record of the data set is compressed by encoding the record using a Fibonacci encoding technique. Thereafter, for each subsequent record N, the (N-1)th record is subtracted from the Nth record before the result is encoded, so that each subsequent record stores only the difference (or delta) from the previous record.
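The delta-then-encode idea can be illustrated with a minimal sketch. It assumes each record is a single integer and treats the first record as a delta from zero. Because Fibonacci coding only represents positive integers while deltas in a partially ordered set may be zero or negative, the sketch maps each delta to a positive integer with a zigzag step; that mapping and all function names are illustrative assumptions, not details taken from the abstract.

```python
def fib_encode(n):
    """Fibonacci-code a positive integer as a bit string ending in '11'."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for positive integers")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number larger than n
    bits = []
    for f in reversed(fibs):  # greedy Zeckendorf decomposition, largest term first
        if f <= n:
            bits.append("1")
            n -= f
        else:
            bits.append("0")
    # Emit the smallest term first, then the terminating '1'.
    return "".join(reversed(bits)) + "1"


def fib_decode_stream(bits):
    """Split a concatenation of Fibonacci codewords back into integers."""
    values, fibs, total, idx, prev = [], [1, 2], 0, 0, "0"
    for b in bits:
        if b == "1" and prev == "1":  # '11' closes the current codeword
            values.append(total)
            total, idx, prev = 0, 0, "0"
            continue
        if idx >= len(fibs):
            fibs.append(fibs[-1] + fibs[-2])
        if b == "1":
            total += fibs[idx]
        idx += 1
        prev = b
    return values


def compress(records):
    """Encode each record as the Fibonacci code of its delta from the previous one."""
    out, prev = [], 0
    for r in records:
        d = r - prev
        zigzag = 2 * d if d >= 0 else -2 * d - 1  # assumption: fold sign into magnitude
        out.append(fib_encode(zigzag + 1))        # +1 because zero is not encodable
        prev = r
    return "".join(out)


def decompress(bits):
    records, prev = [], 0
    for v in fib_decode_stream(bits):
        m = v - 1
        d = m // 2 if m % 2 == 0 else -((m + 1) // 2)
        prev += d
        records.append(prev)
    return records
```

Small deltas yield short codewords, so a nearly sorted input such as [1000, 1003, 1001, 1010] spends most of its bits on the first record and only a few on each of the rest.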
Abstract:
Scheduling operations such as asynchronous file system operations in a network storage system is accomplished by applying a bid-price online auction methodology, in which bid (willingness-to-pay) values and price (cost) values are dynamically set by storage clients and a storage server, respectively, based on utilization of computing resources. The system provides a framework for adaptively scheduling asynchronous file system operations, managing multiple key resources of the distributed file system, including network bandwidth, server I/O, server CPU, and client and server memory utilization. The system can accelerate, defer, or cancel asynchronous requests to improve application-perceived performance. Congestion pricing via online auctions can be employed to coordinate the use of system resources by clients, so clients can detect shortages and adapt their resource usage.
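As a rough illustration of the bid-price comparison, the sketch below lets a client raise its bid as its memory fills and lets the server raise its price as it becomes busier; a request is sent only when the bid meets the price. The two formulas, the constants, and the function names are hypothetical and chosen only to make the mechanism concrete; the abstract does not specify them.

```python
def _clamp(x, lo, hi):
    return max(lo, min(hi, x))


def client_bid(memory_pressure):
    """Hypothetical bid: willingness to pay grows as client memory fills (0.0 to 1.0)."""
    return 1.0 + 9.0 * _clamp(memory_pressure, 0.0, 1.0)


def server_price(utilization, base=1.0):
    """Hypothetical congestion price: rises steeply as server utilization nears 1.0."""
    u = _clamp(utilization, 0.0, 0.99)
    return base / (1.0 - u)


def decide(bid, price):
    """Send the asynchronous request now if the bid covers the price, else defer it."""
    return "send" if bid >= price else "defer"
```

For example, a client with empty buffers (bid 1.0) defers its write-back while the server is 90% utilized, whereas a client whose memory is full (bid 10.0) is still willing to pay the price of an idle or half-loaded server.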
Abstract:
A network storage server implements a method to discard sensitive data from a Persistent Point-In-Time Image (PPI). The server first efficiently identifies a dataset containing the sensitive data from a plurality of datasets managed by the PPI. Each of the plurality of datasets is read-only and encrypted with a first encryption key. The server then decrypts each of the plurality of datasets, except the dataset containing the sensitive data, with the first encryption key. The decrypted datasets are re-encrypted with a second encryption key, and copied to a storage structure. Afterward, the first encryption key is shredded.
Abstract:
DNS name resolution is integrated into each node in a network storage cluster, to allow load balancing of network addresses, using a weighted random distribution to resolve DNS requests. A node in the cluster gathers statistics on utilization of resources, such as CPU utilization and throughput, on nodes in the cluster and distributes those statistics to all other nodes. Each node uses the same algorithm to generate weights for the various IP addresses of the cluster, based on the statistics distributed to it. The weights are used to generate a weighted list of available network addresses. In response to a DNS request, a DNS in a given node randomly indexes into the weighted address list to resolve requests to a network address. The weights are chosen so that the DNS is likely to pick an IP address which has a low load, to balance port and node usage over time.
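The weighted list and random index can be sketched as follows. The weighting rule used here (weight proportional to spare capacity, 1 minus load) and the list size of 100 slots are assumptions made for illustration; the abstract only requires that every node derive the same weights from the same statistics.

```python
import random


def build_weighted_list(load_by_ip, slots=100):
    """Return a list in which less-loaded addresses appear more often.

    load_by_ip maps an IP address to a load figure between 0.0 and 1.0.
    """
    # Hypothetical weighting: proportional to spare capacity.
    spare = {ip: max(0.0, 1.0 - load) for ip, load in load_by_ip.items()}
    total = sum(spare.values())
    if total == 0:
        return sorted(load_by_ip)  # every address saturated: fall back to uniform
    weighted = []
    for ip in sorted(spare):  # sorted so every node builds the identical list
        weighted.extend([ip] * round(slots * spare[ip] / total))
    return weighted or sorted(load_by_ip)


def resolve(weighted, rng):
    """Answer one DNS request by indexing randomly into the weighted list."""
    return weighted[rng.randrange(len(weighted))]
```

With loads of 0.9 and 0.1 on two addresses, the lightly loaded address occupies roughly nine times as many slots, so it is returned for roughly nine out of ten requests.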
Abstract:
Techniques introduced here support block level transmission of a logical container from a network storage controller to a backup system. In accordance with the techniques, transmission can be restarted using checkpoints created at the block level by allowing restarts from various points within a logical container, for example a point at which 10%, 50%, or 75% of the logical container had been transmitted. The transmission can be restarted while maintaining data consistency of the logical container data and included meta-data. Advantageously, changes made prior to a checkpoint restart to, for example, meta-data, do not lead to inconsistent logical container backups.
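A minimal sketch of block-level checkpointing follows, assuming blocks are sent in index order and the receiver stores each block by index, so that re-sending a block after a restart is harmless. The function name, the checkpoint interval, and the `fail_at` parameter (used only to simulate an interruption) are hypothetical.

```python
def transmit(blocks, send, start=0, checkpoint_every=4, fail_at=None):
    """Send blocks[start:] in order and return (checkpoint, completed).

    The checkpoint is the index from which a later call can safely resume.
    """
    checkpoint = start
    for i in range(start, len(blocks)):
        if fail_at is not None and i == fail_at:
            return checkpoint, False  # simulated interruption
        send(i, blocks[i])
        if (i + 1) % checkpoint_every == 0:
            checkpoint = i + 1  # every block below this index has been sent
    return len(blocks), True
```

If a ten-block transfer is interrupted at block 6 with a checkpoint every four blocks, the transfer restarts from block 4 rather than from block 0.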
Abstract:
Techniques are provided for tiering snapshots to archival storage in remote object stores. A restore time metric, indicating that objects comprising snapshot data of snapshots created within a threshold timespan are to be available within a storage tier of a remote object store for performing restore operations, may be identified. A scanner may be executed to evaluate snapshots using the restore time metric to identify a set of candidate snapshots for archival from the storage tier to an archival storage tier of the remote object store. For each candidate snapshot within the set of candidate snapshots, the scanner may evaluate metadata associated with the candidate snapshot to identify one or more objects eligible for archival from the storage tier to the archival storage tier, and may archive the one or more objects from the storage tier to the archival storage tier.
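One way to picture the scanner is sketched below. It assumes each snapshot's metadata lists the objects it references, and that an object is eligible for archival only when no snapshot inside the restore window still references it. That eligibility rule, the dictionary layout, and the function name are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def objects_to_archive(snapshots, now, restore_window):
    """Return object names that can move to the archival tier.

    snapshots: list of dicts with 'created' (datetime) and 'objects' (set of names).
    """
    cutoff = now - restore_window
    recent = [s for s in snapshots if s["created"] >= cutoff]
    candidates = [s for s in snapshots if s["created"] < cutoff]
    # Objects that snapshots inside the restore window still need in the fast tier.
    still_needed = set().union(*(s["objects"] for s in recent))
    eligible = set()
    for snap in candidates:
        eligible |= snap["objects"] - still_needed
    return eligible
```

With a seven-day window, an object referenced only by a month-old snapshot is archived, while an object shared with yesterday's snapshot stays in the regular tier.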
Abstract:
Techniques are provided for storing immutable snapshot copies in write once read many (WORM) storage. A snapshot of a volume may be stored into one or more objects formatted according to an object format. An expiry time may be assigned to the snapshot and the one or more objects based upon a creation time of the snapshot and a retention time. The one or more objects may be stored within a remote object store. The one or more objects are retained in an immutable state and cannot be deleted until expiration of the expiry time. In response to identifying an existing object within the remote object store comprising shared snapshot data referenced by the snapshot, an assigned expiry time of the existing object may be modified based upon the expiry time of the snapshot to create a modified expiry time for the existing object.
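The expiry bookkeeping can be sketched as follows. The sketch assumes the expiry time is the creation time plus the retention time, and that a shared object's expiry is only ever extended, never shortened, so it outlives every snapshot that references it. The extend-only rule and the names used are assumptions; the abstract says only that the existing object's expiry is modified based on the snapshot's expiry.

```python
from datetime import datetime, timedelta


def snapshot_expiry(created, retention):
    """Expiry of a snapshot: its creation time plus the retention time."""
    return created + retention


def assign_expiry(object_expiry, object_names, expiry):
    """Record the expiry for each object a snapshot references.

    object_expiry maps object name to its current expiry and is updated in place.
    """
    for name in object_names:
        current = object_expiry.get(name)
        # Assumption: an existing (shared) object keeps the later of the two times.
        if current is None or expiry > current:
            object_expiry[name] = expiry
```

A new snapshot expiring on 31 January that references an object already set to expire on 15 January would push that object's expiry out to 31 January.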
Abstract:
Systems, methods, and machine-readable media for predicting interruptions to the use of spare cloud resources and rebalancing based on those predictions are disclosed. A computing platform collects data for customers over time. The computing platform runs a machine learning algorithm on the historical data to generate a prediction classifier. The prediction classifier relates to a time window for prediction into the future, on the order of minutes or hours. The prediction classifier is run on monitored data from ongoing activity with a cloud provider to generate a risk score. Each risk score may identify an amount of risk that a spare cloud resource related to new resource metrics data will be interrupted within the future time frame corresponding to that prediction classifier. If predicted to be interrupted, the customer may be assisted in rebalancing to other resources. As a result, interruptions can be predicted hours into the future.
Abstract:
Techniques are provided for timestamp consistency. An operation targeting a first storage object having a synchronous replication relationship with a second storage object is intercepted. A timestamp is assigned to the operation. A replication operation is created as a replication of the operation. The same timestamp is assigned to the replication operation. The operation is implemented upon the first storage object and the replication operation is implemented upon the second storage object.
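The sequence described above can be sketched in a few lines. The class names, the write-only operation model, and the injectable `clock` parameter are hypothetical; the point is only that the timestamp is taken once at interception and carried unchanged by the replication operation.

```python
import time
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Operation:
    kind: str
    payload: bytes
    timestamp: Optional[float] = None


class StorageObject:
    """Toy storage object that remembers its content and modification time."""

    def __init__(self):
        self.data = b""
        self.mtime = None

    def apply(self, op):
        self.data = op.payload
        self.mtime = op.timestamp  # use the operation's timestamp, not the local clock


def intercept(op, primary, secondary, clock=time.time):
    """Stamp the operation once, replicate it, and apply both copies."""
    stamped = replace(op, timestamp=clock())
    replica_op = replace(stamped)  # the copy carries the same timestamp
    primary.apply(stamped)
    secondary.apply(replica_op)
    return stamped, replica_op
```

Because neither storage object consults its own clock, the two copies report identical modification times even if their hosts' clocks differ.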