Abstract:
A cluster configuration system is arranged to manage a graph database for tracking and identifying a time-varying state of a cluster of objects. The graph database may include one or more nodes and one or more associations between the nodes to represent time-varying states of the cluster. Management of the graph database may include creating, maintaining, updating, storing, administering, querying, and/or presenting one or more elements of the graph database.
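As an illustration only, the idea of nodes and associations that represent time-varying cluster state can be sketched as a small in-memory graph whose associations carry validity intervals. The class name `TemporalGraph`, the object names (`vol1`, `aggr1`, `aggr2`), and the `hosted_on` label are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class TemporalGraph:
    """Toy graph whose associations record when they were created and removed."""
    nodes: dict = field(default_factory=dict)   # node_id -> creation time
    edges: list = field(default_factory=list)   # [src, dst, label, start, end]

    def add_node(self, node_id, t):
        self.nodes[node_id] = t

    def add_edge(self, src, dst, label, t):
        # end=None means the association is still live
        self.edges.append([src, dst, label, t, None])

    def remove_edge(self, src, dst, label, t):
        # Close the live association instead of deleting it, so history is kept
        for edge in self.edges:
            if edge[:3] == [src, dst, label] and edge[4] is None:
                edge[4] = t

    def state_at(self, t):
        """Return the set of (src, dst, label) associations live at time t."""
        return {
            (src, dst, label)
            for src, dst, label, start, end in self.edges
            if start <= t and (end is None or t < end)
        }


# Hypothetical history: a volume moves from one aggregate to another at t=5
g = TemporalGraph()
for name in ("vol1", "aggr1", "aggr2"):
    g.add_node(name, 0)
g.add_edge("vol1", "aggr1", "hosted_on", 0)
g.remove_edge("vol1", "aggr1", "hosted_on", 5)
g.add_edge("vol1", "aggr2", "hosted_on", 5)
```

Querying `g.state_at(3)` and `g.state_at(5)` returns the association that was live before and after the move, respectively.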
Abstract:
Embodiments of the systems and techniques described here can leverage several insights into the nature of workload access patterns and working-set behavior to reduce memory overhead. As a result, various embodiments make it feasible to maintain running estimates of a workload's cacheability in current storage systems with limited resources. For example, some embodiments provide for a method comprising estimating cacheability of a workload based on a first working-set size estimate generated from the workload over a first monitoring interval. Then, based on the cacheability of the workload, a workload cache size can be determined. A cache can then be dynamically allocated, within a storage system for example, in accordance with the workload cache size (e.g., by changing the cache allocation for the workload, possibly frequently, whenever the current allocation and the desired workload cache size differ).
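The flow from working-set size estimate to cacheability to cache size can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not define how cacheability is computed, so the reuse-ratio metric, the `min_cacheability` threshold, and all function names here are hypothetical.

```python
def working_set_size(block_ids):
    """Number of distinct blocks touched during one monitoring interval."""
    return len(set(block_ids))


def cacheability(block_ids):
    """Assumed metric: fraction of accesses that re-reference a seen block."""
    if not block_ids:
        return 0.0
    return 1.0 - working_set_size(block_ids) / len(block_ids)


def desired_cache_blocks(block_ids, min_cacheability=0.5):
    """Assumed policy: cache the whole working set only if reuse is high enough."""
    if cacheability(block_ids) < min_cacheability:
        return 0
    return working_set_size(block_ids)


# Hypothetical traces: one with heavy reuse, one sequential scan with none
reuse_trace = [1, 2, 1, 2, 1, 3, 1, 2]
scan_trace = list(range(8))
```

Under these assumptions the reuse-heavy trace would be given three cache blocks and the sequential scan none; a controller would change the allocation whenever the current allocation differs from that target.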
Abstract:
The techniques introduced here provide for efficient management of storage resources in a modern, dynamic data center through the use of virtual storage appliances. Virtual storage appliances perform storage operations and execute in or as a virtual machine on a hypervisor. A storage management system monitors a storage system to determine whether the storage system is satisfying a service level objective for an application. The storage management system then manages (e.g., instantiates, shuts down, or reconfigures) a virtual storage appliance on a physical server. The virtual storage appliance uses resources of the physical server to meet the storage-related needs of the application that the storage system cannot meet. This automatic and dynamic management of virtual storage appliances by the storage management system allows storage systems to react quickly to changing storage needs of applications without requiring expensive excess storage capacity.
Abstract:
Technology for operating a cache sizing system is disclosed. In various embodiments, the technology monitors input/output (IO) accesses to a storage system within a monitor period; tracks an access map for storage addresses within the storage system during the monitor period; and counts a particular access condition of the IO accesses based on the access map during the monitor period. The count is used to compute a working set size (WSS) estimate of the storage system when sizing a cache that enables the storage system to provide a specified level of service.
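The monitor, track, and count steps can be sketched with a per-block flag array. This is a sketch only: the abstract does not say which access condition is counted, so counting the first touch of each address is an assumption, and `AccessMapMonitor` and the 4096-byte block size are hypothetical.

```python
class AccessMapMonitor:
    """Counts first-touch accesses per monitor period using a per-block flag map."""

    def __init__(self, num_blocks):
        self.access_map = bytearray(num_blocks)  # one flag per block address
        self.first_touches = 0

    def record(self, block):
        # Assumed access condition: the block has not been seen this period
        if not self.access_map[block]:
            self.access_map[block] = 1
            self.first_touches += 1

    def end_period(self, block_size=4096):
        """Return the WSS estimate in bytes and reset state for the next period."""
        wss_bytes = self.first_touches * block_size
        self.access_map = bytearray(len(self.access_map))
        self.first_touches = 0
        return wss_bytes


# Hypothetical IO trace over an 8-block address space
monitor = AccessMapMonitor(num_blocks=8)
for block in (0, 1, 1, 7, 0):
    monitor.record(block)
wss = monitor.end_period()
```

Three distinct blocks are touched in this trace, so the estimate is three blocks' worth of bytes; the map costs one byte per block here, and a bit-packed map would reduce that further.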
Abstract:
Graph transformations are used by a data management system to correct violations of service-level objectives (SLOs) in a data center. In one aspect, a process is provided to manage a data center by receiving an indication of a violation of a service-level objective associated with the data center from a server in the data center. A graph representation and a transformations data container are retrieved by the data management system from data storage accessible to the data management system. The transformations data container includes one or more transformations. Each transformation is processed to create a mutated graph from a data center representation in the graph representation. An option for managing the data center is determined as a result of evaluating the mutated graphs.
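The transform-then-evaluate loop can be sketched as below. Everything concrete here is an assumption made for illustration: the dictionary-based graph, the `move` transformation, and the `overload` score are hypothetical stand-ins for whatever representation, transformations, and evaluation a real system would use.

```python
import copy

# Hypothetical data center graph: server capacities and workload placements
graph = {
    "capacity": {"s1": 100, "s2": 100},
    "placement": {"w1": ("s1", 70), "w2": ("s1", 50), "w3": ("s2", 40)},
}


def overload(g):
    """Total load above capacity, summed over servers (0 means no violation)."""
    used = {server: 0 for server in g["capacity"]}
    for server, load in g["placement"].values():
        used[server] += load
    return sum(max(0, used[s] - cap) for s, cap in g["capacity"].items())


def move(workload, target):
    """Build a transformation that relocates one workload to a target server."""
    def apply(g):
        mutated = copy.deepcopy(g)  # never modify the stored representation
        _, load = mutated["placement"][workload]
        mutated["placement"][workload] = (target, load)
        return mutated
    apply.label = f"move {workload} -> {target}"
    return apply


# The "transformations data container" is modeled as a plain list
transformations = [move("w1", "s2"), move("w2", "s2")]
mutated_graphs = [(t.label, t(graph)) for t in transformations]
best_label, best_graph = min(mutated_graphs, key=lambda item: overload(item[1]))
```

In this toy setup, moving `w1` would overload `s2`, so the evaluation selects moving `w2` as the management option, while the original graph is left unchanged.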
Abstract:
It is detected that a metric associated with a first workload has breached a first threshold. It is determined that the first workload and a second workload access the same storage resources, wherein the storage resources are associated with a storage server. It is determined that the metric is impacted by the first workload and the second workload accessing the same storage resources. A candidate solution is identified. An estimated impact of a residual workload is determined based, at least in part, on the candidate solution. A level of caching of at least one of the first workload or the second workload is adjusted based, at least in part, on the estimated impact of the residual workload.
Abstract:
Described herein is a system and method for dynamically managing service-level objectives (SLOs) for workloads of a cluster storage system. Proposed states/solutions of the cluster may be produced and evaluated to select one that achieves the SLOs for each workload. A planner engine may produce a state tree comprising nodes, each node representing a proposed state/solution. New nodes may be added to the state tree based on new solution types that are permitted, or nodes may be removed based on a received time constraint for executing a proposed solution or a client certification of a solution. The planner engine may call an evaluation engine to evaluate proposed states, the evaluation engine using an evaluation function that considers SLO, cost, and optimization goal characteristics to produce a single evaluation value for each proposed state. The planner engine may call a modeler engine that is trained using machine learning techniques.
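The interplay of the state tree, the time-constraint pruning, and the single-valued evaluation function can be sketched as follows. This is a sketch under assumptions: the weighted-sum scoring, the field names, and the example states are hypothetical, and the weights merely stand in for the optimization goal; no learned modeler is included.

```python
from dataclasses import dataclass, field


@dataclass
class StateNode:
    name: str
    slo_met: float     # assumed: fraction of workloads meeting their SLO (0..1)
    cost: float        # assumed: normalized cost of reaching this state (0..1)
    exec_minutes: int  # assumed: time needed to execute the proposed solution
    children: list = field(default_factory=list)


def evaluate(node, w_slo=0.7, w_cost=0.3):
    """Collapse SLO and cost characteristics into one evaluation value."""
    return w_slo * node.slo_met - w_cost * node.cost


def prune(node, max_minutes):
    """Remove proposed states that cannot be executed within the time constraint."""
    node.children = [c for c in node.children if c.exec_minutes <= max_minutes]
    for child in node.children:
        prune(child, max_minutes)


def best_state(node):
    """Return the highest-scoring state in the subtree rooted at node."""
    candidates = [node] + [best_state(c) for c in node.children]
    return max(candidates, key=evaluate)


# Hypothetical state tree with two proposed solutions
root = StateNode("current", slo_met=0.5, cost=0.0, exec_minutes=0)
root.children = [
    StateNode("migrate volume", slo_met=1.0, cost=0.3, exec_minutes=120),
    StateNode("add cache", slo_met=0.9, cost=0.2, exec_minutes=10),
]
unconstrained = best_state(root).name
prune(root, max_minutes=60)
constrained = best_state(root).name
```

Without a time constraint the migration scores highest; once states that take longer than 60 minutes are pruned, adding cache becomes the selected state.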
Abstract:
Analysis is performed on a collection of data that is recorded for a storage system during a first time frame. The recorded collection of data includes a plurality of performance parameters that are determined from, for example, diagnostic tools that continually operate on the storage system. A set of baseline values is determined for each of the plurality of performance parameters by analyzing the recorded collection of data from an older portion of the first time frame. For each parameter, a set of performance parameter values obtained from a recent portion of the first time frame is compared to the corresponding baseline values of that performance parameter. By performing the comparison, one or more anomalies that are indicative of a particular problem on the storage system are determined for one or more of the plurality of performance parameters.
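The baseline-versus-recent comparison can be sketched as below. The abstract does not say how baselines are computed or compared, so using the mean and standard deviation of the older portion with a three-sigma test is an assumption, as are the function name `find_anomalies` and the sample parameter names.

```python
from statistics import mean, stdev


def find_anomalies(samples, split, threshold=3.0):
    """Flag recent values that deviate strongly from the older-portion baseline.

    samples maps a parameter name to values ordered oldest to newest;
    values before index `split` form the older portion (at least two needed).
    """
    anomalies = {}
    for name, values in samples.items():
        older, recent = values[:split], values[split:]
        base, spread = mean(older), stdev(older)
        # Guard against a zero spread when the baseline is perfectly flat
        limit = threshold * max(spread, 1e-9)
        flagged = [v for v in recent if abs(v - base) > limit]
        if flagged:
            anomalies[name] = flagged
    return anomalies


# Hypothetical recorded data: the first four values are the older portion
samples = {
    "latency_ms": [10, 11, 9, 10, 10, 30, 10],
    "cpu_pct": [50, 52, 48, 50, 51, 49, 50],
}
result = find_anomalies(samples, split=4)
```

Here only the latency spike stands out against its baseline; the CPU values stay within their normal spread and are not reported.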