Abstract:
Graph transformations are used by a data management system to correct violations of service-level objectives (SLOs) in a data center. In one aspect, a process is provided to manage a data center by receiving, from a server in the data center, an indication of a violation of a service-level objective associated with the data center. A graph representation and a transformations data container are retrieved by the data management system from data storage accessible to the data management system. The transformations data container includes one or more transformations. Each transformation is applied to a data center representation obtained from the graph representation to create a mutated graph. An option for managing the data center is determined by evaluating the mutated graphs.
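A minimal sketch of the flow this abstract describes, assuming a toy dictionary-based graph, two hypothetical transformations, and a simple SLO-violation score; none of these names or values come from the source.

    import copy

    def apply_transformation(graph, transformation):
        """Return a mutated copy of the data center graph."""
        mutated = copy.deepcopy(graph)
        transformation(mutated)
        return mutated

    def migrate_workload(graph):
        # Hypothetical transformation: move load off the overloaded server.
        graph["server_a"]["load"] -= 10
        graph["server_b"]["load"] += 10

    def add_cache(graph):
        # Hypothetical transformation: give the overloaded server more cache.
        graph["server_a"]["cache_gb"] += 8

    def evaluate(graph, slo_max_load=80):
        """Lower score is better; penalize servers still violating the SLO."""
        return sum(max(0, node["load"] - slo_max_load) for node in graph.values())

    data_center = {
        "server_a": {"load": 95, "cache_gb": 16},  # server reporting the violation
        "server_b": {"load": 40, "cache_gb": 16},
    }
    transformations = [migrate_workload, add_cache]

    mutated_graphs = [(t.__name__, apply_transformation(data_center, t)) for t in transformations]
    best_option = min(mutated_graphs, key=lambda pair: evaluate(pair[1]))
    print("selected option:", best_option[0])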
Abstract:
A change in workload characteristics detected at one tier of a multi-tiered cache is communicated to another tier of the multi-tiered cache. Multiple caching elements exist at different tiers, and at least one tier includes a cache element that is dynamically resizable. The communicated change in workload characteristics causes the receiving tier to adjust at least one aspect of cache performance in the multi-tiered cache. In one aspect, at least one dynamically resizable element in the multi-tiered cache is resized responsive to the change in workload characteristics.
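The notification-and-resize loop can be pictured roughly as follows, assuming a simple two-tier arrangement; the CacheTier class, the tier sizes, and the resize policy are illustrative assumptions rather than the described implementation.

    class CacheTier:
        def __init__(self, name, size_mb, resizable=False):
            self.name = name
            self.size_mb = size_mb
            self.resizable = resizable
            self.peers = []

        def register_peer(self, tier):
            self.peers.append(tier)

        def observe_workload_change(self, change):
            """Called when this tier detects a change in workload characteristics."""
            for peer in self.peers:
                peer.on_workload_change(change)

        def on_workload_change(self, change):
            """Adjust cache behavior in response to a change reported by another tier."""
            if self.resizable and change.get("read_working_set_mb"):
                # Grow toward the reported working-set size (assumed policy).
                self.size_mb = max(self.size_mb, change["read_working_set_mb"])
                print(f"{self.name} resized to {self.size_mb} MB")

    dram_tier = CacheTier("dram", size_mb=512)
    flash_tier = CacheTier("flash", size_mb=4096, resizable=True)
    dram_tier.register_peer(flash_tier)

    # The DRAM tier detects a shift toward a larger read working set and
    # communicates it; the dynamically resizable flash tier resizes in response.
    dram_tier.observe_workload_change({"read_working_set_mb": 8192})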
Abstract:
Collaborative management of shared resources is implemented by a storage server receiving, from a first resource manager, notification of a violation for a service provided by the storage server or by a device coupled to the storage server. The storage server further receives, from each of a plurality of resource managers, an estimated cost of taking a corrective action to mitigate the violation, and selects a corrective action proposed by one of the plurality of resource managers based upon the estimated costs. The storage server then directs the resource manager that proposed the selected corrective action to perform it.
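A rough sketch of the cost-based arbitration described here, assuming a hypothetical ResourceManager interface and made-up cost figures; it only illustrates the select-lowest-cost-and-direct step.

    class ResourceManager:
        def __init__(self, name, action, cost):
            self.name = name
            self._action = action
            self._cost = cost

        def estimate_cost(self, violation):
            """Return the estimated cost of this manager's proposed corrective action."""
            return self._cost

        def perform(self):
            print(f"{self.name}: performing '{self._action}'")

    def handle_violation(violation, managers):
        """Select the proposal with the lowest estimated cost and direct that
        resource manager to carry it out."""
        proposals = [(m.estimate_cost(violation), m) for m in managers]
        _, chosen = min(proposals, key=lambda p: p[0])
        chosen.perform()
        return chosen

    managers = [
        ResourceManager("cache_manager", "increase read cache", cost=3),
        ResourceManager("volume_manager", "migrate volume", cost=10),
        ResourceManager("qos_manager", "throttle competing workload", cost=5),
    ]
    handle_violation({"service": "lun0", "metric": "latency_ms", "observed": 12}, managers)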
Abstract:
Embodiments of the systems and techniques described here can leverage several insights into the nature of workload access patterns and working-set behavior to reduce memory overheads. As a result, various embodiments make it feasible to maintain running estimates of a workload's cacheability in current storage systems with limited resources. For example, some embodiments provide for a method comprising estimating the cacheability of a workload based on a first working-set size estimate generated from the workload over a first monitoring interval. Then, based on the cacheability of the workload, a workload cache size can be determined. A cache can then be dynamically allocated (e.g., changing, possibly frequently, the cache allocation for the workload when the current allocation and the desired workload cache size differ), within a storage system for example, in accordance with the workload cache size.
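One way to picture the estimate-then-allocate loop; counting distinct blocks with a plain set, the sizing rule, and the bounded reallocation step are simplifying assumptions for illustration, not the memory-efficient estimators the abstract alludes to.

    def working_set_size(accesses):
        """Estimate working-set size as the number of distinct blocks touched
        during the monitoring interval."""
        return len(set(accesses))

    def workload_cache_size(ws_blocks, block_kb=4, headroom=1.2):
        """Derive a target cache size (in KB) from the working-set estimate."""
        return int(ws_blocks * block_kb * headroom)

    def reallocate(current_kb, target_kb, step_kb=1024):
        """Move the allocation toward the target in bounded steps."""
        if target_kb > current_kb:
            return min(current_kb + step_kb, target_kb)
        if target_kb < current_kb:
            return max(current_kb - step_kb, target_kb)
        return current_kb

    interval_accesses = [17, 42, 17, 99, 3, 42, 256, 3, 17]  # block IDs seen this interval
    ws = working_set_size(interval_accesses)
    target = workload_cache_size(ws)
    allocation = reallocate(current_kb=8, target_kb=target)
    print(f"working set: {ws} blocks, target cache: {target} KB, new allocation: {allocation} KB")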
Abstract:
Described herein is a system and method for dynamically managing service-level objectives (SLOs) for workloads of a cluster storage system. Proposed states/solutions of the cluster may be produced and evaluated to select one that achieves the SLOs for each workload. A planner engine may produce a state tree comprising nodes, each node representing a proposed state/solution. New nodes may be added to the state tree based on new solution types that are permitted, or nodes may be removed based on a received time constraint for executing a proposed solution or a client certification of a solution. The planner engine may call an evaluation engine to evaluate proposed states, the evaluation engine using an evaluation function that considers SLO, cost, and optimization goal characteristics to produce a single evaluation value for each proposed state. The planner engine may call a modeler engine that is trained using machine learning techniques.
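A sketch of a state tree whose nodes are scored by a single-valued evaluation function over SLO, cost, and optimization-goal terms; the node fields, weights, and proposed states are invented for illustration, and the machine-learning-trained modeler engine is omitted.

    from dataclasses import dataclass, field

    @dataclass
    class StateNode:
        description: str
        slo_met_fraction: float   # fraction of workloads whose SLOs are met
        cost: float               # cost of reaching this proposed state
        goal_score: float         # how well the optimization goal is served
        children: list = field(default_factory=list)

    def evaluate(node, w_slo=0.6, w_cost=0.25, w_goal=0.15):
        """Single evaluation value per proposed state: higher is better."""
        return w_slo * node.slo_met_fraction - w_cost * node.cost + w_goal * node.goal_score

    def best_state(root):
        """Walk the state tree and return the highest-scoring proposed state."""
        stack, best = [root], root
        while stack:
            node = stack.pop()
            if evaluate(node) > evaluate(best):
                best = node
            stack.extend(node.children)
        return best

    root = StateNode("current state", slo_met_fraction=0.7, cost=0.0, goal_score=0.5)
    root.children = [
        StateNode("migrate volume to node 2", 0.95, cost=0.4, goal_score=0.6),
        StateNode("increase cache for workload A", 0.85, cost=0.1, goal_score=0.7),
    ]
    print("selected state:", best_state(root).description)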
Abstract:
It is detected that a metric associated with a first workload has breached a first threshold. It is determined that the first workload and a second workload access the same storage resources, wherein the storage resources are associated with a storage server. It is determined that the metric is impacted by the first workload and the second workload accessing the same storage resources. A candidate solution is identified. An estimated impact of a residual workload is determined based, at least in part, on the candidate solution. A level of caching of at least one of the first workload or the second workload is adjusted based, at least in part, on the estimated impact of the residual workload.
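The detect/attribute/adjust sequence might look roughly like this; the threshold, the residual-workload model, and the capacity figure for the shared volume are assumptions made only to keep the sketch self-contained.

    def breached(metric_value, threshold):
        return metric_value > threshold

    def share_resources(workload_a, workload_b):
        return bool(set(workload_a["volumes"]) & set(workload_b["volumes"]))

    def residual_iops(workload, proposed_hit_rate):
        """Estimate the workload's residual load on shared storage if the
        candidate solution (more caching) achieves the proposed hit rate."""
        return workload["iops"] * (1.0 - proposed_hit_rate)

    w1 = {"name": "oltp", "volumes": ["vol1"], "iops": 5000, "latency_ms": 14}
    w2 = {"name": "analytics", "volumes": ["vol1"], "iops": 2000, "latency_ms": 4}

    if breached(w1["latency_ms"], threshold=10) and share_resources(w1, w2):
        # Candidate solution: cache more of the second workload so the shared
        # volume sees less of its traffic.
        residual = residual_iops(w2, proposed_hit_rate=0.8)
        if residual + w1["iops"] < 6000:  # assumed capacity of the shared volume
            print(f"increase caching for {w2['name']}; residual load {residual:.0f} IOPS")
        else:
            print("candidate solution insufficient; evaluate another option")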