Abstract:
A system and method for managing resources in a distributed computer system that includes at least one resource pool for a set of virtual machines (VMs) utilizes a set of desired individual VM-level resource settings that corresponds to target resource allocations for observed performance of an application running in the distributed computer system. The set of desired individual VM-level resource settings is determined by constructing a model for the observed application performance as a function of current VM-level resource allocations and then inverting the function to compute the target resource allocations in order to meet at least one user-defined service level objective (SLO). The set of desired individual VM-level resource settings is used to determine final resource pool (RP)-level resource settings for a resource pool to which the application belongs and final VM-level resource settings for the VMs running under the resource pool, which are then selectively applied.
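The model-then-invert step can be illustrated with a minimal sketch. It assumes a linear performance model over a single resource (CPU), which the abstract does not specify; the function names and the sample observations are hypothetical.

```python
def fit_linear_model(allocations, performance):
    """Least-squares fit of performance = a * allocation + b (assumed model form)."""
    n = len(allocations)
    mean_x = sum(allocations) / n
    mean_y = sum(performance) / n
    sxx = sum((x - mean_x) ** 2 for x in allocations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(allocations, performance))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def invert_model(a, b, slo_target):
    """Solve a * allocation + b = slo_target for the allocation."""
    if a == 0:
        raise ValueError("performance does not depend on the allocation")
    return (slo_target - b) / a


# Hypothetical observations: CPU allocation (MHz) vs. mean response time (ms).
cpu_mhz = [1000.0, 1500.0, 2000.0, 2500.0]
latency_ms = [400.0, 300.0, 200.0, 100.0]

a, b = fit_linear_model(cpu_mhz, latency_ms)
# Target allocation that would meet a hypothetical 250 ms response-time SLO.
target_cpu = invert_model(a, b, slo_target=250.0)
```

A real controller would presumably model several resources at once and refit as new observations arrive; the single-variable fit here only shows the direction of the computation, from SLO back to allocation.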
Abstract:
A system and method for performing automatic remediation in a distributed computer system with multiple clusters of host computers uses the same placement selection algorithm for initial placements and for remediation placements of clients. The placement selection algorithm is executed to generate a placement solution when a remediation request in response to a remediation-requiring condition in the distributed computer system for at least one client running in one of the multiple clusters of host computers is detected and a remediation placement problem for the client is constructed. The placement solution is then implemented for the client for remediation.
Abstract:
A system and method for performing customized remote resource allocation analyses on distributed computer systems utilizes a snapshot of a distributed computer system, which is received at a remote resource allocation module, to perform a resource allocation analysis using a resource allocation algorithm. The resource allocation algorithm is selected from a plurality of resource allocation algorithms based on at least one user-provided parameter associated with the distributed computer system.
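One way to picture the selection step is a lookup from a user-provided parameter to an algorithm that is then run on the snapshot. The sketch below is only an illustration: the snapshot layout, the `goal` parameter, and both algorithms are hypothetical and are not taken from the abstract.

```python
def spread_evenly(snapshot):
    """Hypothetical algorithm: assign VMs to hosts round-robin."""
    hosts = snapshot["hosts"]
    return {vm: hosts[i % len(hosts)] for i, vm in enumerate(snapshot["vms"])}


def pack_first_host(snapshot):
    """Hypothetical algorithm: consolidate every VM on the first host."""
    return {vm: snapshot["hosts"][0] for vm in snapshot["vms"]}


# Registry of available resource allocation algorithms (assumed names).
ALGORITHMS = {"balance": spread_evenly, "consolidate": pack_first_host}


def analyze(snapshot, user_params):
    """Select an algorithm from the user-provided parameter and run it."""
    goal = user_params.get("goal", "balance")
    if goal not in ALGORITHMS:
        raise ValueError(f"unknown goal: {goal!r}")
    return ALGORITHMS[goal](snapshot)


snapshot = {"hosts": ["h1", "h2"], "vms": ["a", "b", "c"]}
balanced = analyze(snapshot, {"goal": "balance"})
packed = analyze(snapshot, {"goal": "consolidate"})
```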
Abstract:
A cloud management server and method for performing automatic placement of clients in a distributed computer system uses a list of compatible clusters to select an affinity cluster to place the clients associated with an affinity constraint. As part of the placement method, a cluster that cannot satisfy any anti-affinity constraint associated with the clients and the affinity constraint is removed from the list of compatible clusters. After the affinity cluster has been selected, at least one cluster in the distributed computer system is also selected to place clients associated with an anti-affinity constraint.
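The placement flow can be sketched under strong simplifying assumptions: each cluster is described only by a count of free client slots, the compatibility filter checks only whether a cluster can hold the whole affinity group, and anti-affinity is taken to mean that the listed clients must land in different clusters from one another. None of these details come from the abstract, and all names are hypothetical.

```python
def place_clients(free_slots, affinity_clients, anti_affinity_clients):
    """free_slots: cluster name -> free client slots (assumed capacity model)."""
    need = len(affinity_clients)
    # Remove clusters that cannot hold the whole affinity group.
    compatible = {name: free for name, free in free_slots.items() if free >= need}
    if not compatible:
        raise RuntimeError("no compatible cluster for the affinity group")
    # Select the affinity cluster: here, the compatible cluster with most room.
    affinity_cluster = max(compatible, key=compatible.get)

    remaining = dict(free_slots)
    remaining[affinity_cluster] -= need
    placement = {client: affinity_cluster for client in affinity_clients}

    # Place anti-affinity clients, each in a cluster not yet used by the others.
    used = set()
    for client in anti_affinity_clients:
        candidates = [name for name, free in remaining.items()
                      if free > 0 and name not in used]
        if not candidates:
            raise RuntimeError(f"no cluster left for {client!r}")
        chosen = max(candidates, key=remaining.get)
        remaining[chosen] -= 1
        used.add(chosen)
        placement[client] = chosen
    return placement


placement = place_clients({"c1": 4, "c2": 2, "c3": 1}, ["a", "b", "c"], ["x", "y"])
```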
Abstract:
An automatic scaling system and method for reducing state space in reinforcement learning for automatic scaling of a multi-tier application uses a state decision tree that is updated with new states of the multi-tier application. When a new state of the multi-tier application is received, the new state is placed in an existing node of the state decision tree only if a first attribute of the new state is the same as a first attribute of any state contained in the existing node and a second attribute of the new state is sufficiently similar to a second attribute of each existing state contained in the existing node based on a similarity measurement of the second attribute of each state contained in the existing node with the second attribute of the new state.
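The node-placement rule can be sketched as follows. This is a simplification: the nodes are held in a flat list rather than a tree, each state is a hypothetical pair of a discrete first attribute and a numeric second attribute, and "sufficiently similar" is assumed to mean an absolute difference within a tolerance.

```python
def place_state(nodes, state, tolerance):
    """Add state to a matching node, or start a new node if none matches.

    nodes: list of nodes, each a non-empty list of (first, second) states.
    """
    first, second = state
    for node in nodes:
        # States in a node share the first attribute, so check one of them.
        same_first = node[0][0] == first
        # The second attribute must be close to that of EVERY state in the node.
        similar = all(abs(existing[1] - second) <= tolerance for existing in node)
        if same_first and similar:
            node.append(state)
            return node
    new_node = [state]
    nodes.append(new_node)
    return new_node


nodes = []
# Hypothetical states: (number of VMs in a tier, request rate).
place_state(nodes, (2, 100.0), tolerance=5.0)
place_state(nodes, (2, 104.0), tolerance=5.0)  # joins the first node
place_state(nodes, (3, 101.0), tolerance=5.0)  # different first attribute
place_state(nodes, (2, 107.0), tolerance=5.0)  # too far from 100.0
```

Grouping similar states into one node is what keeps the state space small: the learner can treat every state in a node as the same state.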
Abstract:
Systems and methods for finding solutions exhaustively in distributed load balancing are provided. A plurality of virtual machines (VMs) is in communication with a virtual machine management server (VMMS). The VMMS is configured to generate a matrix that represents a mapping of the VMs to a plurality of hosts and to calculate a first imbalance metric of the matrix. The VMMS is also configured to identify a plurality of candidate migrations of the VMs. The VMMS searches through the solution space efficiently and can perform an exhaustive search to find the optimal solution. For each candidate migration, the VMMS is configured to alter the matrix to represent the candidate migration and to calculate a candidate imbalance metric based on the altered matrix. The VMMS is also configured to determine which candidate migration to perform based at least in part on the candidate imbalance metric for each candidate migration and the first imbalance metric.
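The matrix-based evaluation of candidate migrations can be sketched as below. The sketch assumes a 0/1 placement matrix, a single load value per VM, and the population standard deviation of host loads as the imbalance metric; the abstract does not define the metric, and the names are hypothetical. Only single-VM migrations are enumerated, not multi-step solutions.

```python
from statistics import pstdev


def host_loads(matrix, vm_loads):
    """matrix[v][h] == 1 when VM v runs on host h."""
    n_hosts = len(matrix[0])
    return [sum(vm_loads[v] * row[h] for v, row in enumerate(matrix))
            for h in range(n_hosts)]


def imbalance(matrix, vm_loads):
    """Assumed imbalance metric: standard deviation of the host loads."""
    return pstdev(host_loads(matrix, vm_loads))


def best_migration(matrix, vm_loads):
    """Try every single-VM migration and keep the one that lowers imbalance most."""
    first_metric = imbalance(matrix, vm_loads)
    best, best_metric = None, first_metric
    for v, row in enumerate(matrix):
        src = row.index(1)
        for dst in range(len(row)):
            if dst == src:
                continue
            # Alter a copy of the matrix to represent the candidate migration.
            candidate = [r[:] for r in matrix]
            candidate[v][src], candidate[v][dst] = 0, 1
            metric = imbalance(candidate, vm_loads)
            if metric < best_metric:
                best, best_metric = (v, src, dst), metric
    return best, first_metric, best_metric


# Three VMs, all on host 0 of two hosts, with hypothetical loads.
matrix = [[1, 0], [1, 0], [1, 0]]
vm_loads = [4, 3, 1]
move, before, after = best_migration(matrix, vm_loads)
```

Here `move` is a `(vm, source_host, destination_host)` tuple, or `None` when no single migration improves on the first imbalance metric.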
Abstract:
The current document is directed to an analysis subsystem within a large distributed computing system, such as a virtual data center or cloud-computing facility, that monitors the operational states associated with a multi-tiered application and provides useful information for determining one or more causes of various types of failures and undesirable operational states that may arise during operation of the multi-tiered application. In one implementation, the analysis subsystem collects metrics provided by various different types of metrics sources within the computational system and employs principal feature analysis to select a generally small subset of the collected metrics particularly relevant to monitoring a multi-tiered application and diagnosing underlying causes of operational states of the multi-tiered application. The analysis subsystem develops one or more conditional probability distributions with respect to the subset of metrics. These one or more conditional probability distributions, in turn, allow the analysis subsystem to provide useful information for analysis of the causes of failures and undesirable system states associated with the multi-tiered application.
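The final step, building a conditional probability distribution over a selected metric, can be sketched with simple counting. The sketch assumes the metric has already been selected and discretized into buckets and that each sample is labeled with an operational state; it does not implement principal feature analysis, and the sample data and names are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_distribution(samples):
    """Estimate P(state | metric bucket) from (bucket, state) observations."""
    counts = defaultdict(Counter)
    for bucket, state in samples:
        counts[bucket][state] += 1
    return {
        bucket: {state: n / sum(states.values()) for state, n in states.items()}
        for bucket, states in counts.items()
    }


# Hypothetical observations: (database latency bucket, application state).
samples = [
    ("high", "failed"),
    ("high", "failed"),
    ("high", "ok"),
    ("low", "ok"),
]
dist = conditional_distribution(samples)
```

A distribution such as `dist["high"]` indicates how strongly a metric value is associated with an undesirable state, which is the kind of information the abstract describes as useful for diagnosing causes.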