Abstract:
Techniques for optimizing virtual machine (VM) storage performance in a hyper-converged infrastructure (HCI) deployment comprising a stretched cluster of host systems are provided. In one set of embodiments, a computer system can identify one or more stretched VMs in the stretched cluster, where storage objects associated with the one or more stretched VMs are replicated across the sites of the cluster. The computer system can further determine, for each stretched VM, whether a greater number of the VM's storage objects are accessible to the VM via site-local replica copies residing at a first site where the VM is currently running, or via site-remote replica copies residing at a second site where the VM is not currently running. If a greater number of the VM's storage objects are accessible to the VM via the site-remote replica copies, the VM can be migrated from the first site to the second site.
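The migration decision described above can be illustrated with a minimal sketch. The `StorageObject` record and the `should_migrate` function are hypothetical names introduced here for illustration; the abstract does not specify how accessibility is tracked, so each object simply carries two boolean flags.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StorageObject:
    """Hypothetical record for one storage object of a stretched VM."""
    name: str
    local_accessible: bool   # replica at the VM's current (first) site is reachable
    remote_accessible: bool  # replica at the other (second) site is reachable


def should_migrate(objects: List[StorageObject]) -> bool:
    """Return True when more objects are reachable via site-remote replicas
    than via site-local replicas, which is the condition the abstract gives
    for moving the VM to the second site."""
    local = sum(1 for o in objects if o.local_accessible)
    remote = sum(1 for o in objects if o.remote_accessible)
    return remote > local
```

In this sketch a tie keeps the VM where it is, since the abstract only calls for migration when the remote count is strictly greater.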
Abstract:
Techniques for optimizing cluster-wide operations in a hyper-converged infrastructure (HCI) deployment are provided. In one set of embodiments, a computer system can receive a request to initiate a cluster-wide operation on a cluster of the HCI deployment, where the cluster includes a plurality of host systems, and where the cluster-wide operation involves a host-by-host evacuation of virtual machines (VMs) and storage components from the plurality of host systems. The computer system can further generate a set of recommendations for executing the host-by-host evacuation in a manner that minimizes the total amount of time needed to complete the cluster-wide operation. The computer system can then execute the host-by-host evacuation in accordance with the set of recommendations.
Abstract:
An example method of placing a virtual machine (VM) in a cluster of hosts is described. Each of the hosts has a hypervisor managed by a virtualization management server for the cluster, and the hosts are separated into a plurality of non-uniform memory access (NUMA) domains. The method includes: comparing a virtual central processing unit (vCPU) and memory configuration of the VM with physical NUMA topologies of the hosts; selecting a set of the hosts spanning at least one of the NUMA domains, each host in the set of hosts having a physical NUMA topology that maximizes locality for vCPU and memory resources of the VM as specified in the vCPU and memory configuration; and providing the set of hosts to a distributed resource scheduler (DRS) executing in the virtualization management server, the DRS configured to place the VM in a host selected from the set of hosts.
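One way to picture the host-selection step is the sketch below. It assumes, purely for illustration, that "maximizing locality" means fitting the VM into the fewest physical NUMA nodes; the `Host` fields and the `nodes_needed` and `candidate_hosts` helpers are hypothetical and not taken from the abstract.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    """Hypothetical description of a host's physical NUMA topology."""
    name: str
    numa_nodes: int
    cores_per_node: int
    mem_gb_per_node: int


def nodes_needed(host: Host, vcpus: int, mem_gb: int) -> Optional[int]:
    """Number of NUMA nodes the VM would span on this host,
    or None if the VM does not fit on the host at all."""
    n = max(math.ceil(vcpus / host.cores_per_node),
            math.ceil(mem_gb / host.mem_gb_per_node))
    return n if n <= host.numa_nodes else None


def candidate_hosts(hosts: List[Host], vcpus: int, mem_gb: int) -> List[str]:
    """Names of the hosts on which the VM spans the fewest NUMA nodes.
    This set would then be handed to the scheduler for final placement."""
    fitting = [(n, h) for h in hosts
               if (n := nodes_needed(h, vcpus, mem_gb)) is not None]
    if not fitting:
        return []
    best = min(n for n, _ in fitting)
    return [h.name for n, h in fitting if n == best]
```

The function returns a set of candidates rather than a single host because, per the abstract, the final choice among them is left to the DRS.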
Abstract:
Techniques for ensuring sufficient available storage capacity for data resynchronization or data reconstruction in a cluster of a hyper-converged infrastructure (HCI) deployment are provided. In one set of embodiments, a computer system can receive a request to provision or reconfigure an object on the cluster. The computer system can further calculate one or more storage capacity reservations for one or more host systems in the cluster, where the one or more storage capacity reservations indicate one or more amounts of local storage capacity to reserve on the one or more host systems respectively in order to ensure successful data resynchronization or data reconstruction in the case of a host system failure or maintenance event. If placement of the object on the cluster will result in a conflict with the one or more storage capacity reservations, the computer system can deny the request to provision or reconfigure the object.
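The admission check can be sketched as follows. The reservation formula is an assumption made for this example, not the one in the abstract: each of the n hosts reserves 1/n of its own capacity, which for equally sized hosts leaves enough free space on the survivors to rebuild the data of any single failed host. The function names are hypothetical.

```python
from typing import Dict


def host_reservations(capacity_gb: Dict[str, float]) -> Dict[str, float]:
    """Assumed policy: every host reserves 1/n of its capacity,
    where n is the number of hosts in the cluster."""
    n = len(capacity_gb)
    if n < 2:
        raise ValueError("need at least two hosts to rebuild after a failure")
    return {host: cap / n for host, cap in capacity_gb.items()}


def can_place(capacity_gb: Dict[str, float],
              used_gb: Dict[str, float],
              placement_gb: Dict[str, float]) -> bool:
    """Return False (deny the request) if placing the object's components
    would push any host into its reserved capacity."""
    reservations = host_reservations(capacity_gb)
    for host, extra in placement_gb.items():
        if used_gb[host] + extra + reservations[host] > capacity_gb[host]:
            return False
    return True
```

With four 1000 GB hosts, each reserves 250 GB, so a host already holding 700 GB cannot accept another 100 GB component under this policy.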
Abstract:
A method for adjusting the configuration of host computers in a cluster on which virtual machines are running in response to a failed change in state is disclosed. The method involves receiving at least one reason a change in state failed a present check or a future check, associating the at least one reason with at least one remediation action, wherein the remediation action would allow the change in state to pass both the present check and the future check, assigning the at least one remediation action a cost, and determining a set of remediation actions to perform based on the cost assigned to each remediation action. In an embodiment, the steps of this method may be implemented in a non-transitory computer-readable storage medium having instructions that, when executed in a computing device, cause the computing device to carry out the steps.
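The reason-to-remediation mapping and cost-based selection might look like the sketch below. The catalog entries, action names, and costs are invented for illustration, and "pick the cheapest action per reason" is only one possible selection rule; the abstract does not say how costs are combined.

```python
from typing import Dict, List, Tuple

# Hypothetical catalog: failure reason -> list of (remediation action, cost).
REMEDIATIONS: Dict[str, List[Tuple[str, int]]] = {
    "insufficient_cpu": [("add_host", 10), ("power_off_low_priority_vm", 3)],
    "insufficient_memory": [("add_host", 10), ("reduce_reservation", 2)],
}


def plan_remediation(reasons: List[str],
                     catalog: Dict[str, List[Tuple[str, int]]]
                     ) -> List[Tuple[str, int]]:
    """For each failure reason, choose its lowest-cost remediation action.
    Duplicate actions are merged; the result is sorted by ascending cost."""
    chosen: Dict[str, int] = {}
    for reason in reasons:
        options = catalog.get(reason, [])
        if not options:
            continue  # no known remediation for this reason
        action, cost = min(options, key=lambda option: option[1])
        chosen[action] = cost
    return sorted(chosen.items(), key=lambda item: item[1])
```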
Abstract:
A method for supporting a change in state within a cluster of host computers that run virtual machines is disclosed. The method involves identifying a change in state within a cluster of host computers that run virtual machines, determining if predefined criteria for available resources within the cluster of host computers can be met by resources available in the cluster of host computers, and determining if predefined criteria for available resources within the cluster of host computers can be maintained after at least one different predefined change in state. In an embodiment, the steps of this method may be implemented in a non-transitory computer-readable storage medium having instructions that, when executed in a computing device, cause the computing device to carry out the steps.
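The two determinations above can be read as a present check followed by a future check. The sketch below assumes a single resource dimension (free capacity per host, in arbitrary units) and takes "a different predefined change in state" to be the loss of any one host; both are assumptions made for this example, and the function names are hypothetical.

```python
from typing import Dict


def present_check(host_free: Dict[str, int], demand: int) -> bool:
    """Can the cluster's currently free resources satisfy the demand
    introduced by the change in state?"""
    return sum(host_free.values()) >= demand


def future_check(host_free: Dict[str, int], demand: int) -> bool:
    """Would the demand still be satisfiable after one further change,
    assumed here to be the failure of any single host?"""
    return all(
        sum(free for host, free in host_free.items() if host != failed) >= demand
        for failed in host_free
    )


def change_allowed(host_free: Dict[str, int], demand: int) -> bool:
    """The change in state passes only if both checks pass."""
    return present_check(host_free, demand) and future_check(host_free, demand)
```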
Abstract:
A method for storage management of an object among a plurality of storage devices of a datacenter is provided. The method, in response to receiving an input on a selection item presented through a UI, determines that a manual storage management of an object is selected. The method then receives a storage policy for storing the object. Based on the storage policy, the method defines a plurality of components for the object and determines whether a set of one or more storage resources is available for storing the plurality of components. When the method determines that the set is available, for each component, the method presents the set of storage resources, receives a selection of a storage resource in the set to store the component, and updates the set based on the policy and the selection before presenting the updated set to select from for storing a next component.
Abstract:
Techniques for orchestrating and prioritizing the rebuild of storage object components in a hyper-converged infrastructure (HCI) deployment comprising a cluster of host systems are provided. In one set of embodiments, a computer system can identify a list of storage object components impacted by a maintenance event or failure of a host system in the cluster. The computer system can further determine a priority class for each storage object component in the list, where the determined priority class is based on a virtual machine (VM)-level priority class assigned to a VM to which the storage object component belongs. The computer system can then initiate rebuilds of the storage object components in the list on a per-VM and per-priority class basis, such that: (1) the rebuilds of storage object components belonging to the same VM are initiated consecutively, and (2) the rebuilds of storage object components with higher priority classes are initiated before the rebuilds of storage object components with lower priority classes.
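The ordering rule in points (1) and (2) can be expressed as a single sort. In this sketch a component is a hypothetical `(component_id, vm_name)` pair, and a lower number means a higher priority class; neither convention comes from the abstract.

```python
from typing import Dict, List, Tuple

Component = Tuple[str, str]  # (component_id, vm_name), hypothetical shape


def rebuild_order(components: List[Component],
                  vm_priority: Dict[str, int]) -> List[Component]:
    """Order impacted components for rebuild.

    Each component inherits the priority class of its VM. Sorting by
    (priority class, VM name) starts higher-priority rebuilds first and
    keeps all components of one VM next to each other, because every
    component of a VM shares the same sort key. Python's sort is stable,
    so components of the same VM keep their original relative order."""
    return sorted(components, key=lambda c: (vm_priority[c[1]], c[1]))
```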
Abstract:
In certain embodiments, a computer system can create first and second pluralities of disk groups in a hyperconverged infrastructure (HCI) cluster, where each disk group in the first plurality has capacity storage devices of a first type and each disk group in the second plurality has capacity storage devices of a second type. The computer system can further tag each disk group in the first plurality with a first disk group tag, tag each disk group in the second plurality with a second disk group tag, and create a storage policy that includes a placement rule identifying the first disk group tag. Then, at a time of provisioning a virtual machine (VM) in the HCI cluster that is associated with the storage policy, the computer system can place the VM on one or more of the first plurality of disk groups in accordance with the placement rule identifying the first disk group tag.
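Tag-based placement can be sketched as a simple filter. The dictionary layout, the tag values, and the `placement_tag` key are hypothetical; the abstract only states that disk groups carry tags and that the storage policy's placement rule names one of them.

```python
from typing import Dict, List

# Hypothetical inventory: each disk group is tagged by its capacity device type.
DISK_GROUPS: List[Dict[str, str]] = [
    {"name": "dg1", "tag": "fast"},
    {"name": "dg2", "tag": "dense"},
    {"name": "dg3", "tag": "fast"},
]

# Hypothetical storage policy whose placement rule identifies the first tag.
POLICY: Dict[str, str] = {"name": "gold", "placement_tag": "fast"}


def eligible_disk_groups(disk_groups: List[Dict[str, str]],
                         policy: Dict[str, str]) -> List[str]:
    """Return the names of disk groups whose tag matches the policy's
    placement rule; a VM using this policy would be placed only on these."""
    return [dg["name"] for dg in disk_groups
            if dg["tag"] == policy["placement_tag"]]
```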