Abstract:
A system management method for a management computer coupled to a computer system, the computer system including a plurality of computers, an operations system being built on the computer system, the operations system including a plurality of task nodes each having computer resources allocated thereto, the system management method including: a step of analyzing a configuration of the computer system to specify an important node, which is an important task node in the operations system; a step of changing an allocation amount of the computer resources allocated to the important node in order to measure a load of the operations system; a step of calculating a first weighting representing a strength of associations among the plurality of task nodes based on a measurement result of the load; and a step of specifying a range impacted by a change in the load of the important node based on the calculated first weighting.
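The final step, specifying the impacted range from the weightings, can be illustrated with a small graph traversal. This is only a sketch under stated assumptions: the abstract does not say how the weightings are turned into a range, so the `threshold` cut-off, the undirected edge map, and the names `impact_range`, `weights`, and the example node names are all hypothetical. The weightings are taken as given here rather than computed from load measurements.

```python
from collections import deque


def impact_range(weights, important_node, threshold):
    """Return the task nodes reachable from important_node through
    associations whose weighting is at least threshold (assumed rule)."""
    # Keep only associations considered strong enough to propagate load changes.
    neighbors = {}
    for (a, b), w in weights.items():
        if w >= threshold:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    # Breadth-first search outward from the important node.
    seen = {important_node}
    queue = deque([important_node])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical weightings between task nodes.
weights = {("web", "app"): 0.9, ("app", "db"): 0.7, ("app", "batch"): 0.1}
affected = impact_range(weights, "db", threshold=0.5)
```

With these example values, the weakly associated `batch` node falls outside the range, while `app` and `web` are included along with `db` itself.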
Abstract:
An application associated with a processor reads a first value of a counter and a second value of the counter. The counter is indicative of a migration status of the application with respect to the processor. Responsive to determining that the first value of the counter does not equal the second value of the counter, the application ascertains whether a value of a hardware parameter associated with the processor has changed during a time interval. The migration status indicates a count of the number of times the application has migrated from one processor to another processor. The application determines the validity of a value of a performance monitoring unit derived from the hardware parameter in view of the application ascertaining whether the value of the hardware parameter has changed during the time interval.
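The double read of the migration counter can be sketched as follows. This is a simulation, not a real kernel or PMU interface: `FakeCpuState`, its fields, the `during_read` hook, and the choice to derive the value by multiplying a raw count by the hardware parameter are all assumptions made for illustration.

```python
class FakeCpuState:
    """Hypothetical stand-in for state exposed to the application."""

    def __init__(self):
        self.migration_count = 0  # times the application has changed processor
        self.hw_param = 1000      # assumed hardware parameter, e.g. a scaling factor
        self.pmu_raw = 0          # assumed raw counter reading


def read_pmu(state, during_read=None):
    """Return (derived_value, is_valid) for one PMU read."""
    first = state.migration_count       # first value of the counter
    param_before = state.hw_param
    raw = state.pmu_raw
    if during_read is not None:
        during_read(state)              # hook that simulates activity mid-read
    param_after = state.hw_param
    second = state.migration_count      # second value of the counter

    derived = raw * param_before
    if first == second:
        # No migration happened during the interval.
        return derived, True
    # A migration happened: the value is trusted only if the hardware
    # parameter it was derived from did not change during the interval.
    return derived, param_before == param_after
```

A caller that gets `is_valid == False` would typically retry the read; that retry policy is not part of the abstract and is left out here.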
Abstract:
A method, computer-readable medium, and system for monitoring usage of computing resources provisioned across multiple cloud providers and/or data centers are disclosed. Events associated with usage of a plurality of computing resources may be accessed, where the plurality of computing resources may implement a virtual machine, a plurality of virtual machines of a cloud computing environment, etc. The events may be associated with a start, a stop, a status change, etc., of the plurality of computing resources. The events may be used to generate usage data for the plurality of computing resources. The usage data may include historical data associated with previous usage of the plurality of computing resources. Additionally, the usage data may be displayed using a graphical user interface, thereby enabling monitoring and/or tracking of usage of computing resources provisioned across at least one cloud provider and/or at least one data center.
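Generating usage data from start and stop events can be sketched as a simple accumulation. The event tuple layout, the resource identifiers, and the function name `usage_seconds` are assumptions for illustration; the abstract does not define an event format, and status-change events are ignored in this sketch.

```python
from collections import defaultdict


def usage_seconds(events):
    """Sum running time per resource.

    events: iterable of (timestamp_seconds, resource_id, kind),
    where kind is "start" or "stop" (assumed format).
    """
    running = {}                  # resource_id -> timestamp of the open start
    totals = defaultdict(float)   # resource_id -> accumulated seconds
    for ts, rid, kind in sorted(events):
        if kind == "start":
            # Ignore a duplicate start while the resource is already running.
            running.setdefault(rid, ts)
        elif kind == "stop" and rid in running:
            totals[rid] += ts - running.pop(rid)
    return dict(totals)


# Hypothetical events from two providers, deliberately out of order.
events = [
    (0, "vm-a", "start"),
    (50, "vm-b", "start"),
    (100, "vm-a", "stop"),
    (80, "vm-b", "stop"),
    (200, "vm-a", "start"),
    (260, "vm-a", "stop"),
]
usage = usage_seconds(events)
```

Sorting by timestamp first lets events from different providers be merged without assuming they arrive in order.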
Abstract:
A mechanism for accessing and processing monitoring data resulting from customized monitoring of system activities. A method of embodiments of the invention includes invoking, via a Command-Line Interface (CLI) shell console, a performance monitor at a host computer system to perform monitoring of activities of a plurality of system components of one or more computer systems. The CLI shell console provides an abstraction layer for interfaces and further provides host performance information via a common interface independent of operating systems, monitoring use-cases, monitoring tools, or programming languages employed at the host computer system. The method further includes accessing monitoring data generated from monitoring of the activities by the performance monitor.
Abstract:
Various embodiments monitor system noise in a parallel computing system. In one embodiment, at least one set of system noise data is stored in a shared buffer during a first computation interval. The set of system noise data is detected during the first computation interval and is associated with at least one parallel thread in a plurality of parallel threads. Each thread in the plurality of parallel threads is a thread of a program. The set of system noise data is filtered during a second computation interval based on at least one filtering condition, creating a filtered set of system noise data. The filtered set of system noise data is then stored.
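The two-interval flow (collect into a shared buffer, then filter) can be sketched as below. The record fields, the duration-based filtering condition, and the names `NoiseSample` and `filter_noise` are assumptions; the abstract only says that at least one filtering condition is applied.

```python
from dataclasses import dataclass


@dataclass
class NoiseSample:
    thread_id: int      # parallel thread the noise was attributed to
    source: str         # assumed label, e.g. "timer_irq"
    duration_us: float  # assumed measure of how long the thread was disturbed


def filter_noise(shared_buffer, min_duration_us):
    """Keep only samples that meet the (assumed) filtering condition."""
    return [s for s in shared_buffer if s.duration_us >= min_duration_us]


# First computation interval: detected noise is appended to the shared buffer.
shared_buffer = [
    NoiseSample(0, "timer_irq", 2.0),
    NoiseSample(1, "page_fault", 35.0),
    NoiseSample(1, "timer_irq", 1.5),
]

# Second computation interval: the buffer is filtered and the result stored.
filtered = filter_noise(shared_buffer, min_duration_us=10.0)
```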
Abstract:
Exemplary methods, apparatuses, and systems include a host computer selecting a first workload of a plurality of workloads running on the host computer to be subjected to an input/output (I/O) trace. The host computer determines whether to generate the I/O trace for the first workload for a first length of time or for a second length of time. The first length of time is shorter than the second length of time. The determination is based upon runtime history for the first workload, I/O trace history for the first workload, and/or workload type of the first workload. The host computer generates the I/O trace of the first workload for the selected length of time.
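The choice between the shorter and the longer trace can be sketched as a small decision function. The abstract names the inputs (runtime history, I/O trace history, workload type) but not the rule, so the policy below is invented purely for illustration, as are the thresholds, the durations, and the names `choose_trace_seconds` and `LONG_TRACE_TYPES`.

```python
# Assumed set of workload types that justify a long trace.
LONG_TRACE_TYPES = {"database", "mail-server"}


def choose_trace_seconds(runtime_hours, prior_trace_count, workload_type,
                         short_s=60, long_s=3600):
    """Pick a trace length from the three inputs (invented policy)."""
    if runtime_hours < 1.0:
        # Recently started workload: it may not live long enough for a long trace.
        return short_s
    if prior_trace_count == 0:
        # No trace history yet: take an inexpensive first look.
        return short_s
    if workload_type in LONG_TRACE_TYPES:
        return long_s
    return short_s
```

For example, `choose_trace_seconds(48, 2, "database")` selects the long duration, while the same workload with no prior traces gets the short one.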
Abstract:
Techniques for detection and handling of virtual appliance failures. In one aspect, a method is implemented on a host platform on which a hypervisor (aka Virtual Machine Manager) and a plurality of virtual machines (VMs) are running, the plurality of VMs collectively hosting a plurality of Software Defined Networking (SDN) and/or Network Function Virtualization (NFV) appliances that are communicatively coupled via a virtual network. A software-based entity running on the host platform is configured to monitor the plurality of virtual network appliances to detect failures of the virtual network appliances. In response to detection of a virtual network appliance failure, messages containing configuration information are implemented to reconfigure packet flows to bypass the virtual network appliance that has failed.
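The reconfiguration step can be sketched as rewriting each packet flow's appliance chain so that it skips failed appliances. The data layout (a mapping from flow name to an ordered list of appliance names), the health-check callback, and the names `bypass_failed` and `chains` are assumptions; a real deployment would push the new chains to the virtual switch through configuration messages, which this sketch does not model.

```python
def bypass_failed(chains, is_healthy):
    """Return new chains with unhealthy appliances removed.

    chains: {flow_name: [appliance_name, ...]} in traversal order (assumed).
    is_healthy: callable taking an appliance name and returning a bool.
    """
    return {
        flow: [appliance for appliance in chain if is_healthy(appliance)]
        for flow, chain in chains.items()
    }


# Hypothetical service chains on one host platform.
chains = {
    "tenant-a": ["firewall", "ids", "load-balancer"],
    "tenant-b": ["firewall", "nat"],
}
failed = {"ids"}  # result of the monitoring entity's failure detection
new_chains = bypass_failed(chains, lambda name: name not in failed)
```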
Abstract:
A management server is provided in a computer system having one or more hosts, one or more storage systems and one or more switches, the hosts having a plurality of virtual machines, each virtual machine being defined according to a service level agreement. The management server is operable to manage the virtual machines and resources associated with the virtual machines; receive a notification of an event from a node in the computer system; determine if the event affects a service level agreement for any of the virtual machines defined in the computer system, the service level agreements listing required attributes for the corresponding virtual machines; allocate a new resource for a virtual machine whose service level agreement is affected by the event; and move the virtual machine whose service level agreement is affected by the event to the newly allocated resource.
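The event-handling sequence (check the service level agreement, allocate, move) can be sketched as follows. This is a simplified model under stated assumptions: attributes are plain strings, an event means a resource has lost one attribute, each resource hosts at most one virtual machine, and the names `handle_event`, `vms`, and `resources` are hypothetical.

```python
def handle_event(vms, resources, event):
    """Move VMs whose SLA is no longer met; return {vm_name: new_resource}.

    vms: {name: {"sla": set of required attributes, "resource": resource name}}
    resources: {name: set of attributes currently provided}
    event: (resource_name, lost_attribute)  (assumed event shape)
    """
    resource_name, lost_attribute = event
    resources[resource_name].discard(lost_attribute)

    moved = {}
    in_use = {vm["resource"] for vm in vms.values()}
    for name, vm in vms.items():
        if vm["sla"] <= resources[vm["resource"]]:
            continue  # SLA still satisfied, nothing to do
        # Allocate the first free resource that provides every required attribute.
        for candidate, attrs in resources.items():
            if candidate not in in_use and vm["sla"] <= attrs:
                in_use.add(candidate)
                vm["resource"] = candidate  # "move" the VM
                moved[name] = candidate
                break
    return moved


resources = {
    "pool-1": {"ha", "ssd"},
    "pool-2": {"ssd"},
    "pool-3": {"ha", "ssd"},
}
vms = {
    "vm-1": {"sla": {"ha"}, "resource": "pool-1"},
    "vm-2": {"sla": {"ssd"}, "resource": "pool-2"},
}
moved = handle_event(vms, resources, ("pool-1", "ha"))
```

In this example only `vm-1` depends on the lost attribute, so it alone is moved to the free `pool-3`.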
Abstract:
A server device includes a virtualization control unit, a storing unit, and a transferring unit. The virtualization control unit operates a virtual machine, which is a virtualized computer, to control a migration of the virtual machine with another server device. The storing unit stores a log created by the virtual machine in association with that virtual machine. When the virtual machine is migrated to the other server device, the transferring unit transfers the stored log of the virtual machine being migrated to the other server device.
Abstract:
Methods and systems for managing, storing, and serving data within a virtualized environment are described. In some embodiments, a data management system may manage the extraction and storage of virtual machine snapshots, provide near instantaneous restoration of a virtual machine or one or more files located on the virtual machine, and enable secondary workloads to directly use the data management system as a primary storage target to read or modify past versions of data. The data management system may allow a virtual machine snapshot of a virtual machine stored within the system to be directly mounted to enable substantially instantaneous virtual machine recovery of the virtual machine.