Abstract:
Methods, systems, and media for managing dynamic memory are disclosed. Embodiments may disclose identifying nodes with having memory for dynamic storage, and reserving a portion of the memory from the identified nodes for a heap pool. After generating a heap pool, embodiments may allocate dynamic storage from the heap pool to tasks received that are associated with one of the identified nodes. More specifically, embodiments identify the node or home node associated with the task, the amount of dynamic storage requested by the task, and create a heap object in the node associated with the task to provide the requested dynamic storage. Some embodiments involve de-allocating the dynamic storage assigned to the task upon receipt of an indication that the task is complete and the dynamic storage is no longer needed for the task. Several of such embodiments return the de-allocated dynamic storage to the heap pool for reuse.
Abstract:
A data collector for a massively parallel computer system obtains call-return stack traceback data for multiple nodes by retrieving partial call-return stack traceback data from each node, grouping the nodes in subsets according to the partial traceback data, and obtaining further call-return stack traceback data from a representative node or nodes of each subset. Preferably, the partial data is a respective instruction address from each node, nodes having identical instruction address being grouped together in the same subset. Preferably, a single node of each subset is chosen and full stack traceback data is retrieved from the call-return stack within the chosen node.
Abstract:
A method and apparatus creates and manages persistent memory (PM) in a multi-node computing system. A PM Manager in the service node creates and manages pools of nodes with various sizes of PM. A node manager uses the pools of nodes to load applications to the nodes according to the size of the available PM. The PM Manager can dynamically adjust the size of the PM according to the needs of the applications based on historical use or as determined by a system administrator. The PM Manager works with an operating system kernel on the nodes to provide persistent memory for application data and system metadata. The PM Manager uses the persistent memory to load applications to preserve data from one application to the next. Also, the data preserved in persistent memory may be system metadata such as file system data that will be available to subsequent applications.
Abstract:
Disclosed is an apparatus, method, and program product that enables distribution of operating system resources on a nodal basis in the same proportions as the expected system workload. The preferred embodiment of the present invention accomplishes this by assigning various types of weights to each node to represent their proportion of the overall balance within the system. Target Weights represent the desired distribution of the workload based on the existing proportions of processor and memory resources on each node. The actual workload balance on the system is represented by Current Weights, which the operating system strives to keep as close to the Target Weights as possible, on an ongoing basis. When the system is started, operating system services distribute their resources nodally in the same proportions as the Target Weights, and can request to be notified if the Target Weights ever change. If processors and/or memory are subsequently added or removed, new Target Weights are calculated at that time, and all services which requested notification are notified so they can redistribute their resources according to the new Target Weights or a stepwise refinement thereof.
Abstract:
Embodiments off the invention provide a mechanism for process migration on a massively parallel computer system. In particular, embodiments of the invention may be used to update process state data for a migrated compute node, such as MPI (or other communication library) state data, across a full collection of compute nodes present in a given parallel system executing a parallel task. Migrating a process form one compute node to another may be useful to address a variety of sub-optimal operating conditions. For example, one or more processes may be migrated to cure network congestion resulting from a poorly mapped task or when a compute node is predicted to experience a hardware failure.
Abstract:
Methods, systems, and media for managing dynamic memory are disclosed. Embodiments may disclose identifying nodes with having memory for dynamic storage, and reserving a portion of the memory from the identified nodes for a heap pool. After generating a heap pool, embodiments may allocate dynamic storage from the heap pool to tasks received that are associated with one of the identified nodes. More specifically, embodiments identify the node or home node associated with the task, the amount of dynamic storage requested by the task, and create a heap object in the node associated with the task to provide the requested dynamic storage. Some embodiments involve de-allocating the dynamic storage assigned to the task upon receipt of an indication that the task is complete and the dynamic storage is no longer needed for the task. Several of such embodiments return the de-allocated dynamic storage to the heap pool for reuse.
Abstract:
A data collector for a massively parallel computer system obtains call-return stack traceback data for multiple nodes by retrieving partial call-return stack traceback data from each node, grouping the nodes in subsets according to the partial traceback data, and obtaining further call-return stack traceback data from a representative node or nodes of each subset. Preferably, the partial data is a respective instruction address from each node, nodes having identical instruction address being grouped together in the same subset. Preferably, a single node of each subset is chosen and full stack traceback data is retrieved from the call-return stack within the chosen node.
Abstract:
Methods, systems, and media for managing dynamic memory are disclosed. Embodiments may disclose identifying nodes with having memory for dynamic storage, and reserving a portion of the memory from the identified nodes for a heap pool. After generating a heap pool, embodiments may allocate dynamic storage from the heap pool to tasks received that are associated with one of the identified nodes. More specifically, embodiments identify the node or home node associated with the task, the amount of dynamic storage requested by the task, and create a heap object in the node associated with the task to provide the requested dynamic storage. Some embodiments involve de-allocating the dynamic storage assigned to the task upon receipt of an indication that the task is complete and the dynamic storage is no longer needed for the task. Several of such embodiments return the de-allocated dynamic storage to the heap pool for reuse.
Abstract:
Embodiments of the present invention provide techniques for protecting critical sections of code being executed in a lightweight kernel environment suited for use on a compute node of a parallel computing system. These techniques avoid the overhead associated with a full kernel mode implementation of a network layer, while also allowing network interrupts to be processed without corrupting shared memory state. In one embodiment, a system call may be used to disable interrupts upon entry to a routine configured to process an event associated with the interrupt.
Abstract:
A filler neck closure system includes a capless fuel tank filler neck and a closure. The closure is coupled to the capless fuel tank filler neck to close an opening into the fuel tank filler neck.