Abstract:
Embodiments relate to a method, system and program product for performing data processing. The system includes a plurality of computer servers configured to perform data processing, a client in processing communication with the computer servers and enabled to request data processing from any of the servers, and a storing component included in the client for storing information relating to requested data to be processed. A processing component is included in each computer server for applying a control lock to data being processed. A reprocessing request component is included in the client for enabling a new computer server to take over processing of requested data upon failure of the previously processing computer server. The new computer server obtains information relating to the requested data from the storing component and control lock information from the processing component, such that the new computer server commences processing at the processing point immediately prior to the failure.
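The takeover flow in the abstract above can be illustrated with a minimal Python sketch. All names here (`Client`, `Server`, `ControlLock`, `fail_at`) are hypothetical and not taken from the source. The sketch also assumes the control lock record is reachable by the replacement server as a shared object, whereas the abstract says only that it is obtained from the processing component.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class ServerFailure(Exception):
    """Raised by a server that fails in the middle of a request."""


@dataclass
class ControlLock:
    # Hypothetical lock record: current processor and next unprocessed index.
    owner: Optional[str] = None
    next_index: int = 0


class Server:
    def __init__(self, name: str, fail_at: Optional[int] = None) -> None:
        self.name = name
        self.fail_at = fail_at  # simulate a crash just before this item index

    def process(self, items: List[str], lock: ControlLock, out: List[str]) -> None:
        lock.owner = self.name
        # Resume from the point recorded in the lock, not from the start.
        for i in range(lock.next_index, len(items)):
            if self.fail_at == i:
                raise ServerFailure(self.name)
            out.append(items[i].upper())  # stand-in for real processing
            lock.next_index = i + 1       # record progress under the lock
        lock.owner = None


class Client:
    def __init__(self, servers: List[Server]) -> None:
        self.servers = servers
        self.stored: Dict[str, List[str]] = {}  # the client's storing component

    def request(self, request_id: str, items: List[str]) -> List[str]:
        self.stored[request_id] = list(items)
        lock = ControlLock()
        out: List[str] = []
        for server in self.servers:  # reprocessing: try the next server on failure
            try:
                server.process(self.stored[request_id], lock, out)
                del self.stored[request_id]
                return out
            except ServerFailure:
                continue
        raise RuntimeError("no server completed the request")
```

With `Client([Server("s1", fail_at=2), Server("s2")])`, a request for four items is started by `s1`, which fails before the third item. `s2` then finishes the request from index 2, so no item is processed twice.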
Abstract:
Embodiments of the invention provide systems and methods for recovering a failed data summarization. According to one embodiment, recovering a failed instance can comprise processing existing summarization instances identified as instances for which a new data summarization instance needs to wait. Upon a completion or a timeout of each of the instances identified as instances for which the new data summarization instance needs to wait, an exclusive lock can be acquired on a table storing scope information for the plurality of data summarization instances. One or more existing data summarization instances that match the new data summarization instance or that have an overlapping scope with the new data summarization instance can be processed, remaining tasks to be performed by the new data summarization instance can be defined, the exclusive lock can be released, and the remaining tasks to be performed by the new data summarization instance can be performed.
Abstract:
Techniques for ensuring deterministic thread context switching in a virtual machine application program include, in one embodiment, providing a single application-level mutex that threads of the executing application program are forced to acquire to execute application code of the virtual machine application program. During a first recorded execution of the virtual machine application program, a record is created and stored in a computer that indicates the order in which threads acquire the application-level mutex. In a subsequent replay execution of the virtual machine application program from the recording, threads of the virtual machine application program are managed to ensure that the application-level mutex is acquired by threads in the same order indicated in the record such that any race conditions that occurred during the recorded execution as a result of executing application code are reproduced during the subsequent replay execution thereby aiding application development personnel in identifying and isolating program errors and bugs related to race conditions.
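The record-and-replay idea in the abstract above can be sketched in Python. This is an illustrative model, not the patented virtual machine mechanism: `OrderedMutex`, `run`, and the thread identifiers are hypothetical names. It also assumes that the replayed workload performs the same number of acquisitions per thread as the recorded one.

```python
import threading
from typing import List, Optional


class OrderedMutex:
    """A single application-level mutex that records or enforces acquisition order."""

    def __init__(self, replay_order: Optional[List[str]] = None) -> None:
        self._cond = threading.Condition()
        self._held = False
        self._replay = replay_order  # None means "record mode"
        self._next = 0               # index of the next acquisition
        self.record: List[str] = []  # order in which threads acquired the mutex

    def _is_turn(self, thread_id: str) -> bool:
        if self._replay is None:
            return True
        return self._next < len(self._replay) and self._replay[self._next] == thread_id

    def acquire(self, thread_id: str) -> None:
        with self._cond:
            # In replay mode, block until the recorded order says it is our turn.
            self._cond.wait_for(lambda: not self._held and self._is_turn(thread_id))
            self._held = True
            self.record.append(thread_id)
            self._next += 1

    def release(self) -> None:
        with self._cond:
            self._held = False
            self._cond.notify_all()


def run(mutex: OrderedMutex, thread_ids: List[str], rounds: int) -> List[str]:
    """Run one worker per id; each acquires and releases the mutex `rounds` times."""
    def worker(tid: str) -> None:
        for _ in range(rounds):
            mutex.acquire(tid)
            mutex.release()

    threads = [threading.Thread(target=worker, args=(tid,)) for tid in thread_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mutex.record
```

A first call such as `recorded = run(OrderedMutex(), ["t1", "t2", "t3"], 5)` yields whatever order the scheduler happened to produce. Passing that list to a second `OrderedMutex` forces the replay run to acquire the mutex in the same order.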
Abstract:
A lock manager running on a machine may write a first entry for a first process to a queue associated with a resource. If the first entry is not at a front of the queue, the lock manager identifies a second entry that is at the front of the queue, and determines whether a second process associated with the second entry is operational. If the second process is not operational, the lock manager removes the second entry from the queue. Additionally, if the queue becomes unavailable, the lock manager may initiate failover to a backup copy of the queue.
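The queue handling described above can be sketched in a few lines of Python. The `LockQueue` class and the `is_operational` liveness probe are hypothetical names. The queue is an in-memory stand-in for whatever shared store holds it, and failover to a backup copy is not modelled. Repeating the liveness check until a live entry reaches the front is an assumption of this sketch.

```python
from collections import deque
from typing import Callable, Deque


class LockQueue:
    """In-memory stand-in for the per-resource queue of lock requests."""

    def __init__(self, is_operational: Callable[[str], bool]) -> None:
        self._entries: Deque[str] = deque()
        self._is_operational = is_operational  # hypothetical liveness probe

    def request(self, process_id: str) -> bool:
        """Write an entry for process_id and report whether it holds the lock."""
        if process_id not in self._entries:
            self._entries.append(process_id)
        return self.try_acquire(process_id)

    def try_acquire(self, process_id: str) -> bool:
        # While another entry is at the front, check whether its process is alive.
        while self._entries and self._entries[0] != process_id:
            front = self._entries[0]
            if self._is_operational(front):
                return False          # a live process is ahead; keep waiting
            self._entries.popleft()   # remove the entry of a dead process
        return bool(self._entries) and self._entries[0] == process_id

    def release(self, process_id: str) -> None:
        if self._entries and self._entries[0] == process_id:
            self._entries.popleft()
```

A waiting process would call `try_acquire` again later. Once the process ahead of it is found to be non-operational, its stale entry is dropped and the waiter moves to the front.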
Abstract:
A plurality of log processes, each performed independently and in parallel with the others, are synchronized into a single set of log files. A line buffering mechanism of an operating system (OS) of the computing environment forecloses interleaving of the log processes. Log management operations are concurrently performed by a single process protected by a file-system lock of the OS. The log management operations include at least one of a log compression, log retention, and log rotation operation.
Abstract:
Recovery of inflowed transactions by any instance in a cluster is provided, along with peer recovery of transactions in a cluster and administrative functionality related to these aspects. A method of managing transaction processing comprises performing transaction processing using a first process, wherein the first process logs the transaction processing that it performs, detecting failure of the first process, wherein the transaction logs of the first process are locked, taking ownership of the locked transaction logs of the first process at a second process, unlocking the locked transaction logs of the first process for use by the second process, and recovering at least one transaction using the transaction logs.
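The log takeover steps in the abstract above can be sketched roughly as follows. The `TransactionLog` structure, the event names (`"begin"`, `"commit"`, `"rolled_back"`), and the choice to roll back unfinished transactions are all assumptions made for illustration. The abstract does not say how a recovered transaction is resolved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TransactionLog:
    owner: str                       # process that currently owns the log
    locked: bool = True              # log stays locked while its owner runs or crashes
    records: List[Tuple[str, str]] = field(default_factory=list)

    def append(self, writer: str, txn_id: str, event: str) -> None:
        if writer != self.owner:
            raise PermissionError(f"{writer} does not own this log")
        self.records.append((txn_id, event))


def recover(log: TransactionLog, failed: str, recoverer: str) -> List[str]:
    """Take over the failed process's log and resolve its unfinished transactions."""
    if log.owner != failed:
        raise ValueError("log does not belong to the failed process")
    log.owner = recoverer  # take ownership of the locked log
    log.locked = False     # unlock it for use by the recovering process

    last_event: Dict[str, str] = {}
    for txn_id, event in log.records:
        last_event[txn_id] = event

    # Hypothetical policy: a transaction that began but never committed is rolled back.
    unfinished = [t for t, e in last_event.items() if e == "begin"]
    for txn_id in unfinished:
        log.append(recoverer, txn_id, "rolled_back")
    return unfinished
```

If the first process logged a begin and a commit for `t1` but only a begin for `t2` before failing, `recover` returns `["t2"]` and appends a rollback record for it.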
Abstract:
A serverless distributed file system manages the storage of files and directories using one or more directory groups. The directories may be managed using Byzantine-fault-tolerant groups, whereas files are managed without using Byzantine-fault-tolerant groups. Additionally, the file system may employ a hierarchical namespace to store files. Furthermore, the directory group may employ a plurality of locks to control access to objects (e.g., files and directories) in each directory.
Abstract:
A storage system 1 includes a first storage apparatus 100 and a second storage apparatus 100 communicatively coupled to an external apparatus 300. The first and second storage apparatuses respectively have first and second storage areas VDEVs selectively accessible from the external apparatus, first and second temporary storage areas 113, and remote copy controllers 1122 configured to control a data copy process. The storage system includes a data I/O process authority information storage unit LDK storing data I/O process authority information. Either of the remote copy controllers reads the data I/O process authority information and, according to that information, copies to the other storage apparatus the data stored either in the first storage area and the first temporary storage area, or in the second storage area and the second temporary storage area, whichever are included in the storage apparatus to which the remote copy controller belongs.
Abstract:
A technique for collecting and correlating locking data collects and correlates information on a plurality of programs waiting on or holding a plurality of resources in a multi-computer database system. The technique identifies a first program, executing on one computer of the multi-computer database system, that is waiting on a resource. The technique also identifies a second program, executing on another computer, as the ultimate holder of the resource. An operator display screen displays information corresponding to the first program and the second program. The operator display screen may be switched between a multiline display format and a single line display format. The collection, identification, and display of the locking data are performed periodically, to allow the operator to discover locking problems and take a desired corrective action.
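One way to picture the "ultimate holder" in the abstract above is to follow the chain of waits until reaching a program that holds a resource but is not itself waiting. The Python sketch below assumes each program waits on at most one resource and each resource has at most one holder. The function name and the two dictionaries are hypothetical and not taken from the source.

```python
from typing import Dict, Optional, Set


def ultimate_holder(
    program: str,
    waits_on: Dict[str, str],  # program -> resource it is waiting on
    held_by: Dict[str, str],   # resource -> program currently holding it
) -> Optional[str]:
    """Follow the wait chain from `program` to the program at its end.

    Returns None if `program` is not waiting, if a resource has no known
    holder, or if the chain loops back on itself (a deadlock).
    """
    seen: Set[str] = set()
    current = program
    while current in waits_on:
        if current in seen:
            return None  # cycle: no single ultimate holder
        seen.add(current)
        holder = held_by.get(waits_on[current])
        if holder is None:
            return None  # collected data is incomplete
        current = holder
    return current if current != program else None
```

For instance, if program `A@host1` waits on `R1`, which is held by `B@host2`, and `B@host2` in turn waits on `R2` held by `C@host3`, the function returns `C@host3` as the program an operator would need to act on.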
Abstract:
Techniques are provided for responding to the termination of a node by selecting another node, and assigning to the selected node the affinity relationships that existed between the terminated node and one or more objects. The resources that belong to the objects involved in the affinity relationships are remastered to the selected node. The selected node then performs recovery of the resources that had been opened by the terminated node and/or serves as a failover node to execute the transactions that had been executing on the terminated node.
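The reassignment step described above can be sketched in Python. The abstract does not say how the replacement node is chosen, so this sketch assumes a hypothetical policy: pick the surviving node with the fewest existing affinity relationships. The function name and the dictionary layout are likewise assumptions.

```python
from typing import Dict, List, Tuple


def fail_over(
    terminated: str,
    nodes: List[str],
    affinity: Dict[str, str],         # object -> node it has affinity with
    resource_object: Dict[str, str],  # resource -> object it belongs to
    master: Dict[str, str],           # resource -> node currently mastering it
) -> Tuple[str, List[str]]:
    """Reassign the terminated node's affinities and remaster the affected resources."""
    survivors = [n for n in nodes if n != terminated]
    if not survivors:
        raise RuntimeError("no surviving node to take over")

    # Hypothetical selection policy: the survivor with the fewest affinities.
    load = {n: 0 for n in survivors}
    for node in affinity.values():
        if node in load:
            load[node] += 1
    selected = min(survivors, key=lambda n: (load[n], n))

    # Move every affinity relationship of the terminated node to the selected node.
    moved = {obj for obj, node in affinity.items() if node == terminated}
    for obj in moved:
        affinity[obj] = selected

    # Remaster the resources that belong to the moved objects.
    remastered = sorted(r for r, obj in resource_object.items() if obj in moved)
    for r in remastered:
        master[r] = selected
    return selected, remastered
```

The returned list of remastered resources is what the selected node would then recover, or use when executing the transactions that had been running on the terminated node.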