Abstract:
Provided herein are a method and apparatus for host-transparent storage controller failover and failback. A controller is capable of assuming the identity of a failed controller while continuing to respond to its own SCSI ID or IDs, so that all SCSI IDs and associated units (LUNs) of the failed controller are effectively taken over by the surviving controller. This "failover" behavior is transparent to any attached host computers, which treat it as a powerfail condition. The symmetric operation of returning the targets (IDs) and units (LUNs) to the previously failing controller ("failback") is likewise transparent.
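The takeover and return of SCSI IDs described in the abstract above can be illustrated with a minimal sketch. The `Controller` class, its attribute names, and the ID values are hypothetical and are not taken from the patent; the sketch models only which IDs a controller answers to, not the SCSI bus or the powerfail signalling seen by hosts.

```python
class Controller:
    """Hypothetical model of a storage controller that owns SCSI target IDs."""

    def __init__(self, name, own_ids):
        self.name = name
        self.own_ids = set(own_ids)   # IDs this controller normally serves
        self.adopted_ids = set()      # IDs taken over from a failed partner

    def responds_to(self, scsi_id):
        # A controller answers for its own IDs and for any adopted IDs.
        return scsi_id in self.own_ids or scsi_id in self.adopted_ids

    def failover(self, failed):
        # Assume the failed partner's identity while keeping our own IDs.
        self.adopted_ids |= failed.own_ids

    def failback(self, restored):
        # Hand the partner's IDs back once it is healthy again.
        self.adopted_ids -= restored.own_ids


a = Controller("A", {0, 1})
b = Controller("B", {2, 3})

a.failover(b)
served_during_failover = sorted(i for i in range(4) if a.responds_to(i))

a.failback(b)
served_after_failback = sorted(i for i in range(4) if a.responds_to(i))
```

Keeping the adopted IDs in a separate set from the controller's own IDs makes failback a simple set subtraction and ensures the survivor never gives up its own targets.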
Abstract:
An apparatus includes a first bus, a second bus, and a storage module having a first output and a second output, with the first output connected to the first bus and the second output connected to the second bus. The apparatus also includes a first buffer storage connected to the first bus and a second buffer storage connected to the second bus. The second buffer storage includes an error correction module. First and second network adapters are connected to the first and second buses, respectively. The first network adapter also includes a connection to the first buffer storage. A processor in the apparatus includes first processor circuitry for transferring data using a first path through the first output of the storage module to the first buffer storage and from the first buffer storage to the first network adapter. The processor also includes second processor circuitry for transferring data using a second path through the second output to the second buffer storage, through the error correction module, and from the second buffer storage to the second network adapter, wherein the second processor circuitry is responsive to an error in the storage module.
Abstract:
A computer system in a fault-tolerant configuration employs multiple identical CPUs executing the same instruction stream, with multiple, identical memory modules in the address space of the CPUs storing duplicates of the same data. The multiple CPUs are loosely synchronized, as by detecting events such as memory references and stalling any CPU ahead of others until all execute the function simultaneously; interrupts can be synchronized by ensuring that all CPUs implement the interrupt at the same point in their instruction stream. I/O devices are accessed through a pair of identical (redundant) I/O processors, but only one is designated to actively control a given device; in case of failure of one I/O processor, however, an I/O device can be accessed by the other one without system shutdown, i.e., by merely redesignating the addresses of the registers of the I/O device under instruction control.
Abstract:
A real time fault tolerant transaction processing system, particularly one suited for use in a service control point (SCP), is described. Specifically, the system utilizes a communication protocol, such as signalling system 7, that adaptively distributes message packets on an equal basis over multiple physical links that connect two points, such as an SCP and a signalling transfer point (STP), and non-fault tolerant front end and back end processors that are connected to each physical link for processing packets appearing on that link and providing corresponding responses thereto. All the front and back end processors are loosely coupled together for purposes of processor synchronization and re-assignment. Through this system, all the physical links simultaneously carry an equal number of packets which are, in turn, processed by all the processors connected thereto. In the event that any physical link, or a front or back end processor connected thereto, fails, that link is declared to be out of service. Consequently, the protocol merely re-assigns all subsequently occurring packets to the other links until such time as the fault is cleared. As a result of link re-assignment, there is advantageously no need to connect a fault tolerant processor to each physical link. This, in turn, substantially and advantageously reduces the complexity and cost of the fault tolerant transaction processing system.
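The link re-assignment behavior in the abstract above can be sketched as a distributor that spreads packets evenly over the links currently in service. This is an illustrative stand-in only: it uses simple round-robin rotation, not the load-sharing mechanism of signalling system 7, and the `LinkSet` class and link names are hypothetical.

```python
class LinkSet:
    """Hypothetical round-robin distributor over in-service links."""

    def __init__(self, links):
        self.links = list(links)
        self.in_service = set(self.links)
        self._counter = 0

    def mark_failed(self, link):
        # Declare the link out of service; later packets skip it.
        self.in_service.discard(link)

    def mark_restored(self, link):
        # Return the link to service once the fault is cleared.
        if link in self.links:
            self.in_service.add(link)

    def assign(self):
        """Pick the link for the next packet, rotating over active links."""
        active = [l for l in self.links if l in self.in_service]
        if not active:
            raise RuntimeError("no link in service")
        link = active[self._counter % len(active)]
        self._counter += 1
        return link


links = LinkSet(["link0", "link1", "link2"])
before = [links.assign() for _ in range(6)]   # all three links share the load
links.mark_failed("link1")
after = [links.assign() for _ in range(4)]    # only the surviving links are used
```

Because a failed link is simply dropped from the rotation, the processors attached to each link need no fault tolerance of their own, which is the cost saving the abstract points to.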
Abstract:
Binary logic is added to the binary logic normally utilized for the purpose of generating and decoding binary code combinations which reflect the order of use of a number of units, utilized in sequence, to thereby indicate the unit least recently used (LRU). Disclosed is the utilization of six binary bits which are updated in accordance with a sequence of use of four units to thereby indicate the least recently used one of the four units. In accordance with known LRU techniques, there are 24 valid binary bit combinations that reflect the sequence of use of the four units. The 6 binary bits of the LRU code are capable of assuming 64 different combinations; therefore, 40 combinations of binary bits are considered invalid when utilizing the LRU code. The present invention utilizes certain of the invalid binary bit combinations to identify units that have been removed from further use because of a fault condition, while the code continues to identify the sequence of use of those units which have not been eliminated from further use. The code chosen to identify a faulty unit and the sequence of use of the remaining units is fault tolerant in that additional errors in the coding mechanism can be tolerated, and ignored, while maintaining the ability to identify faulty units and the sequence of use of the remaining units.
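The counts in the abstract above (24 valid and 40 invalid combinations out of 64) can be checked with a short sketch. It assumes the common pairwise encoding, in which each of the six bits records which unit of one pair was used more recently; the abstract does not spell out its encoding, so this choice and the function names are assumptions. The sketch does not model the use of invalid codes to mark faulty units.

```python
from itertools import combinations, permutations, product

UNITS = range(4)
PAIRS = list(combinations(UNITS, 2))  # 6 unordered pairs, one bit per pair


def encode(order):
    """Encode a use order (most to least recently used) as 6 bits.

    The bit for pair (i, j) is 1 when unit i was used more recently than j.
    """
    rank = {unit: pos for pos, unit in enumerate(order)}
    return tuple(1 if rank[i] < rank[j] else 0 for i, j in PAIRS)


def lru_unit(code):
    """Return the least recently used unit for a valid code."""
    for u in UNITS:
        # u is LRU when every other unit was used more recently than u.
        older_than_all = all(
            code[k] == (1 if j == u else 0)
            for k, (i, j) in enumerate(PAIRS)
            if u in (i, j)
        )
        if older_than_all:
            return u
    raise ValueError("code does not describe a consistent use order")


valid = {encode(p) for p in permutations(UNITS)}
invalid = set(product((0, 1), repeat=len(PAIRS))) - valid
```

Each valid code corresponds to exactly one ordering of the four units, which is why 4! = 24 codes are valid and the remaining 64 - 24 = 40 describe no consistent order and are free to carry other information.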
Abstract:
A computer controlled data communications system is provided for transmitting data between a plurality of external devices. The system comprises up to eight general purpose digital computers, each with an associated disc-file storage system. The computers and the disc-file storage units are organized such that communication between the computers is made via the disc-file storage associated with each computer, not directly between the computers themselves. Each group of external devices is coupled to a primary and a secondary computer such that upon the failure of the primary computer, the secondary computer will process the data for its primary group of devices as well as for those external devices for which it is the secondary computer. Each disc-file storage unit is the primary storage for one computer and the copy storage for one additional computer. Upon the failure of a primary disc-file storage unit, a computer can operate with the copy storage unit.
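The primary and secondary pairing in the abstract above can be sketched as a lookup that picks which computer serves a device group. The ring layout used here (the secondary of group g is computer g + 1, wrapping around) is an assumption made for illustration; the abstract states only that each group has one primary and one secondary computer. The function name and parameters are hypothetical.

```python
def serving_computer(group, failed, n_computers=8):
    """Return the computer that serves a device group, or None if both are down.

    Assumed layout: group g has primary g and secondary (g + 1) % n_computers.
    `failed` is the set of computers currently out of service.
    """
    primary = group % n_computers
    secondary = (group + 1) % n_computers
    if primary not in failed:
        return primary
    if secondary not in failed:
        # The secondary keeps its own primary group and adds this one.
        return secondary
    return None


normal = serving_computer(3, failed=set())
after_failure = serving_computer(3, failed={3})
wrap_around = serving_computer(7, failed={7})
```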
Abstract:
A control device according to an embodiment includes a first region including a CPU and a sequence signal generation circuit; and a second region including an abnormality detection circuit, a sequence signal detection circuit, a first register storing multiple methods of recovery from occurrence of abnormality, and a second register for use to select one method of recovery from the multiple methods of recovery, the second region being higher in reliability than the first region. The sequence signal generation circuit converts a first signal that specifies the one method of recovery into a sequence signal containing a second signal, which is a digital signal of a predetermined pattern, and the sequence signal detection circuit changes a set value of the second register upon receiving the second signal.
Abstract:
Upon acquiring target business specifying information for specifying a target business, disaster recovery (DR) operation phase determination processing of calculating an operation phase is executed based on copy configuration information for managing a pair configuration of a target business use volume and a copy volume and a copy status table for managing a copy status in the target business use volume and the copy volume, a disaster pattern corresponding to a disaster situation of a volume of a disaster target having been damaged is calculated in accordance with an operation phase calculated by the DR operation phase determination unit, and a cloud use fee is calculated for each disaster pattern from a failure occurrence to completion of system recovery of a use site where a use volume is created.
Abstract:
The disclosed techniques reduce a responsiveness time for a secondary node state of a database in switching from a second computing node to replace a first computing node acting in a primary node state, with both computing nodes performing the same database queries. The second node receives information regarding queries performed by the first node while in the primary state. In some embodiments, the second node retrieves, from a transaction log, log records detailing operations performed for database transactions. In some embodiments, the second node inserts, based on the log records, data records of the transactions into an in-memory cache of the second node that stores chains of database records from different transactions. Upon receiving sufficient information to switch to the primary state, the second node changes a mode of operation during failover making a committed transaction available for reads by subsequent database queries prior to record reordering.