METHOD AND APPARATUS FOR TOLERANCE OF LOST TIMER TICKS DURING RECOVERY OF A MULTI-PROCESSOR SYSTEM
    1.
    Invention application
    Under examination, published

    Publication (announcement) number: WO9834456A2

    Publication (announcement) date: 1998-08-13

    Application number: PCT/US9801484

    Application date: 1998-01-27

    CPC classification number: H04L69/28 G06F11/1425 H04L69/40

    Abstract: A method and apparatus for detecting and tolerating situations in which one or more processors in a multi-processor system cannot participate in timer-driven or timer-triggered protocols or event sequences. The multi-processor system includes multiple processors each having a respective memory. These processors are coupled by an inter-processor communication network (preferably consisting of redundant paths). Processors are suspected of having failed (ceased operations) outright or having a failed timer mechanism when other processors detect the absence of periodic "IamAlive" messages from other processors. When this happens, all of the processors in the system are subjected to a series of stages in which they repeatedly broadcast their status and their connectivity to each other. During the first such stage, according to the present invention, a processor will not assert its ability to participate unless its timer mechanism is working. It arms a timer expiration event and does not assert its health until and unless that timer expiration event occurs.
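The first-stage rule in this abstract (arm a timer expiration event, and assert health only once it fires) can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: the `Processor` class, the tick-based timer model, and the `first_stage_participants` helper are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """Hypothetical model of one processor in the first regroup stage."""
    name: str
    timer_works: bool            # assumption: False models lost timer ticks
    ticks_until_expiry: int = 1  # assumption: expiry after one delivered tick
    timer_armed: bool = False
    timer_expired: bool = False

    def enter_first_stage(self) -> None:
        # Arm a timer expiration event on entering the first stage.
        self.timer_armed = True
        self.timer_expired = False

    def tick(self) -> None:
        # A tick is only observed if the timer is armed and actually working.
        if not (self.timer_armed and self.timer_works):
            return
        self.ticks_until_expiry -= 1
        if self.ticks_until_expiry <= 0:
            self.timer_expired = True

    def asserts_health(self) -> bool:
        # Health is asserted only after the armed timer has expired.
        return self.timer_armed and self.timer_expired


def first_stage_participants(processors, ticks=3):
    """Run the first stage for a few ticks and list who asserts health."""
    for p in processors:
        p.enter_first_stage()
    for _ in range(ticks):
        for p in processors:
            p.tick()
    return [p.name for p in processors if p.asserts_health()]
```

In this sketch a processor whose timer mechanism has failed never sees its expiration event, so it never claims it can participate, which is the behavior the abstract describes.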


    METHOD AND APPARATUS FOR TOLERANCE OF LOST TIMER TICKS DURING RECOVERY OF A MULTI-PROCESSOR SYSTEM

    Publication (announcement) number: CA2275242A1

    Publication (announcement) date: 1998-08-13

    Application number: CA2275242

    Application date: 1998-01-27

    Abstract: A method and apparatus for detecting and tolerating situations in which one or more processors (112a, b, ..., n) in a multi-processor system cannot participate in timer-driven or timer-triggered protocols or event sequences. The multi-processor system includes multiple processors each having a respective memory (118a, b, ..., n). These processors are coupled by an interprocessor communication network (114) (preferably consisting of redundant paths). Processors are suspected of having failed (ceased operations) outright or having a failed timer mechanism when other processors detect the absence of periodic "IamAlive" messages from other processors. When this happens, all of the processors in the system are subjected to a series of stages in which they repeatedly broadcast their status and their connectivity to each other. During the first such stage, according to the present invention, a processor will not assert its ability to participate unless its timer mechanism is working. It arms a timer expiration event and does not assert its health until and unless that timer expiration event occurs.

    DISTRIBUTED AGREEMENT ON PROCESSOR MEMBERSHIP IN A MULTI-PROCESSOR SYSTEM

    Publication (announcement) number: CA2275241A1

    Publication (announcement) date: 1998-07-30

    Application number: CA2275241

    Application date: 1998-01-23

    Abstract: A system to determine the group of processors that will survive communications faults and/or timed-event failures in a multi-processor system (100). The processors (112), each having a memory (118) and connected to an interprocessor communication network (114), detect that the set of processors with which they can communicate has changed. They then choose to halt or continue operations based on minimizing the likelihood that disconnected groups of processors will continue to operate as independent systems on the initiation of a regroup operation (622b). A processor is suspected of having failed when other processors detect the absence of a periodic message from the processor (682). When this happens, all of the processors are subjected to a series of stages in which they repeatedly broadcast their status and connectivity to each other (830). The suspected processor does not advance through the stages to regroup if it has ceased operations or if its timer mechanism has failed.
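The suspicion step in this abstract (a processor is suspected when its periodic message stops arriving) might be sketched as follows. The function name, the timestamp bookkeeping, and the `max_silence` threshold are assumptions made for illustration; the abstract does not specify how the absence of a message is measured.

```python
def suspected_processors(last_heard, now, max_silence):
    """Return the processors whose periodic message is overdue.

    last_heard:  dict mapping processor name to the time its last
                 periodic message was received (hypothetical bookkeeping)
    now:         current time, in the same units
    max_silence: assumed threshold after which a processor is suspected
    """
    return sorted(
        name for name, heard_at in last_heard.items()
        if now - heard_at > max_silence
    )
```

For example, with `last_heard = {"A": 10.0, "B": 4.0, "C": 9.5}`, `now = 10.0`, and `max_silence = 2.0`, only `"B"` is overdue, and a regroup operation would then be initiated for all processors.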

    DIRECT BULK DATA TRANSFERS
    6.
    Invention patent

    Publication (announcement) number: CA2190209A1

    Publication (announcement) date: 1997-05-14

    Application number: CA2190209

    Application date: 1996-11-13

    Abstract: A data processing system for transferring data is provided. This system includes central processing units (CPUs 20, 22, 24 and 26) and storage units (30 and 32 with 100-105 and 110-115) which are interconnected by a network (10). The CPUs (20, 22, 24 and 26) include a request process (133) and a storage process (130). The storage process (130) controls access to the storage unit (30 with 100-105 and 110-115). Software routines (220) are used to provide direct access to the storage unit (30 with 100-105 and 110-115) by the request CPU (22). The request CPU (22) is the CPU containing the request process (133). A virtual memory address for a buffer (160) of the request CPU (22) is created in the request CPU (22). The virtual memory address along with a storage unit access request are sent to the CPU (20) containing the storage process (130). A work request including the virtual memory address is sent from the storage process (130) to the storage unit (30 with 100-105 and 110-115). The data is then transferred directly between the request CPU (22) and the storage unit (30 with 100-105 and 110-115). The storage unit (30 with 100-105 and 110-115) then responds to the work request.
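The message flow in this abstract can be traced with a toy model in which the storage process only forwards a work request and the payload moves straight between the storage unit and the requester's buffer. This is a rough sketch under stated assumptions: the class names, the block-id lookup, the starting address, and the `"done"` response are all hypothetical and are not taken from the patent.

```python
class RequestCPU:
    """Hypothetical requester that owns buffers keyed by virtual address."""

    def __init__(self):
        self.buffers = {}         # virtual address -> bytearray
        self._next_addr = 0x1000  # arbitrary illustrative starting address

    def create_buffer(self, size):
        # Create a buffer and return the virtual address that names it.
        addr = self._next_addr
        self._next_addr += size
        self.buffers[addr] = bytearray(size)
        return addr


class StorageUnit:
    """Hypothetical storage unit holding blocks of bytes."""

    def __init__(self, blocks):
        self.blocks = blocks      # block id -> bytes

    def handle_work_request(self, request_cpu, addr, block_id):
        data = self.blocks[block_id]
        buf = request_cpu.buffers[addr]
        if len(data) > len(buf):
            raise ValueError("buffer too small for requested block")
        # The data goes directly into the request CPU's buffer.
        buf[:len(data)] = data
        return "done"             # response to the work request


class StorageProcess:
    """Hypothetical storage process: controls access, never holds the data."""

    def __init__(self, unit):
        self.unit = unit

    def access(self, request_cpu, addr, block_id):
        # Forward a work request carrying the requester's virtual address.
        return self.unit.handle_work_request(request_cpu, addr, block_id)
```

The point of the design, as the abstract presents it, is that the CPU running the storage process handles only the request and the address, so the bulk data does not pass through it.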
