Method and apparatus for improving software availability of a cluster computer system
    1.
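    Granted invention patent
    Method and apparatus for improving software availability of a cluster computer system (Lapsed)
    Original title: 클러스터 컴퓨터 시스템의 소프트웨어 가용도 개선 방법 및 그 장치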

    Publication (announcement) No.: KR100420266B1

    Publication (announcement) date: 2004-03-02

    Application No.: KR1020010065337

    Filing date: 2001-10-23

    Abstract: The invention relates to a method and apparatus for improving the software availability of a cluster computer system through software rejuvenation, a technique in which the administrator of a cluster made up of several servers temporarily stops a program at a suitable, anticipated point in time and then restarts it. The invention considers both software and hardware aspects: it applies software rejuvenation as a proactive fault-tolerance technique and improves availability by determining the optimal rejuvenation period from the software instability rate and the hardware failure rate of the cluster system, so that the characteristics of a high-availability computer system can be secured cost-effectively.

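The abstract above states that the optimal rejuvenation period is determined from the software instability rate and the hardware failure rate, but it does not give the model. The following is only a minimal sketch under assumed conditions, not the patented method: software aging is modelled with a Weibull survival function (`eta`, `k`), hardware failures with a constant rate (`lam_hw`), and availability is computed with the classic age-replacement (renewal-reward) formula. All parameter names and default values are hypothetical.

```python
import math


def survival(t, eta, k, lam_hw):
    """Probability that a freshly restarted server is still up at time t.

    Assumption: software aging follows a Weibull law (scale eta, shape k)
    and hardware fails at a constant rate lam_hw; both are illustrative.
    """
    return math.exp(-((t / eta) ** k) - lam_hw * t)


def availability(period, eta=500.0, k=2.0, lam_hw=1e-4,
                 d_rejuv=0.1, d_fail=4.0, steps=1000):
    """Long-run availability when rejuvenating every `period` hours.

    A cycle ends at the first failure or at the planned rejuvenation,
    whichever comes first. d_rejuv and d_fail are the assumed downtimes
    (hours) of a planned rejuvenation and of an unplanned failure.
    """
    h = period / steps
    s_end = survival(period, eta, k, lam_hw)
    # Expected uptime per cycle: trapezoidal integral of the survival function.
    uptime = 0.5 * (1.0 + s_end)
    for i in range(1, steps):
        uptime += survival(i * h, eta, k, lam_hw)
    uptime *= h
    # Expected downtime per cycle: failure cost or rejuvenation cost.
    downtime = d_fail * (1.0 - s_end) + d_rejuv * s_end
    return uptime / (uptime + downtime)


def best_period(candidates, **params):
    """Return the candidate period with the highest modelled availability."""
    return max(candidates, key=lambda p: availability(p, **params))


if __name__ == "__main__":
    period = best_period(range(10, 2001, 10))
    print(period, availability(period))
```

In this toy model, rejuvenating too often wastes time on planned restarts, while rejuvenating too rarely lets costly unplanned failures dominate, so the best period lies between the two extremes.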

Method and apparatus for improving software availability of a cluster computer system
    2.
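    Invention publication (published application)
    Method and apparatus for improving software availability of a cluster computer system (Lapsed)
    Method and system for improving software availability of a cluster computer system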

    Publication (announcement) No.: KR1020030034411A

    Publication (announcement) date: 2003-05-09

    Application No.: KR1020010065337

    Filing date: 2001-10-23

    Abstract: PURPOSE: A method and system for enhancing software availability are provided that enable an administrator to temporarily stop a program at a suitable time, e.g. when the system has few users or a system failure is anticipated, and then run it again, thereby preventing unexpected system failures. CONSTITUTION: The method comprises the following steps. Server state data is used to check whether an unstable main server exists. If one exists, a server re-execution instruction is generated and transmitted to the load distributor of the clustering module (S101). It is checked whether a spare main server, or spare capacity, is available for re-executing the unstable main server (S102). If so, the currently set mode of the spare server is checked; if it is the active/active mode, all processes of the unstable main server are duplicated on the spare main server (S103). Monitoring data is used to check continuously whether the unstable main server still requires the re-execution instruction (S104). If it does, the unstable main server is removed from the load distributor's list of available servers and the spare main server takes its place (S105). The unstable main server is then re-executed through file system cleaning, buffer cleaning, memory cleaning and a reboot, and provides its available-server registration data to the cluster controller (S106).

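Steps S101 to S106 of the abstract above can be sketched as a small state walk-through. This is only an illustrative sketch: the class names (`Server`, `LoadDistributor`, `ClusterController`), the `still_needed` callback and the in-memory lists are hypothetical stand-ins, and the behaviour when the spare is not in active/active mode is an assumption, since the abstract does not describe it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Mode(Enum):
    ACTIVE_ACTIVE = "active/active"
    ACTIVE_STANDBY = "active/standby"


@dataclass
class Server:
    name: str
    unstable: bool = False
    mode: Mode = Mode.ACTIVE_STANDBY
    processes: List[str] = field(default_factory=list)


@dataclass
class LoadDistributor:
    available: List[str]                                 # servers receiving traffic
    pending: List[str] = field(default_factory=list)     # received instructions


@dataclass
class ClusterController:
    registered: List[str] = field(default_factory=list)


def rejuvenate(main: Server, spare: Optional[Server],
               distributor: LoadDistributor, controller: ClusterController,
               still_needed: Callable[[Server], bool]) -> List[str]:
    """Walk through S101..S106 and return the labels of the steps performed."""
    done: List[str] = []
    if not main.unstable:                     # S101: look for an unstable server
        return done
    distributor.pending.append(main.name)     # send the re-execution instruction
    done.append("S101")
    if spare is None:                         # S102: is a spare server available?
        return done
    done.append("S102")
    if spare.mode is Mode.ACTIVE_ACTIVE:      # S103: duplicate the processes
        spare.processes = list(main.processes)
        done.append("S103")
    # Assumption: in any other mode the sketch skips duplication and continues.
    if not still_needed(main):                # S104: re-check monitoring data
        return done
    done.append("S104")
    distributor.available.remove(main.name)   # S105: spare replaces the main server
    if spare.name not in distributor.available:
        distributor.available.append(spare.name)
    done.append("S105")
    main.processes.clear()                    # S106: stand-in for cleaning + reboot
    main.unstable = False
    controller.registered.append(main.name)   # offer registration data
    done.append("S106")
    return done
```

A call such as `rejuvenate(main, spare, distributor, controller, still_needed=lambda s: True)` with an unstable `main` and an active/active `spare` runs all six steps; returning early at S102 or S104 models the cases where no spare exists or the re-execution is no longer required.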
