INFRASTRUCTURE INDEPENDENT SELF-CONFIGURING MANAGEMENT NETWORK

    Publication No.: US20250130961A1

    Publication Date: 2025-04-24

    Application No.: US18399493

    Filing Date: 2023-12-28

    Inventor: Charles Horvath

    Abstract: A first input/output (IO) module and a first computer system including a first processor, a first memory, a first operating system, a first baseboard management controller (BMC), a first network controller, and a management network including a set of network endpoints, wherein each network endpoint has a first address, wherein each network endpoint is an internal device connected to the management network, and wherein the first operating system is interfaced with the first BMC through a first interface supported by the first BMC.

    Real-time fault-tolerant checkpointing

    Publication No.: US11288143B2

    Publication Date: 2022-03-29

    Application No.: US17003808

    Filing Date: 2020-08-26

    Abstract: In part, the disclosure relates to a real-time fault tolerant system. The system may include a first computing device, a second computing device, and a hardware interconnect. The first computing device may include one or more memory devices, one or more processors, a first network interface operable to receive device data and transmit output data over a time-slot-based bus, wherein the output data is generated from processing device data, and a first real-time checkpoint engine. The second computing device may include similar components or the same components as the first computing device. The hardware interconnect is operable to permit data exchange between the first computing device and the second computing device. Checkpoints may be generated by checkpoint engines during lower-priority communication time slots allocated on the time-slot-based bus to avoid interfering with any real-time communications to or from the first and second computing devices.
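The slot-selection idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Slot` type, the `"real_time"` and `"low"` priority labels, and the `checkpoint_slots` helper are hypothetical names introduced here only to show checkpoints being confined to lower-priority slots of a bus cycle.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    index: int
    priority: str  # hypothetical labels: "real_time" or "low"


def checkpoint_slots(schedule: list[Slot]) -> list[int]:
    """Return the slot indices in which a checkpoint may be generated."""
    # Only lower-priority slots are eligible, so real-time traffic is untouched.
    return [s.index for s in schedule if s.priority == "low"]


# One assumed bus cycle with alternating real-time and low-priority slots.
cycle = [Slot(0, "real_time"), Slot(1, "low"), Slot(2, "real_time"), Slot(3, "low")]
print(checkpoint_slots(cycle))  # [1, 3]
```

How slots are actually allocated on the bus is not described in the abstract, so the alternating cycle above is only an assumed example.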

    COST REDUCED HIGH RELIABILITY FAULT TOLERANT COMPUTER ARCHITECTURE

    Publication No.: US20250130721A1

    Publication Date: 2025-04-24

    Application No.: US18399469

    Filing Date: 2023-12-28

    Abstract: In part, in one aspect, the disclosure relates to a first computer system including a first processor and first memory, a first IO storage subsystem including a first switch configured for one or more first storage devices, a first IO non-storage subsystem including a first switch configured for one or more first non-storage devices, a second computer system including a second processor and second memory, a second IO storage subsystem including a second switch configured for one or more second storage devices, a second IO non-storage subsystem including a second switch configured for one or more second non-storage devices, and a midplane including a power connector, a processor side and an IO side, wherein the processor side includes connectors in electrical communication with the computer systems, and the IO side includes connectors in electrical communication with the storage and non-storage subsystems.

    FAULT TOLERANT SYSTEMS AND METHODS USING SHARED MEMORY CONFIGURATIONS

    Publication No.: US20240176739A1

    Publication Date: 2024-05-30

    Application No.: US18072297

    Filing Date: 2022-11-30

    CPC classification number: G06F12/0815 G06F12/0891 G06F2212/1032

    Abstract: In part, the disclosure relates to a fault tolerant system. The system may include one or more shared memory complexes, each memory complex comprising a group of M computer-readable memory storage devices; one or more cache coherent switches comprising two or more host ports and one or more downstream device ports, the cache coherent switch in electrical communication with the one or more shared memory storage devices; a first management processor in electrical communication with the cache coherent switch; a first compute node comprising a first processor and a first cache, the first compute node in electrical communication with the one or more cache coherent switches and the one or more shared memory complexes; and a second compute node comprising a second processor and a second cache, the second compute node in electrical communication with the one or more cache coherent switches and the one or more shared memory complexes.

    High reliability fault tolerant computer architecture

    Publication No.: US11586514B2

    Publication Date: 2023-02-21

    Application No.: US16536745

    Filing Date: 2019-08-09

    Abstract: A fault tolerant computer system and method are disclosed. The system may include a plurality of CPU nodes, each including: a processor and a memory; at least two IO domains, wherein at least one of the IO domains is designated an active IO domain performing communication functions for the active CPU nodes; and a switching fabric connecting each CPU node to each IO domain. One CPU node is designated a standby CPU node and the remainder are designated as active CPU nodes. If a failure, a beginning of a failure, or a predicted failure occurs in an active node, the state and memory of the active CPU node are transferred to the standby CPU node which becomes the new active CPU node. If a failure occurs in an active IO domain, the communication functions performed by the failing active IO domain are transferred to the other IO domain.
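The active-to-standby transfer described in this abstract can be sketched roughly as follows. This is an illustrative model only, not the patented mechanism: `CpuNode`, `fail_over`, the `"failed"` role, and the dictionary standing in for CPU state and memory are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CpuNode:
    name: str
    role: str  # "active" or "standby"; "failed" is an assumed extra label
    state: dict = field(default_factory=dict)  # stand-in for CPU state and memory


def fail_over(nodes: list[CpuNode], failing: CpuNode) -> CpuNode:
    """Copy the failing active node's state to the standby node and promote it."""
    if failing.role != "active":
        raise ValueError("only an active node can fail over")
    standby = next(n for n in nodes if n.role == "standby")
    standby.state = dict(failing.state)  # transfer state and memory
    standby.role = "active"              # standby becomes the new active node
    failing.role = "failed"
    return standby


nodes = [CpuNode("cpu0", "active", {"pc": 4096}), CpuNode("cpu1", "standby")]
new_active = fail_over(nodes, nodes[0])
print(new_active.name, new_active.role, new_active.state)
```

A real system would move state over the switching fabric while the workload keeps running; the plain dictionary copy here only shows the ordering of the steps.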

    System and Methods of Managing and Recognizing PCI Devices in an Active System

    Publication No.: US20250130969A1

    Publication Date: 2025-04-24

    Application No.: US18399481

    Filing Date: 2023-12-28

    Inventor: Lei Cao

    Abstract: In part, in one aspect, the disclosure relates to a method of enumerating a device relative to a computer system. The method may include sending, using a platform driver of an operating system (OS), a PCI memory range to self-enumeration (SE) firmware once it detects a new PCIe module; establishing a communication channel between the platform driver and SE firmware; detecting, using the platform driver, the new PCIe module via the PCI hotplug capability of the OS; and configuring a communication device, using the platform driver, on the PCIe module to establish a communication channel to the SE firmware. In some embodiments, the communication device is a synthetic device on the PCIe switch that allows low bandwidth bi-directional communication.

    SYSTEMS AND METHODS TO PRECONFIGURE A HARDWARE MODULE

    Publication No.: US20250130967A1

    Publication Date: 2025-04-24

    Application No.: US18399475

    Filing Date: 2023-12-28

    Inventor: Steven Haid

    Abstract: In part, in one aspect, the disclosure relates to a method of enabling hot plugging operations in a computer system. The method may include initializing a platform driver, wherein the platform driver is installed on a computer system running an operating system; disabling, using the platform driver, an interrupt function and a hardware detection notification function of the computer system; hot plugging a PCIe node in the computer system; detecting, using the platform driver, the hot plugging of the PCIe node in the computer system; configuring the PCIe node, using the platform driver, wherein the configured PCIe node comprises a PCI hierarchy, the PCI hierarchy comprising an allocation of PCI bridges and devices of the configured PCIe node; and bringing the devices of the hot plugged configured PCIe node into service for use by the computer system.

    METHOD OF USING A MINIMALLY MODIFIED COMPUTER AS LIVE MIGRATION RECIPIENT

    Publication No.: US20250130906A1

    Publication Date: 2025-04-24

    Application No.: US18399453

    Filing Date: 2023-12-28

    Inventor: Derek Shute

    Abstract: In part, in one aspect, the disclosure relates to a method of preparing a standby compute node for a migration event in a fault tolerant system. The method may include receiving a migration instruction at the standby compute node of the fault tolerant system, wherein the standby compute node includes a processor, a memory and an operating system; initiating a system service on the standby compute node, wherein the system service is configured to start a shutdown or a reboot process of the standby compute node; and quiescing one or more devices in communication with the standby compute node.

    SOFTWARE BASED VALIDATION OF MEMORY FOR FAULT TOLERANT SYSTEMS

    Publication No.: US20250130905A1

    Publication Date: 2025-04-24

    Application No.: US18399497

    Filing Date: 2023-12-28

    Inventor: John MacLeod

    Abstract: In part, the disclosure relates to a computer system. The system includes a network device, a storage device and at least two compute nodes, wherein one compute node is designated an active node and the other compute node is designated a standby node. Each compute node may include a dedicated memory with an isolated utility executive, an operating system memory, and a firmware reserved memory. The active node operating system memory further includes an availability driver, wherein the availability driver of the active node disables or suspends all processes that alter operating system memory and transfers all operating system memory of the active node to the operating system memory of the standby node. The isolated utility executive of the active node executes code to generate an active validation array set of all operating system memory, and the active validation array is then transferred to the standby node.
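One way to picture a validation array is as a list of per-page digests that the two nodes can compare. The sketch below assumes this interpretation; the abstract does not say how the array is computed or what the standby node does with it. The 4096-byte page size, the use of SHA-256, and the helper names `validation_array` and `mismatched_pages` are all hypothetical.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page granularity, not stated in the abstract


def validation_array(memory: bytes, page_size: int = PAGE_SIZE) -> list[bytes]:
    """Return one SHA-256 digest per page of the given memory image."""
    return [
        hashlib.sha256(memory[off:off + page_size]).digest()
        for off in range(0, len(memory), page_size)
    ]


def mismatched_pages(active: list[bytes], standby: list[bytes]) -> list[int]:
    """Indices of pages whose digests differ between the two arrays."""
    if len(active) != len(standby):
        raise ValueError("validation arrays cover different memory sizes")
    return [i for i, (a, s) in enumerate(zip(active, standby)) if a != s]


active_mem = bytes(range(256)) * 48   # 12288 bytes, i.e. 3 pages
standby_mem = bytearray(active_mem)
standby_mem[5000] ^= 0xFF             # corrupt one byte inside page 1

bad = mismatched_pages(validation_array(active_mem),
                       validation_array(bytes(standby_mem)))
print(bad)  # [1]
```

Comparing digests instead of raw memory keeps the data sent to the standby node small, which is the presumed motivation for an array-based check.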
