Systems and methods of fault management in electronic communications

    Publication No.: US09619347B2

    Publication date: 2017-04-11

    Application No.: US14629881

    Filing date: 2015-02-24

    Abstract: An apparatus includes a physical-layer device that distributes data to first lanes and transfers data to and from an external device over second lanes, each of which comprises a number of the first lanes; and a transfer circuit that transfers data output by a central processing unit performing arithmetic processing to the physical-layer device, and transfers data received from the physical-layer device for receipt by the central processing unit. The transfer circuit comprises an information-acquisition unit that receives, from the physical-layer device, one of detection information for the first lanes, which indicates that the physical-layer device has received data from the external device, and error information for the first lanes, which indicates that data transferred to the physical-layer device from the external device has an error; and a selection unit configured to specify the second lane to be degenerated based on one of the error information and the detection information.
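
    The selection step in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each second lane is a contiguous group of first lanes and that per-lane error information arrives as a list of booleans; the function name and the grouping rule are hypothetical and are not taken from the patent.

```python
def second_lanes_to_degenerate(first_lane_errors, first_lanes_per_second_lane):
    """Return indices of second lanes containing at least one erroring first lane.

    Assumption (not from the source): first lane i belongs to second lane
    i // first_lanes_per_second_lane, i.e. lanes are grouped contiguously.
    """
    if first_lanes_per_second_lane <= 0:
        raise ValueError("first_lanes_per_second_lane must be positive")
    return sorted({
        index // first_lanes_per_second_lane
        for index, has_error in enumerate(first_lane_errors)
        if has_error
    })


# Eight first lanes grouped four per second lane; first lane 5 reports an error.
errors = [False] * 8
errors[5] = True
degenerated = second_lanes_to_degenerate(errors, first_lanes_per_second_lane=4)
```

    Here `degenerated` names second lane 1, the group that contains first lane 5. The same shape would apply if detection information were used in place of error information.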

    Data Recovery In A Distributed Storage System
    244.
    Patent application

    Publication No.: US20170097875A1

    Publication date: 2017-04-06

    Application No.: US14876063

    Filing date: 2015-10-06

    Applicant: NetApp, Inc.

    CPC classification number: G06F11/2069 G06F11/1088 G06F2201/805 G06F2201/85

    Abstract: A system, method, and machine-readable storage medium for recovering data in a distributed storage system are provided. In some embodiments, the method includes identifying a failing storage device of a first storage node having an inaccessible data segment. When it is determined that the inaccessible data segment cannot be recovered using a first data protection scheme, a first chunk of data associated with the inaccessible data segment is identified and a group associated with the first chunk of data is identified. A second chunk of data associated with the group is selectively retrieved from a second storage node such that data associated with an accessible data segment of the first storage node is not retrieved. The inaccessible data segment is recovered by recovering the first chunk of data using a second data protection scheme and the second chunk of data.

    MEMORY SYSTEM HAVING IDLE-MEMORY DEVICES AND METHOD OF OPERATING THEREOF
    247.
    Patent application (status: under examination, published)

    Publication No.: US20170068604A1

    Publication date: 2017-03-09

    Application No.: US15048514

    Filing date: 2016-02-19

    Applicant: SK hynix Inc.

    CPC classification number: G06F11/141 G06F11/1446 G06F2201/805 G06F2201/85

    Abstract: The memory system may include a memory device including a plurality of sub-memory devices coupled to a channel; and a controller suitable for controlling the memory device to store first data into a selected sub-memory device and at least one idle sub-memory device among the sub-memory devices during a first program operation to the selected sub-memory device with the first data; and to perform a second program operation to the selected sub-memory device with the first data stored in the idle sub-memory device when the first program operation to the selected sub-memory device fails.
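
    The program-and-retry flow described in this abstract can be sketched as a small simulation. The `SubMemory` class, its `fail_first_program` flag, and `program_with_idle_backup` are hypothetical names introduced for illustration only; a real controller would issue program commands over the channel rather than call Python methods.

```python
class SubMemory:
    """Toy model of one sub-memory device (hypothetical, for illustration)."""

    def __init__(self, name, fail_first_program=False):
        self.name = name
        self.data = None
        self._fail_next = fail_first_program

    def program(self, data):
        """Try to store data; return True on success, False on a program failure."""
        if self._fail_next:
            self._fail_next = False  # simulate a one-time program failure
            return False
        self.data = data
        return True


def program_with_idle_backup(selected, idle, data):
    """Store data in the idle device alongside the first program operation,
    then retry the selected device from that copy if the first attempt fails."""
    idle.program(data)                  # copy kept in the idle sub-memory device
    if selected.program(data):          # first program operation
        return True
    return selected.program(idle.data)  # second program operation from the copy


selected = SubMemory("die0", fail_first_program=True)
idle = SubMemory("die1")
succeeded = program_with_idle_backup(selected, idle, b"page-data")
```

    In this run the first program operation fails, and the second one succeeds using the copy held in the idle device.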


    Dynamic sizing of storage capacity for a remirror buffer
    248.
    Granted patent (status: in force)

    Publication No.: US09582377B1

    Publication date: 2017-02-28

    Application No.: US14663005

    Filing date: 2015-03-19

    CPC classification number: G06F11/201 G06F9/5011 G06F11/00 G06F2201/85

    Abstract: A remirror buffer can be used in failover situations so as to back up storage volumes in a service provider. The remirror buffer is dynamically resized to meet current usage metrics captured from a data center. A risk boundary can be defined through which resource hosts are grouped together so as to determine the usage metrics. The risk boundary can be based on a topology of the data center, such as a room/rack/sharing of power supplies, or other characteristics of the resource hosts.


    SYSTEM AND METHOD FOR SUPPORTING TRANSACTION AFFINITY BASED REQUEST HANDLING IN A MIDDLEWARE ENVIRONMENT
    249.
    Patent application (status: under examination, published)

    Publication No.: US20170052855A1

    Publication date: 2017-02-23

    Application No.: US15346135

    Filing date: 2016-11-08

    Abstract: A system and method can support transaction processing in a middleware environment. A processor, such as a remote method invocation stub in the middleware environment, can be associated with a transaction, wherein the transaction is from a first cluster. Then, the processor can handle a transactional request that is associated with the transaction, wherein the transactional request is to be sent to the first cluster. Furthermore, the processor can route the transactional request to a cluster member in the first cluster that is an existing participant of the transaction.
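
    The routing rule in this abstract, preferring a cluster member that already participates in the transaction, can be sketched as follows. The function name, the use of plain strings for members, and the fallback to the first listed member when no participant exists are assumptions made for illustration; the abstract does not say what happens in that case.

```python
def route_transactional_request(cluster_members, transaction_participants):
    """Pick the cluster member that should receive a transactional request.

    Prefers a member that is already a participant of the transaction.
    Assumption (not from the source): if no member participates yet,
    fall back to the first member in the list.
    """
    if not cluster_members:
        raise ValueError("cluster has no members")
    for member in cluster_members:
        if member in transaction_participants:
            return member
    return cluster_members[0]


cluster = ["server-1", "server-2", "server-3"]
with_affinity = route_transactional_request(cluster, {"server-3"})
without_affinity = route_transactional_request(cluster, set())
```

    With `server-3` already enlisted in the transaction, the request goes to `server-3`; with no participants, the sketch falls back to `server-1`.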


    PASSIVE DETECTION OF LIVE SYSTEMS DURING CONTROLLER FAILOVER IN DISTRIBUTED ENVIRONMENTS
    250.
    Patent application (status: under examination, published)

    Publication No.: US20170046237A1

    Publication date: 2017-02-16

    Application No.: US14823883

    Filing date: 2015-08-11

    CPC classification number: G06F11/2007 G06F11/00 G06F2201/85

    Abstract: For passive detection of live systems during controller failover in a distributed environment, a set of member systems is sorted according to heartbeat periods used by members in the set of member systems. An amount of elapsed time since a failure of a first controller system in the distributed environment is determined. From the sorted set, a first member system is selected due to a first heartbeat period of the first member system being a shortest heartbeat period in all heartbeat periods in the sorted set of member systems. Using a processor and a memory at a second controller system, a timeout period is computed. The timeout period is an amount of time remaining in the first heartbeat period after the amount of elapsed time. The first member system is removed from the sorted set after the timeout period expires and the first member system has not sent a heartbeat.
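
    The timeout computation in this abstract can be worked through in a short sketch. The `Member` type and the function names are hypothetical, and clamping the timeout at zero when the elapsed time already exceeds the heartbeat period is an added assumption rather than something the abstract states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    heartbeat_period: float  # seconds between expected heartbeats


def sort_by_heartbeat(members):
    """Sort member systems so the shortest heartbeat period comes first."""
    return sorted(members, key=lambda m: m.heartbeat_period)


def timeout_for(member, elapsed_since_failure):
    """Time remaining in the member's heartbeat period after the elapsed time.

    Assumption (not from the source): never return a negative timeout.
    """
    return max(0.0, member.heartbeat_period - elapsed_since_failure)


def remove_if_silent(sorted_members, heartbeat_received):
    """Drop the first member once its timeout expired without a heartbeat."""
    return list(sorted_members) if heartbeat_received else list(sorted_members[1:])


members = sort_by_heartbeat(
    [Member("a", 30.0), Member("b", 10.0), Member("c", 20.0)]
)
timeout = timeout_for(members[0], elapsed_since_failure=4.0)
remaining = remove_if_silent(members, heartbeat_received=False)
```

    Member `b` has the shortest period (10 s), so with 4 s elapsed since the controller failure the second controller waits 6 s; if no heartbeat arrives in that window, `b` is removed and `c` becomes the next member to check.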

