Collective communications apparatus and method for parallel systems
    1.
    Granted invention patent (status: in force)

    Publication number: US09477628B2

    Publication date: 2016-10-25

    Application number: US14040676

    Filing date: 2013-09-28

    CPC classification number: G06F13/4068 G06F9/52 G06F15/17318 G06F15/17325

    Abstract: A collective communication apparatus and method for parallel computing systems. For example, one embodiment of an apparatus comprises a plurality of processor elements (PEs); collective interconnect logic to dynamically form a virtual collective interconnect (VCI) between the PEs at runtime without global communication among all of the PEs, the VCI defining a logical topology between the PEs in which each PE is directly communicatively coupled to only a subset of the remaining PEs; and execution logic to execute collective operations across the PEs, wherein one or more of the PEs receive first results from a first portion of the subset of the remaining PEs, perform a portion of the collective operations, and provide second results to a second portion of the subset of the remaining PEs.
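    The receive, combine, and forward pattern in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions: the abstract does not say which logical topology the VCI uses, so a binary tree derived from each PE's rank is assumed here, and the names `parent_of`, `children_of`, and `tree_reduce` are hypothetical. Each PE works out its neighbours from its own rank, so no global exchange is needed to form the topology.

```python
from typing import Callable, Dict, List, Optional


def parent_of(rank: int) -> Optional[int]:
    """Parent of a PE in an implicit binary tree (assumed topology).

    Computed from the rank alone, so no communication is required.
    """
    return None if rank == 0 else (rank - 1) // 2


def children_of(rank: int, num_pes: int) -> List[int]:
    """Children of a PE in the same implicit binary tree."""
    return [c for c in (2 * rank + 1, 2 * rank + 2) if c < num_pes]


def tree_reduce(values: List[int], op: Callable[[int, int], int]) -> int:
    """Simulate a reduction in which each PE talks only to its tree neighbours.

    values[i] is the local contribution of PE i. op should be associative
    and commutative, since partial results are combined in tree order.
    """
    num_pes = len(values)
    if num_pes == 0:
        raise ValueError("at least one PE is required")

    # inbox[r] holds partial results that PE r has received from its children.
    inbox: Dict[int, List[int]] = {rank: [] for rank in range(num_pes)}
    result = values[0]

    # Children always have higher ranks than their parent, so visiting ranks
    # from high to low guarantees every child has sent before its parent runs.
    for rank in range(num_pes - 1, -1, -1):
        acc = values[rank]
        for received in inbox[rank]:       # "first results" from the children
            acc = op(acc, received)        # this PE's share of the collective
        parent = parent_of(rank)
        if parent is None:
            result = acc                   # the root holds the final value
        else:
            inbox[parent].append(acc)      # "second results" sent to the parent
    return result
```

    For eight PEs holding the values 1 to 8, `tree_reduce(list(range(1, 9)), lambda a, b: a + b)` returns 36, and no PE exchanges data with more than three others (one parent and up to two children).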

    Cache coherency and processor consistency
    3.
    Granted invention patent (status: in force)

    Publication number: US09195465B2

    Publication date: 2015-11-24

    Application number: US13729629

    Filing date: 2012-12-28

    Abstract: Responsive to execution of a computer instruction in a current translation window, state indicators associated with a cache line accessed for the execution may be modified. The state indicators may include: a first indicator to indicate whether the computer instruction is a load instruction moved from a subsequent translation window into the current translation window, a second indicator to indicate whether the cache line is modified in a cache responsive to the execution of the computer instruction, a third indicator to indicate whether the cache line is speculatively modified in the cache responsive to the execution of the computer instruction, a fourth indicator to indicate whether the cache line is speculatively loaded by the computer instruction, a fifth indicator to indicate whether a core executing the computer instruction exclusively owns the cache line, and a sixth indicator to indicate whether the cache line is invalid.
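    The six per-line indicators listed in the abstract map naturally onto a bit-flag set. The sketch below is illustrative only: the flag names, the bit positions, and the update policy in `record_access` are assumptions made for this example, since the abstract states only that the indicators may be modified in response to executing an instruction.

```python
from enum import IntFlag


class LineState(IntFlag):
    """Per-cache-line indicators; names and bit positions are hypothetical."""
    NONE = 0
    HOISTED_LOAD = 1 << 0    # load moved in from a subsequent translation window
    MODIFIED = 1 << 1        # line modified in the cache
    SPEC_MODIFIED = 1 << 2   # line speculatively modified in the cache
    SPEC_LOADED = 1 << 3     # line speculatively loaded
    EXCLUSIVE = 1 << 4       # executing core exclusively owns the line
    INVALID = 1 << 5         # line is invalid


def record_access(state: LineState, *, is_store: bool, speculative: bool,
                  hoisted_load: bool = False) -> LineState:
    """Return the updated indicators after one access (assumed policy)."""
    if is_store:
        state |= LineState.SPEC_MODIFIED if speculative else LineState.MODIFIED
    else:
        if speculative:
            state |= LineState.SPEC_LOADED
        if hoisted_load:
            state |= LineState.HOISTED_LOAD
    return state
```

    Under this assumed policy, a speculative store to an exclusively owned line yields `LineState.EXCLUSIVE | LineState.SPEC_MODIFIED`, and the earlier indicators are preserved rather than overwritten.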

    System of improved loop detection and execution
    4.
    Granted invention patent (status: in force)

    Publication number: US09459871B2

    Publication date: 2016-10-04

    Application number: US13731377

    Filing date: 2012-12-31

    CPC classification number: G06F9/30065 G06F9/325 G06F9/381 G06F9/3844

    Abstract: A method, system, and computer program product for identifying loop information corresponding to a plurality of loop instructions. The loop instructions are stored into a queue. The loop instructions are replayed from the queue for execution. Loop iteration is counted based on the identified loop information. A determination is made of whether the last iteration of the loop is done. If the last iteration is not done, then embodiments continue replaying the loop instructions, until the last iteration is done.
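    The store, replay, and count sequence in the abstract can be modelled in software as follows. This is a rough sketch, not the patented mechanism: it assumes the identified loop information reduces to a trip count known before replay begins, and `LoopInfo`, `replay_loop`, and the `execute` callback are hypothetical names introduced for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class LoopInfo:
    """Identified loop information; a fixed trip count is assumed here."""
    trip_count: int


def replay_loop(loop_instructions: Iterable[str], info: LoopInfo,
                execute: Callable[[str], None]) -> int:
    """Replay queued loop instructions until the last iteration is done.

    Returns the number of iterations that were replayed.
    """
    queue = deque(loop_instructions)           # store the loop body in a queue
    iterations_done = 0
    while iterations_done < info.trip_count:   # has the last iteration finished?
        for instruction in queue:              # replay the body from the queue
            execute(instruction)
        iterations_done += 1                   # count the completed iteration
    return iterations_done
```

    As a usage example, replaying a three-instruction body with `LoopInfo(trip_count=3)` and `execute=trace.append` leaves nine entries in `trace`, since the body is read from the queue on every iteration rather than fetched again.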
