PERFORMING POWER MANAGEMENT IN A MULTICORE PROCESSOR
    5.
    Invention application
    PERFORMING POWER MANAGEMENT IN A MULTICORE PROCESSOR [under examination, published]

    Publication number: US20160239074A1

    Publication date: 2016-08-18

    Application number: US14621709

    Filing date: 2015-02-13

    Abstract: In an embodiment, a processor includes: a plurality of first cores to independently execute instructions, each of the plurality of first cores including a plurality of counters to store performance information; at least one second core to perform memory operations; and a power controller to receive performance information from at least some of the plurality of counters, determine a workload type executed on the processor based at least in part on the performance information, and based on the workload type dynamically migrate one or more threads from one or more of the plurality of first cores to the at least one second core for execution during a next operation interval. Other embodiments are described and claimed.
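
    The control loop in this abstract (collect counters, classify the workload, pick threads to migrate for the next interval) can be illustrated with a minimal sketch. Everything named here is hypothetical: the counter fields, the 0.5 stall threshold, and the two workload labels are assumptions for illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class CoreCounters:
    """Hypothetical per-core performance counters for one operation interval."""
    core_id: int
    memory_stall_cycles: int
    total_cycles: int


def classify_workload(counters, stall_threshold=0.5):
    """Label the interval by the aggregate share of cycles stalled on memory."""
    total = sum(c.total_cycles for c in counters)
    stalled = sum(c.memory_stall_cycles for c in counters)
    if total == 0:
        return "compute_bound"
    return "memory_bound" if stalled / total > stall_threshold else "compute_bound"


def plan_migrations(counters, workload_type, stall_threshold=0.5):
    """Return ids of first cores whose threads would move to the second core."""
    if workload_type != "memory_bound":
        return []
    return [
        c.core_id
        for c in counters
        if c.total_cycles and c.memory_stall_cycles / c.total_cycles > stall_threshold
    ]
```

    In this sketch the decision is taken once per interval and applied to the next one, mirroring the "next operation interval" wording of the abstract; a real power controller would act in hardware or firmware rather than in software like this.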


    Speculative non-faulting loads and gathers
    6.
    Granted invention patent
    Speculative non-faulting loads and gathers [in force]

    Publication number: US09189236B2

    Publication date: 2015-11-17

    Application number: US13725907

    Filing date: 2012-12-21

    Abstract: According to one embodiment, a processor includes an instruction decoder to decode an instruction to read a plurality of data elements from memory, the instruction having a first operand specifying a storage location, a second operand specifying a bitmask having one or more bits, each bit corresponding to one of the data elements, and a third operand specifying a memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the instruction, to read one or more data elements speculatively, based on the bitmask specified by the second operand, from a memory location based on the memory address indicated by the third operand, and to store the one or more data elements in the storage location indicated by the first operand.
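
    A software model may help make the three operands concrete. The sketch below is an illustration only: memory is a Python list, the per-lane `indices` argument is an assumed way of expressing a gather, and the fault handling (clearing the lane's mask bit instead of raising) is one plausible reading of "non-faulting" that the abstract itself does not spell out.

```python
def nonfaulting_gather(dest, mask, memory, base, indices):
    """Masked gather that suppresses faults.

    dest    -- current contents of the destination (first operand)
    mask    -- one bit per data element (second operand)
    base    -- memory address (third operand); indices are per-lane offsets (assumed)
    Returns (new_dest, new_mask); lanes that would fault keep their old value
    and have their mask bit cleared.
    """
    out = list(dest)
    out_mask = list(mask)
    for lane, (active, idx) in enumerate(zip(mask, indices)):
        if not active:
            continue  # masked-off lanes are never read
        addr = base + idx
        if 0 <= addr < len(memory):
            out[lane] = memory[addr]
        else:
            out_mask[lane] = 0  # would-be fault: report via the mask, do not raise
    return out, out_mask
```

    A caller can compare the returned mask with the one it passed in to see which lanes were actually loaded, then retry or handle the remaining lanes non-speculatively.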


    APPARATUS AND METHOD FOR DETECTING IDENTICAL ELEMENTS WITHIN A VECTOR REGISTER
    7.
    Invention application
    APPARATUS AND METHOD FOR DETECTING IDENTICAL ELEMENTS WITHIN A VECTOR REGISTER [under examination, published]

    Publication number: US20140089634A1

    Publication date: 2014-03-27

    Application number: US13995490

    Filing date: 2011-12-23

    Abstract: An apparatus, system and method are described for identifying identical elements in a vector register. For example, a computer implemented method according to one embodiment comprises the operations of: reading each active element from a first vector register, each active element having a defined bit position within the first vector register; reading each element from a second vector register, each element having a defined bit position within the second vector register corresponding to a bit position of a current active element in the first vector register; reading an input mask register, the input mask register identifying active bit positions in the second vector register for which comparisons are to be made with values in the first vector register, the comparison operations comprising: comparing each active element in the second vector register with elements in the first vector register having bit positions preceding the bit position of the current active element in the second vector register; and setting a bit position in an output mask register equal to a true value if all of the preceding bit positions in the first vector register are equal to the bit in the current active bit position in the second vector register.
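
    The comparison described above can be modeled lane by lane. This is a sketch of one literal reading of the abstract, not the claimed hardware: vectors are Python lists, the function name is hypothetical, and a lane with no preceding elements is assumed to produce a false bit. The `combine` parameter defaults to `all`, matching the "if all of the preceding ... are equal" wording; passing `any` gives a conflict-detection style variant, which is an assumption added for comparison.

```python
def identical_mask(v1, v2, in_mask, combine=all):
    """Compare each active element of v2 with the preceding elements of v1.

    Returns an output mask (list of bool) with one entry per lane.
    """
    out = [False] * len(v2)
    for i, active in enumerate(in_mask):
        if not active or i == 0:
            # Inactive lane, or no preceding elements to compare (assumed False).
            continue
        out[i] = combine(v1[j] == v2[i] for j in range(i))
    return out
```

    For example, with `v1 = [7, 7, 3, 7]` and `v2 = [7, 7, 7, 3]`, lane 3 is false under `all` (not every preceding element of `v1` equals 3) but true under `any` (element 2 of `v1` equals 3).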


    System and method for memory bandwidth friendly sorting on multi-core architectures
    8.
    Granted invention patent
    System and method for memory bandwidth friendly sorting on multi-core architectures [in force]

    Publication number: US08463820B2

    Publication date: 2013-06-11

    Application number: US12454883

    Filing date: 2009-05-26

    CPC classification number: G06F7/36 G06F12/0802

    Abstract: In some embodiments, the invention involves utilizing a tree merge sort in a platform to minimize cache reads/writes when sorting large amounts of data. An embodiment uses blocks of pre-sorted data residing in “leaf nodes” residing in memory storage. A pre-sorted block of data from each leaf node is read from memory and stored in faster cache memory. A tree merge sort is performed on the nodes that are cache resident until a block of data migrates to a root node. Sorted blocks reaching the root node are written to memory storage in an output list until all pre-sorted data blocks have been moved to cache and merged upward to the root. The completed output list in memory storage is a list of the fully sorted data. Other embodiments are described and claimed.
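
    The tree structure of the merge can be sketched in a few lines. This is a simplified illustration under stated assumptions: each leaf is an already-sorted Python list, internal nodes are lazy two-way merges built with the standard-library `heapq.merge`, and the function name `tree_merge` is hypothetical. It does not model block-sized transfers or cache residency, which are the point of the patented method.

```python
import heapq


def tree_merge(leaves):
    """Merge pre-sorted leaf lists through a binary tree of two-way merges."""
    nodes = [iter(leaf) for leaf in leaves]
    if not nodes:
        return []
    # Pair up nodes level by level until a single root remains.
    while len(nodes) > 1:
        next_level = []
        for i in range(0, len(nodes) - 1, 2):
            next_level.append(heapq.merge(nodes[i], nodes[i + 1]))
        if len(nodes) % 2:
            next_level.append(nodes[-1])  # odd node moves up unchanged
        nodes = next_level
    # Draining the root pulls elements upward from the leaves on demand.
    return list(nodes[0])
```

    Because every internal node is a lazy iterator, elements move toward the root only as the output list consumes them, which loosely mirrors data migrating upward through the tree to the root node.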

