Limited range vector memory access instructions, processors, methods, and systems
    1.
    Type: Invention patent
    Legal status: Granted

    Publication No.: JP2014182807A

    Publication date: 2014-09-29

    Application No.: JP2014042958

    Filing date: 2014-03-05

    Abstract: PROBLEM TO BE SOLVED: To access memory locations in only a limited range of a memory in response to a vector memory access instruction. SOLUTION: A processor 100 includes a plurality of packed data registers 107 and execution logic 109 coupled with the packed data registers. The execution logic is operable in response to a limited range vector memory access instruction 103 that indicates source packed memory indices comprising a plurality of packed memory indices selected from 8-bit memory indices and 16-bit memory indices. In response to the instruction, the execution logic accesses memory locations in only a limited range of the memory.
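
    The effect of narrow indices can be illustrated with a small software model of a gather. This is only a sketch under stated assumptions, not the patented hardware: the function name `limited_range_gather`, the `base` parameter, the unsigned interpretation of the indices, and the scale of one element per index are all hypothetical choices made for illustration.

```python
def limited_range_gather(memory, base, indices, index_bits=8):
    """Gather one element per index; every index must fit in index_bits bits.

    Assumption: indices are unsigned and scaled by one element, so the
    reachable window is memory[base : base + 2**index_bits].
    """
    if index_bits not in (8, 16):
        raise ValueError("index width must be 8 or 16 bits")
    limit = 1 << index_bits
    result = []
    for idx in indices:
        if not 0 <= idx < limit:
            raise ValueError(f"index {idx} does not fit in {index_bits} bits")
        result.append(memory[base + idx])
    return result


memory = list(range(1000, 2000))
gathered = limited_range_gather(memory, base=100, indices=[0, 3, 255, 17])
```

    With 8-bit indices, no access in this model can land outside a 256-element window starting at `base`, which is the "limited range" property the abstract describes.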

    Apparatus and method of efficient vector roll operation
    2.
    Type: Invention patent
    Legal status: Granted

    Publication No.: JP2014130573A

    Publication date: 2014-07-10

    Application No.: JP2013238546

    Filing date: 2013-11-19

    CPC classification number: G06F9/30036 G06F9/30032

    Abstract: PROBLEM TO BE SOLVED: To provide efficient vector roll operation. SOLUTION: A resultant rolled version of an input vector is created by: forming a first intermediate vector by barrel-rolling elements of the input vector along a first of two lanes defined by an upper half and a lower half of the input vector; forming a second intermediate vector by barrel-rolling elements of the input vector along a second of the two lanes; and forming the resultant rolled version of the input vector by incorporating upper portions of one of the intermediate vector's upper and lower halves as upper portions of the resultant's upper and lower halves and incorporating lower portions of the other intermediate vector's upper and lower halves as lower portions of the resultant's upper and lower halves.
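
    One way to read the three steps is sketched below. It is an interpretation, not the claimed implementation: the names `rotate_lane` and `roll_two_lanes` are hypothetical, "lower" is taken to mean lower element index, and the second intermediate vector is assumed to be built from the two halves in swapped order so that elements can cross the lane boundary.

```python
def rotate_lane(lane, r):
    """Barrel-roll one lane by r positions toward higher indices."""
    r %= len(lane)
    return lane[-r:] + lane[:-r] if r else list(lane)


def roll_two_lanes(vec, r):
    """Roll the whole vector by r positions using only in-lane rolls.

    Assumption: 0 <= r <= lane length, and the vector has an even length.
    """
    half = len(vec) // 2
    if len(vec) % 2 or not 0 <= r <= half:
        raise ValueError("need an even-length vector and 0 <= r <= half")
    low, high = vec[:half], vec[half:]
    # First intermediate: each half rolled in place.
    first = rotate_lane(low, r) + rotate_lane(high, r)
    # Second intermediate: halves swapped (assumed), then rolled in place.
    second = rotate_lane(high, r) + rotate_lane(low, r)
    out = []
    for lane_start in (0, half):
        out += second[lane_start:lane_start + r]        # lower portion
        out += first[lane_start + r:lane_start + half]  # upper portion
    return out


rolled = roll_two_lanes(list(range(8)), 3)
```

    In this model, each half of the result takes its lower `r` elements from one intermediate vector and its remaining upper elements from the other, matching the final step of the abstract.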

    Systems, apparatuses, and methods for determining trailing least significant masking bit of writemask register
    3.
    Type: Invention patent
    Legal status: Granted

    Publication No.: JP2014182796A

    Publication date: 2014-09-29

    Application No.: JP2014028431

    Filing date: 2014-02-18

    CPC classification number: G06F9/30152 G06F9/30018 G06F9/30036

    Abstract: PROBLEM TO BE SOLVED: To provide common operation means which in general makes it possible to adjust mask bits within writemask registers that correspond to elements in a vector register referred to in a SIMD operation instruction. SOLUTION: The execution of a KZBTZ instruction detects a trailing least significant zero bit position in a first input mask and sets an output mask to have the values of the first input mask, but with all bit positions closer to the most significant bit position than the trailing least significant zero bit position in the first input mask set to zero. In some embodiments, a second input mask is used as a writemask such that bit positions of the first input mask are not considered in the trailing least significant zero bit position calculation depending upon a corresponding bit position in the second input mask.
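
    The basic mask operation can be modelled on plain integers. The following is a minimal sketch, assuming a mask width of 16 bits by default and omitting the optional second input mask; the function name `kzbtz` simply mirrors the instruction mnemonic and is not an existing API.

```python
def kzbtz(mask, width=16):
    """Keep the bits of mask below its least significant zero bit.

    Bits at positions above that zero bit are cleared; if the mask
    has no zero bit within `width`, it is returned unchanged.
    """
    mask &= (1 << width) - 1
    for pos in range(width):
        if not (mask >> pos) & 1:
            # pos is the least significant zero bit position.
            return mask & ((1 << pos) - 1)
    return mask


result = kzbtz(0b11010111, width=8)
```

    Here the least significant zero bit of `0b11010111` is at position 3, so only the three low set bits survive in the output mask.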

    Systems, apparatuses, and methods for reducing number of short integer multiplications
    6.
    Type: Invention patent
    Legal status: Granted

    Publication No.: JP2014182811A

    Publication date: 2014-09-29

    Application No.: JP2014043808

    Filing date: 2014-03-06

    CPC classification number: G06F9/30145 G06F9/3001 G06F9/30036 G06F9/3895

    Abstract: PROBLEM TO BE SOLVED: To provide: embodiments of an instruction generically called square-multiply (SQRMUL) instruction; and systems, architectures and instruction formats for use to improve latency. SOLUTION: Two source registers 101 and 103 hold values A and B, respectively. These values are processed by execution logic 107 to produce A^2, A*B, and B^2. These results are stored in a destination register 105. This register may be a general-purpose register such as a doubleword sized register, or a packed-data register with data element positions dedicated to storing calculated values.
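
    Taking the name square-multiply at face value, the three results are read here as A squared, A times B, and B squared. The sketch below models that reading on Python integers; the function name `sqrmul`, the order of the results, and the truncation of each result to `element_bits` bits are assumptions made for illustration.

```python
def sqrmul(a, b, element_bits=32):
    """Return (a*a, a*b, b*b), each truncated to element_bits bits.

    Assumption: results wrap modulo 2**element_bits, as they would in
    fixed-width destination element positions.
    """
    mask = (1 << element_bits) - 1
    return ((a * a) & mask, (a * b) & mask, (b * b) & mask)


squares_and_product = sqrmul(3, 5)
```

    A single call yields all three products, which is the kind of grouping that lets one instruction stand in for several short integer multiplications.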

    Systems, apparatuses, and methods for zeroing of bits in data element
    7.
    Type: Invention patent
    Legal status: Granted

    Publication No.: JP2014182800A

    Publication date: 2014-09-29

    Application No.: JP2014032531

    Filing date: 2014-02-24

    Abstract: PROBLEM TO BE SOLVED: To provide systems, methods and apparatuses for execution of an instruction that uses a control vector to zero out bits starting at a specific position in each data element of a source in a SIMD processing system. SOLUTION: The execution of a VPBZHI instruction causes, on a per data element basis of a second source, a zeroing of bits higher (more significant) than a starting point in the data element. The starting point is defined by the contents of a data element in a first source. The resultant data elements are stored in a corresponding data element position of a destination.
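
    The per-element behaviour can be sketched on lists of integers. This is an illustrative model only: the function name `vpbzhi` mirrors the mnemonic, and it assumes that the bit at the starting index itself is cleared along with all higher bits, and that a starting index at or beyond the element width leaves the element unchanged.

```python
def vpbzhi(starts, values, element_bits=32):
    """Zero the bits at index starts[i] and above in values[i].

    starts models the first source (control vector), values the second.
    """
    if len(starts) != len(values):
        raise ValueError("sources must have the same number of elements")
    full = (1 << element_bits) - 1
    out = []
    for start, value in zip(starts, values):
        # Assumed boundary handling: out-of-range start keeps every bit.
        keep = full if start >= element_bits else (1 << start) - 1
        out.append(value & keep & full)
    return out


zeroed = vpbzhi([4, 0, 8], [0xFF, 0xFF, 0x1234], element_bits=16)
```

    Each destination element depends only on the matching pair of source elements, so the operation maps directly onto SIMD lanes.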

    Instructions and logic to vectorize conditional loops
    8.
    Type: Invention patent
    Legal status: Granted

    Publication No.: JP2014130580A

    Publication date: 2014-07-10

    Application No.: JP2013254939

    Filing date: 2013-12-10

    Abstract: PROBLEM TO BE SOLVED: To provide instructions and logic that provide vectorization of conditional loops. SOLUTION: A vector expand instruction has a parameter to specify a source vector, a parameter to specify a conditions mask register, and a destination parameter to specify a destination register to hold n consecutive vector elements. Each of the plurality of n consecutive vector elements has an equal variable partition size of m bytes. In response to the processor instruction, data is copied from consecutive vector elements in the source vector, and copied to unmasked vector elements of the specified destination vector, where n varies according to the processor instruction executed.
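
    The expand step can be modelled on Python lists. This is a minimal sketch, assuming that a mask value of 1 marks an unmasked (written) element and that masked destination elements keep their previous contents; the function name `vector_expand` is hypothetical.

```python
def vector_expand(source, mask, dest):
    """Copy consecutive source elements into the unmasked slots of dest.

    Assumption: masked slots (mask value 0) are left unchanged.
    """
    if len(mask) != len(dest):
        raise ValueError("mask and destination must have the same length")
    out = list(dest)
    src_pos = 0
    for i, enabled in enumerate(mask):
        if enabled:
            out[i] = source[src_pos]  # next consecutive source element
            src_pos += 1
    return out


expanded = vector_expand([10, 20, 30, 40], [1, 0, 1, 1], [0, 0, 0, 0])
```

    In a vectorized conditional loop, the mask would come from evaluating the loop condition across lanes, so only the iterations that take the branch consume source elements.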
