USE OF A SINGLE INSTRUCTION SET ARCHITECTURE (ISA) INSTRUCTION FOR VECTOR NORMALIZATION

    Publication No.: US20210149635A1

    Publication date: 2021-05-20

    Application No.: US16685561

    Filing date: 2019-11-15

    Abstract: Embodiments described herein are generally directed to an improved vector normalization instruction. An embodiment of a method includes, responsive to receipt by a GPU of a single instruction specifying a vector normalization operation to be performed on V vectors: (i) generating V squared length values, N at a time, by a first processing unit, by, for each N sets of inputs, each representing multiple component vectors for N of the vectors, performing N parallel dot product operations on the N sets of inputs; and (ii) generating V sets of outputs representing multiple normalized component vectors of the V vectors, N at a time, by a second processing unit, by, for each N squared length values of the V squared length values, performing N parallel operations on the N squared length values, wherein each of the N parallel operations implements a combination of a reciprocal square root function and a vector scaling function.
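
The two-stage flow in the abstract can be illustrated in software. The following is a minimal sketch only, not the patented hardware path: the function name `normalize_vectors`, the batch parameter `n`, and the use of plain Python lists are assumptions made for illustration, and the input vectors are assumed to be non-zero.

```python
import math

def normalize_vectors(vectors, n=4):
    """Normalize vectors in batches of n (hypothetical software analogue)."""
    normalized = []
    for start in range(0, len(vectors), n):
        batch = vectors[start:start + n]
        # Stage 1: squared length of each vector, i.e. its dot product with itself.
        squared_lengths = [sum(c * c for c in v) for v in batch]
        # Stage 2: reciprocal square root combined with per-component scaling.
        for v, sq in zip(batch, squared_lengths):
            inv_len = 1.0 / math.sqrt(sq)  # assumes sq > 0 (non-zero vector)
            normalized.append([c * inv_len for c in v])
    return normalized
```

For instance, `normalize_vectors([[3.0, 4.0, 0.0]])` yields a vector close to `[0.6, 0.8, 0.0]`, up to floating-point rounding. In the abstract, each stage runs on a separate processing unit and handles N vectors in parallel; the loop over batches here only mirrors that grouping.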

    BANKED MEMORY ACCESS EFFICIENCY BY A GRAPHICS PROCESSOR
    Type: Invention application
    Status: Granted

    Publication No.: US20150294435A1

    Publication date: 2015-10-15

    Application No.: US14249154

    Filing date: 2014-04-09

    Abstract: Conversion of an array of structures (AOS) to a structure of arrays (SOA) improves the efficiency of transfer from the AOS to the SOA. A similar technique can be used to convert efficiently from an SOA to an AOS. The controller performing the conversion computes a partition size as the highest common factor of the structure size of the structures in the AOS and the number of banks in a first memory device, and transfers data based on the partition size rather than on the structure size. The controller can read a partition-size number of elements from each of multiple different structures to ensure that the full data transfer bandwidth is used for each transfer.
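
The partition-size calculation can be sketched in a few lines. This is an illustrative model under stated assumptions, not the controller described in the application: the function name `aos_to_soa`, the flat-list layout of the AOS, and the choice to group `num_banks // partition` structures into each transfer are all hypothetical.

```python
from math import gcd

def aos_to_soa(aos, struct_size, num_banks):
    """Convert a flat AOS list to SOA, moving data in partition-sized chunks."""
    if len(aos) % struct_size != 0:
        raise ValueError("AOS length must be a multiple of struct_size")
    # Partition size: highest common factor of structure size and bank count.
    partition = gcd(struct_size, num_banks)
    num_structs = len(aos) // struct_size
    # Assumption: each transfer touches enough structures to fill every bank.
    structs_per_transfer = num_banks // partition
    soa = [[None] * num_structs for _ in range(struct_size)]
    transfers = 0
    for part_start in range(0, struct_size, partition):
        for first in range(0, num_structs, structs_per_transfer):
            last = min(first + structs_per_transfer, num_structs)
            for s in range(first, last):
                base = s * struct_size + part_start
                # Read `partition` consecutive elements from structure s.
                for f in range(partition):
                    soa[part_start + f][s] = aos[base + f]
            transfers += 1
    return soa, partition, transfers
```

With a structure size of 6 and 4 banks, the partition size is 2, so each modeled transfer reads 2 elements from each of 2 structures instead of reading one 6-element structure at a time.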

