Method and apparatus of copying data to remote memory

    Publication No.: US10395424B2

    Publication date: 2019-08-27

    Application No.: US15389204

    Filing date: 2016-12-22

    Abstract: A method and apparatus of copying data from a first memory location to a second memory location includes performing a copy operation selected out of one or more copy operations. The copy operations include performing interleaved data copying, performing a full wavefront copy operation, copying all data to a local data store (LDS) prior to copying to the second memory location, or pipelining the data for copying. The copy operation is applied to copy the data from the first location to the second memory location.
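
The selection among copy operations described in this abstract can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions: the names `interleaved_copy`, `staged_copy`, `copy_data`, and the lane count `WAVEFRONT_SIZE` are hypothetical and do not come from the patent, and plain Python lists stand in for the two memory locations and the local data store (LDS).

```python
from typing import Callable, Dict, List

# Hypothetical lane count, chosen small for illustration only.
WAVEFRONT_SIZE = 4


def interleaved_copy(src: List[int], dst: List[int]) -> None:
    """Each simulated lane copies elements lane, lane + N, lane + 2N, ..."""
    for lane in range(WAVEFRONT_SIZE):
        for i in range(lane, len(src), WAVEFRONT_SIZE):
            dst[i] = src[i]


def staged_copy(src: List[int], dst: List[int]) -> None:
    """Copy all data into a staging buffer first, then on to the destination."""
    lds = list(src)            # stand-in for the local data store (LDS)
    dst[:len(lds)] = lds       # second hop: staging buffer -> destination


# Assumed dispatch table; only two of the listed operations are sketched.
COPY_OPERATIONS: Dict[str, Callable[[List[int], List[int]], None]] = {
    "interleaved": interleaved_copy,
    "staged": staged_copy,
}


def copy_data(src: List[int], dst: List[int], operation: str) -> None:
    """Apply the selected copy operation to move src into dst."""
    COPY_OPERATIONS[operation](src, dst)
```

Both operations produce the same destination contents; in this sketch they differ only in the order and number of hops the data takes, which is the dimension a real implementation would presumably tune.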

    Primitive culling using automatically compiled compute shaders

    Publication No.: US10102662B2

    Publication date: 2018-10-16

    Application No.: US15221431

    Filing date: 2016-07-27

    Abstract: Techniques for culling primitives are provided herein. The techniques involve automatic generation of shader programs to be executed by an accelerated processing device. A just-in-time compiler automatically generates the shader programs based on a vertex shader program that is provided for use in the vertex shader stage of the graphics processing pipeline. The automatically generated shader programs include instructions from the vertex shader program that transform the positions of vertices provided as input to the graphics processing pipeline to generate transformed input vertices. The shader programs also include instructions to cull primitives based on the transformed input vertices. After generating the automatically generated shader programs, the software module transmits the automatically generated shader programs to the graphics processing pipeline for execution. After culling primitives, the automatically generated shader programs output culled primitives to the remainder of the graphics processing pipeline.
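
The two steps named in this abstract, transforming vertex positions and then culling primitives based on the transformed positions, can be sketched on the CPU as follows. This is an illustrative assumption, not the generated shader code: the function names are hypothetical, the clip-space convention (-w <= x, y, z <= w) and counter-clockwise front faces are assumed, and a single 4x4 matrix stands in for the position-related instructions of the vertex shader.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Vec4 = Tuple[float, float, float, float]
Triangle = Tuple[int, int, int]
Matrix = Sequence[Sequence[float]]


def transform_position(matrix: Matrix, position: Vec3) -> Vec4:
    """Stand-in for the position transform taken from the vertex shader."""
    v = (position[0], position[1], position[2], 1.0)
    return tuple(sum(matrix[r][c] * v[c] for c in range(4)) for r in range(4))


def outside_frustum(clip: Sequence[Vec4]) -> bool:
    """True if all three vertices lie beyond the same clip plane."""
    for axis in range(3):
        if all(p[axis] > p[3] for p in clip) or all(p[axis] < -p[3] for p in clip):
            return True
    return False


def back_facing(clip: Sequence[Vec4]) -> bool:
    """True if the projected triangle has non-positive signed area."""
    if any(p[3] <= 0.0 for p in clip):
        return False  # be conservative when a vertex is behind the viewer
    (x0, y0), (x1, y1), (x2, y2) = [(p[0] / p[3], p[1] / p[3]) for p in clip]
    signed_area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    return signed_area <= 0.0


def cull_primitives(matrix: Matrix, positions: Sequence[Vec3],
                    triangles: Sequence[Triangle]) -> List[Triangle]:
    """Return the triangles that remain after both culling tests."""
    clip_positions = [transform_position(matrix, p) for p in positions]
    kept: List[Triangle] = []
    for tri in triangles:
        clip = [clip_positions[i] for i in tri]
        if outside_frustum(clip) or back_facing(clip):
            continue
        kept.append(tri)
    return kept
```

Only the position transform is reused from the vertex shader in this sketch, so the culling pass never computes other vertex attributes such as colors or texture coordinates.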

    METHOD AND APPARATUS OF COPYING DATA TO REMOTE MEMORY

    Publication No.: US20180181306A1

    Publication date: 2018-06-28

    Application No.: US15389204

    Filing date: 2016-12-22

    CPC classification number: G06T1/60 G06F9/46 G06T15/005 G06T17/20

    Abstract: A method and apparatus of copying data from a first memory location to a second memory location includes performing a copy operation selected out of one or more copy operations. The copy operations include performing interleaved data copying, performing a full wavefront copy operation, copying all data to a local data store (LDS) prior to copying to the second memory location, or pipelining the data for copying. The copy operation is applied to copy the data from the first location to the second memory location.

    Run-time memory access uniformity checking

    Publication No.: US10346055B2

    Publication date: 2019-07-09

    Application No.: US15663103

    Filing date: 2017-07-28

    Inventor: Guohua Jin

    Abstract: Systems, apparatuses, and methods for performing run-time checking of access uniformity of vector memory access instructions are disclosed. A system includes a vector unit, a scalar unit, and a memory. The system performs a run-time check to determine if two or more threads of a wave have access uniformity to the memory prior to executing a vector memory access instruction for the wave on the vector unit. The system replaces the vector memory access instruction with a group of instructions responsive to determining that two or more threads of the wave have access uniformity to the memory. The group of instructions includes a scalar access instruction to memory followed by a cross-thread data sharing instruction. The scalar access instruction is executed on the scalar unit. Alternatively, the group of instructions can include a vector memory access instruction by only a single thread in each group having access uniformity.
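
The run-time check and the two replacement strategies in this abstract can be mimicked in a short simulation. This is a hedged sketch, not the patented hardware or compiler logic: `load_wave_scalar` and `load_wave_grouped` are hypothetical names, a Python list stands in for memory, and the returned load count is only a rough proxy for memory traffic.

```python
from typing import Dict, List, Sequence, Tuple


def load_wave_scalar(memory: Sequence[int],
                     addresses: Sequence[int]) -> Tuple[List[int], int]:
    """If every thread reads the same address, do one load and broadcast it.

    Returns (per-thread values, number of memory loads issued).
    """
    if len(set(addresses)) == 1:
        value = memory[addresses[0]]           # single scalar access
        return [value] * len(addresses), 1     # cross-thread data sharing
    # No uniformity: fall back to one load per thread (the vector access).
    return [memory[a] for a in addresses], len(addresses)


def load_wave_grouped(memory: Sequence[int],
                      addresses: Sequence[int]) -> Tuple[List[int], int]:
    """Alternative: one thread per group of equal addresses performs the load."""
    loaded: Dict[int, int] = {}
    for address in addresses:
        if address not in loaded:              # first thread of the group loads
            loaded[address] = memory[address]
    return [loaded[a] for a in addresses], len(loaded)
```

For a wave whose threads all read address 2, the scalar path issues one load instead of one per thread; for a partially uniform wave, the grouped variant issues one load per distinct address.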

    METHOD AND APPARATUS OF CROSS SHADER COMPILATION

    Publication No.: US20190164337A1

    Publication date: 2019-05-30

    Application No.: US15827909

    Filing date: 2017-11-30

    Abstract: A method and apparatus provides for compiling a plurality of shaders, each shader having a plurality of computer-readable statements, into a plurality of computer-executable instructions. In one example, the method and apparatus, using a computing device, receives the plurality of shaders used in a process pipeline for performing at least one shading function, determines a shader type of each of the plurality of shaders based on the at least one shading function, and compiles the plurality of shaders by generating the computer-executable instructions using data including a shader descriptor for each of the plurality of shaders, resulting in the shading functions of the plurality of shaders combined together.
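
The flow in this abstract (receive the shaders, determine each shader's type, build a descriptor per shader, then compile them together) can be outlined with a toy model. Everything here is an assumption made for illustration: the `Shader` and `ShaderDescriptor` classes, the `PIPELINE_ORDER` list, and the string-tagging "code generation" are hypothetical stand-ins, and the abstract does not specify what a shader descriptor contains.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical stage ordering used to combine shaders in pipeline order.
PIPELINE_ORDER = ["vertex", "geometry", "pixel"]


@dataclass
class Shader:
    function: str            # shading function the shader performs
    statements: List[str]    # computer-readable statements


@dataclass
class ShaderDescriptor:
    shader_type: str         # type determined from the shading function
    statement_count: int     # assumed descriptor field, for illustration


def describe(shader: Shader) -> ShaderDescriptor:
    """Determine the shader type and build its descriptor."""
    if shader.function not in PIPELINE_ORDER:
        raise ValueError(f"unknown shading function: {shader.function}")
    return ShaderDescriptor(shader.function, len(shader.statements))


def cross_compile(shaders: List[Shader]) -> List[str]:
    """Combine all shaders into one instruction list, ordered by stage."""
    described = [(describe(s), s) for s in shaders]
    described.sort(key=lambda pair: PIPELINE_ORDER.index(pair[0].shader_type))
    instructions: List[str] = []
    for descriptor, shader in described:
        for statement in shader.statements:
            # Toy "code generation": tag each statement with its stage.
            instructions.append(f"{descriptor.shader_type}: {statement}")
    return instructions
```

The point of the sketch is only that compilation sees all shaders and their descriptors at once, rather than compiling each shader in isolation.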

    Method and apparatus of cross shader compilation

    Publication No.: US11080927B2

    Publication date: 2021-08-03

    Application No.: US15827909

    Filing date: 2017-11-30

    Abstract: A method and apparatus provides for compiling a plurality of shaders, each shader having a plurality of computer-readable statements, into a plurality of computer-executable instructions. In one example, the method and apparatus, using a computing device, receives the plurality of shaders used in a process pipeline for performing at least one shading function, determines a shader type of each of the plurality of shaders based on the at least one shading function, and compiles the plurality of shaders by generating the computer-executable instructions using data including a shader descriptor for each of the plurality of shaders, resulting in the shading functions of the plurality of shaders combined together.

    RUN-TIME MEMORY ACCESS UNIFORMITY CHECKING

    Publication No.: US20190034093A1

    Publication date: 2019-01-31

    Application No.: US15663103

    Filing date: 2017-07-28

    Inventor: Guohua Jin

    CPC classification number: G06F3/0611 G06F9/5094 G06F13/1631 G06F13/1663

    Abstract: Systems, apparatuses, and methods for performing run-time checking of access uniformity of vector memory access instructions are disclosed. A system includes a vector unit, a scalar unit, and a memory. The system performs a run-time check to determine if two or more threads of a wave have access uniformity to the memory prior to executing a vector memory access instruction for the wave on the vector unit. The system replaces the vector memory access instruction with a group of instructions responsive to determining that two or more threads of the wave have access uniformity to the memory. The group of instructions includes a scalar access instruction to memory followed by a cross-thread data sharing instruction. The scalar access instruction is executed on the scalar unit. Alternatively, the group of instructions can include a vector memory access instruction by only a single thread in each group having access uniformity.

    PRIMITIVE CULLING USING AUTOMATICALLY COMPILED COMPUTE SHADERS

    Publication No.: US20180033184A1

    Publication date: 2018-02-01

    Application No.: US15221431

    Filing date: 2016-07-27

    CPC classification number: G06T15/005 G06T15/40 G06T15/80

    Abstract: Techniques for culling primitives are provided herein. The techniques involve automatic generation of shader programs to be executed by an accelerated processing device. A just-in-time compiler automatically generates the shader programs based on a vertex shader program that is provided for use in the vertex shader stage of the graphics processing pipeline. The automatically generated shader programs include instructions from the vertex shader program that transform the positions of vertices provided as input to the graphics processing pipeline to generate transformed input vertices. The shader programs also include instructions to cull primitives based on the transformed input vertices. After generating the automatically generated shader programs, the software module transmits the automatically generated shader programs to the graphics processing pipeline for execution. After culling primitives, the automatically generated shader programs output culled primitives to the remainder of the graphics processing pipeline.
