Patent search ap:("INTEL CORPORATION") AND inv:"Supratim Pal" Page 7

61.

Granted patent
Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format (legal status: active)

Publication No.: US11361496B2

Publication date: 2022-06-14

Application No.: US17304092

Application date: 2021-06-14

Applicant: Intel Corporation

Inventor: Subramaniam Maiyuran, Shubra Marwaha, Ashutosh Garg, Supratim Pal, Jorge Parra, Chandra Gurram, Varghese George, Darin Starkey, Guei-Yuan Lueh

IPC: G06T15/06, G06F9/30, G06F9/38, G06F17/18

Abstract: Described herein is a graphics processing unit (GPU) comprising a single instruction, multiple thread (SIMT) multiprocessor comprising an instruction cache, a shared memory coupled with the instruction cache, and circuitry coupled with the shared memory and the instruction cache. The circuitry includes multiple texture units, a first core including hardware to accelerate matrix operations, and a second core configured to receive an instruction having multiple operands in a bfloat16 (BF16) number format and to process the instruction. The multiple operands include a first source operand, a second source operand, and a third source operand, and the BF16 number format is a sixteen-bit floating point format having an eight-bit exponent. To process the instruction, the second core multiplies the second source operand by the third source operand and adds the first source operand to the result of the multiply.
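
Illustrative sketch (not taken from the patent): the operation in the abstract is a multiply-add on BF16 operands, result = src0 + src1 * src2. The Python below emulates BF16 as the upper 16 bits of an IEEE 754 float32, which gives the eight-bit exponent the abstract mentions. The function names, the round-to-nearest-even conversion, and the use of a Python float as the accumulator are assumptions made for illustration; the abstract does not specify rounding or accumulation precision. Inputs are assumed finite and within float32 range.

```python
import struct

def to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern for x (assumed: round-to-nearest-even)."""
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias so that exact ties round toward an even low bit of the kept half.
    rounding = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding) >> 16) & 0xFFFF

def from_bf16_bits(bits16: int) -> float:
    """Expand a BF16 bit pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]

def bf16_mad(src0: float, src1: float, src2: float) -> float:
    """Hypothetical helper: src0 + src1 * src2 with all operands rounded to BF16."""
    a, b, c = (from_bf16_bits(to_bf16_bits(v)) for v in (src0, src1, src2))
    return a + b * c  # accumulation precision here is an assumption
```

For example, `bf16_mad(1.0, 2.0, 3.0)` multiplies the second and third operands and adds the first, in the operand order the abstract describes.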

62.

Patent application
CONTROL FLOW MECHANISM FOR EXECUTION OF GRAPHICS PROCESSOR INSTRUCTIONS USING ACTIVE CHANNEL PACKING (legal status: active)

Publication No.: US20210286626A1

Publication date: 2021-09-16

Application No.: US17213453

Application date: 2021-03-26

Applicant: Intel Corporation

Inventor: Subramaniam M. Maiyuran, Guei-Yuan Lueh, Supratim Pal, Gang Chen, Ananda V. Kommaraju, Joy Chandra, Altug Koker, Prasoonkumar Surti, David Puffer, Hong Bin Liao, Joydeep Ray, Abhishek R. Appu, Ankur N. Shah, Travis T. Schluessler, Jonathan Kennedy, Devan Burke

IPC: G06F9/38, G06T1/20, G06F9/30

Abstract: An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes a plurality of execution units to execute single instruction, multiple data (SIMD) instructions, and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.

63.

Patent application
USE OF A SINGLE INSTRUCTION SET ARCHITECTURE (ISA) INSTRUCTION FOR VECTOR NORMALIZATION (legal status: active)

Publication No.: US20210149635A1

Publication date: 2021-05-20

Application No.: US16685561

Application date: 2019-11-15

Applicant: Intel Corporation

Inventor: Abhishek Rhisheekesan, Supratim Pal, Shashank Lakshminarayana, Subramaniam Maiyuran

IPC: G06F7/552, G06F9/30

Abstract: Embodiments described herein are generally directed to an improved vector normalization instruction. An embodiment of a method includes, responsive to receipt by a GPU of a single instruction specifying a vector normalization operation to be performed on V vectors: (i) generating V squared length values, N at a time, by a first processing unit, by, for each N sets of inputs, each representing multiple component vectors for N of the vectors, performing N parallel dot product operations on the N sets of inputs; and (ii) generating V sets of outputs representing multiple normalized component vectors of the V vectors, N at a time, by a second processing unit, by, for each N squared length values of the V squared length values, performing N parallel operations on the N squared length values, wherein each of the N parallel operations implements a combination of a reciprocal square root function and a vector scaling function.
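
Illustrative sketch (not taken from the patent): the two stages in the abstract can be mimicked in plain Python. Each group of N vectors first gets its squared lengths from dot products, then each vector is scaled by the reciprocal square root of its squared length. The function name `normalize_vectors`, the default `n=4`, and the sequential loops standing in for the N parallel operations are assumptions for illustration only. Vectors are assumed to be nonzero.

```python
import math

def normalize_vectors(vectors, n=4):
    """Normalize vectors in groups of n (hypothetical helper).

    Stage 1 models the first processing unit: one dot product per vector.
    Stage 2 models the second processing unit: rsqrt combined with scaling.
    """
    normalized = []
    for start in range(0, len(vectors), n):
        group = vectors[start:start + n]
        # Stage 1: squared length of each vector, v . v
        squared_lengths = [sum(c * c for c in v) for v in group]
        # Stage 2: multiply every component by 1 / sqrt(squared length)
        for v, sq in zip(group, squared_lengths):
            inv_len = 1.0 / math.sqrt(sq)
            normalized.append([c * inv_len for c in v])
    return normalized
```

Calling `normalize_vectors([[3.0, 4.0], [0.0, 2.0]], n=2)` returns vectors of unit length, up to floating-point rounding.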

64.

Granted patent
Dynamic thread splitting having multiple instruction pointers for the same thread (legal status: active)

Publication No.: US10789071B2

Publication date: 2020-09-29

Application No.: US14794521

Application date: 2015-07-08

Applicant: Intel Corporation

Inventor: Hema C. Nalluri, Supratim Pal, Subramaniam Maiyuran, Joy Chandra

IPC: G06F9/30, G06F9/38, G06T1/20

Abstract: Systems, apparatuses and methods may provide for associating a first instruction pointer with an IF block of a primary IF-ELSE conditional construct associated with a thread and activating a second instruction pointer in response to a dependency associated with the IF block. Additionally, the second instruction pointer may be associated with an ELSE block of the primary IF-ELSE conditional construct. In one example, the IF block and the ELSE block are executed, via the first instruction pointer and the second instruction pointer, one or more of independently from or parallel to one another.

65.

Granted patent
Recompiling GPU code based on spill/fill instructions and number of stall cycles (legal status: active)

Publication No.: US10698689B2

Publication date: 2020-06-30

Application No.: US16120226

Application date: 2018-09-01

Applicant: Intel Corporation

Inventor: Pratik J. Ashar, Supratim Pal, Subramaniam Maiyuran, Wei-Yu Chen, Guei-Yuan Lueh

IPC: G06F8/41, G06F9/38, G06F9/30, G06F9/50

Abstract: An apparatus to facilitate register sharing is disclosed. The apparatus includes one or more processors to generate first machine code having a first General Purpose Register (GRF) per thread ratio, detect an occurrence of one or more spill/fill instructions in the first machine code, and generate second machine code having a second GRF per thread ratio upon a detection of one or more spill/fill instructions in the first machine code, wherein the second GRF per thread ratio is based on a disabling of a first of a plurality of hardware threads.

66.

Granted patent
Software scoreboard information and synchronization (legal status: active)

Publication No.: US10360654B1

Publication date: 2019-07-23

Application No.: US15990328

Application date: 2018-05-25

Applicant: Intel Corporation

Inventor: Subramaniam Maiyuran, Supratim Pal, Jorge E. Parra, Chandra S. Gurram, Ashwin J. Shivani, Ashutosh Garg, Brent A. Schwartz, Jorge F. Garcia Pabon, Darin M. Starkey, Shubh B. Shah, Guei-Yuan Lueh, Kaiyu Chen, Konrad Trifunovic, Buqi Cheng, Weiyu Chen

IPC: G06F9/38, G06F8/41, G06T1/20, G06F9/30, G06T1/60, G09G5/36, G06T15/00

Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware. The compiler can then insert a software scoreboard sync immediate instruction into compiled program code to manage instruction dependencies and prevent data hazards from occurring.
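
Illustrative sketch (not taken from the patent): the abstract describes a compiler that evaluates dependencies and attaches them to each instruction so the hardware does not have to discover them. The Python below shows one simple way a compiler pass could do that for read-after-write dependencies. The instruction representation `(dst, srcs)`, the function name `annotate_dependencies`, and the choice to record producer indices are all assumptions; the actual scoreboard encoding is not given in the abstract, and write-after-write and write-after-read hazards are ignored here.

```python
def annotate_dependencies(instructions):
    """Attach read-after-write dependency info to each instruction.

    instructions: list of (dst, srcs) tuples, where dst is a register name
    and srcs is a list of register names (a hypothetical toy format).
    Returns a list of (dst, srcs, deps), where deps holds the indices of the
    earlier instructions that most recently wrote each source register.
    """
    last_writer = {}  # register name -> index of the latest instruction writing it
    annotated = []
    for index, (dst, srcs) in enumerate(instructions):
        deps = sorted({last_writer[reg] for reg in srcs if reg in last_writer})
        annotated.append((dst, srcs, deps))
        last_writer[dst] = index
    return annotated
```

A scheduler consuming this output would only need to check the listed producers before issuing an instruction, which is the kind of hardware simplification the abstract points to.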

67.

Granted patent
Source operand read suppression for graphics processors (legal status: active)

Publication No.: US10152452B2

Publication date: 2018-12-11

Application No.: US14726349

Application date: 2015-05-29

Applicant: Intel Corporation

Inventor: Supratim Pal, Subramaniam Maiyuran, Mark C. Davis

IPC: G06F15/82, G06F9/30, G06F9/345, G06F9/38

Abstract: Techniques to suppress redundant reads to register addresses and to replicate read data are disclosed. The redundant reads are suppressed when multiple source operands specify the same register address to read. Additionally, the read data is replicated to a data stream or data location corresponding to the source operands where the data read was suppressed.
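
Illustrative sketch (not taken from the patent): the idea in the abstract can be modeled in software as reading each distinct register address once and copying the value to every source operand that names it. The function name `read_sources` and the dictionary standing in for a register file are assumptions for illustration.

```python
def read_sources(register_file, source_regs):
    """Fetch operand values, reading each distinct register only once.

    register_file: dict mapping register name -> value (toy stand-in).
    source_regs: register names used by the instruction's source operands.
    Returns (operand_values, number_of_register_reads).
    """
    fetched = {}  # values already read for this instruction
    reads = 0
    operands = []
    for reg in source_regs:
        if reg not in fetched:
            fetched[reg] = register_file[reg]  # the only real read of this register
            reads += 1
        operands.append(fetched[reg])  # replicate data to a suppressed operand
    return operands, reads
```

With sources `["r0", "r1", "r0"]`, the second use of `r0` is served from the replicated value, so only two reads are issued for three operands.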

68.

Granted patent
Banked memory access efficiency by a graphics processor (legal status: active)

Publication No.: US09632801B2

Publication date: 2017-04-25

Application No.: US14249154

Application date: 2014-04-09

Applicant: Intel Corporation

Inventor: Supratim Pal, Murali Sundaresan

IPC: G06T1/60, G06F9/445, G06F12/08, G06F12/084

CPC classification number: G06F9/445, G06F12/0207, G06F12/0607, G06F12/08, G06F12/0811, G06F12/084, G06F12/0851, G06F12/0893, Y02D10/13

Abstract: Conversion of an array of structures (AOS) to a structure of arrays (SOA) improves the efficiency of transfer from the AOS to the SOA. A similar technique can be used to convert efficiently from an SOA to an AOS. The controller performing the conversion computes a partition size as the highest common factor between the structure size of structures in AOS and the number of banks in a first memory device, and transfers data based on the partition size, rather than on the structure size. The controller can read a partition size number of elements from multiple different structures to ensure that full data transfer bandwidth is used for each transfer.
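
Illustrative sketch (not taken from the patent): the partition size in the abstract is the highest common factor (greatest common divisor) of the structure size and the number of banks. The Python below computes it with `math.gcd` and then performs an AOS-to-SOA conversion in which each modeled transfer gathers `partition` consecutive elements from several different structures. The function name `aos_to_soa`, the flat-list layout, and the way structures are grouped per transfer are assumptions for illustration; the abstract does not describe the controller's actual access pattern.

```python
from math import gcd

def aos_to_soa(aos, struct_size, num_banks):
    """Convert a flat AOS list to SOA, moving data in partition-sized pieces.

    aos: flat list holding num_structs * struct_size elements.
    Returns (soa, partition, transfers), where soa[f][s] is field f of
    structure s and transfers counts the modeled bank-wide transfers.
    """
    partition = gcd(struct_size, num_banks)       # highest common factor
    num_structs = len(aos) // struct_size
    structs_per_transfer = num_banks // partition  # assumed grouping
    soa = [[None] * num_structs for _ in range(struct_size)]
    transfers = 0
    for field0 in range(0, struct_size, partition):
        for s0 in range(0, num_structs, structs_per_transfer):
            # One modeled transfer: `partition` elements from each structure in the group.
            for s in range(s0, min(s0 + structs_per_transfer, num_structs)):
                base = s * struct_size + field0
                for k in range(partition):
                    soa[field0 + k][s] = aos[base + k]
            transfers += 1
    return soa, partition, transfers
```

With a structure size of 6 and 4 banks the partition size is 2, so each modeled transfer carries two elements from each of two structures and fills all four banks, whereas transferring one whole 6-element structure at a time would not divide evenly across 4 banks.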

69.

Patent application
BANKED MEMORY ACCESS EFFICIENCY BY A GRAPHICS PROCESSOR (legal status: active)

Publication No.: US20150294435A1

Publication date: 2015-10-15

Application No.: US14249154

Application date: 2014-04-09

Applicant: Intel Corporation

Inventor: Supratim Pal, Murali Sundaresan

IPC: G06T1/60, G06T1/20

CPC classification number: G06F9/445, G06F12/0207, G06F12/0607, G06F12/08, G06F12/0811, G06F12/084, G06F12/0851, G06F12/0893, Y02D10/13

Abstract: Conversion of an array of structures (AOS) to a structure of arrays (SOA) improves the efficiency of transfer from the AOS to the SOA. A similar technique can be used to convert efficiently from an SOA to an AOS. The controller performing the conversion computes a partition size as the highest common factor between the structure size of structures in AOS and the number of banks in a first memory device, and transfers data based on the partition size, rather than on the structure size. The controller can read a partition size number of elements from multiple different structures to ensure that full data transfer bandwidth is used for each transfer.

70.

Patent application
MULTIPLE REGISTER ALLOCATION SIZES FOR GPU HARDWARE THREADS (legal status: active)

Publication No.: US20250147762A1

Publication date: 2025-05-08

Application No.: US18504407

Application date: 2023-11-08

Applicant: Intel Corporation

Inventor: Vasanth Ranganathan, Gang Chen, Supratim Pal, Jorge Eduardo Parra Osorio, Arthur Hunter, Boris Kuznetsov, Deepak N K, Siva Kumar Seemakurthi, James Valerio, Shubham Dinesh Chavan, Abhishek Kumar Singh, Samir Pandya, Sandeep Tippannanavar Niranjan, Alan Curtis, Jain Philip, Maltesh Kulkarni, Fangwen Fu, John Wiegert, Brent Schwartz

IPC: G06F9/30, G06T15/00

Abstract: Described herein is a graphics processor having processing resources with configurable thread and register configurations. Program code can configure a number of registers and accumulators that will be used by hardware threads during execution of the program code by the graphics processor. Processing resources within the graphics processor can be configured to assign different numbers of registers and accumulators to hardware threads based on the configuration requested by program code to be executed by the processing resource.
