Patent search ap:("Intel Corporation") AND inv:"Joydeep Ray" Page 11

101.

Granted invention patent
Apparatus and method for efficient graphics virtualization (in force)

Publication No.: US10891773B2

Publication date: 2021-01-12

Application No.: US15482677

Application date: 2017-04-07

Applicant: Intel Corporation

Inventor: Joydeep Ray , Abhishek R. Appu , Pattabhiraman K , Balaji Vembu , Altug Koker , Niranjan L. Cooray , Josh B. Mastronarde

IPC: G06T15/00 , G06F9/455 , G06T1/60 , G09G5/36 , G09G5/00 , G09G5/393 , G06F9/48 , G06F9/50 , G06T15/04 , G06T15/80 , G06T17/10 , G06T17/20

Abstract: An apparatus and method are described for allocating local memories to virtual machines. For example, one embodiment of an apparatus comprises: a command streamer to queue commands from a plurality of virtual machines (VMs) or applications, the commands to be distributed from the command streamer and executed by graphics processing resources of a graphics processing unit (GPU); a tile cache to store graphics data associated with the plurality of VMs or applications as the commands are executed by the graphics processing resources; and tile cache allocation hardware logic to allocate a first portion of the tile cache to a first VM or application and a second portion of the tile cache to a second VM or application; the tile cache allocation hardware logic to further allocate a first region in system memory to store spill-over data when the first portion of the tile cache and/or the second portion of the tile cache becomes full.
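
Illustrative note: the allocation scheme in this abstract can be pictured with a small software model. The Python sketch below is only an analogy for the hardware logic, not the patented design. The class name `PartitionedTileCache`, the slot-count capacities, and the single shared `spill_region` list standing in for the system memory region are all assumptions made for illustration.

```python
class PartitionedTileCache:
    """Toy model: each VM or application owns a fixed portion of the cache."""

    def __init__(self, capacities):
        # capacities maps an owner name to its number of tile slots (hypothetical unit).
        self.capacity = dict(capacities)
        self.portions = {owner: [] for owner in self.capacity}
        # Stand-in for the region in system memory that receives spill-over data.
        self.spill_region = []

    def store(self, owner, tile):
        """Place a tile in the owner's portion, or spill it when that portion is full."""
        portion = self.portions[owner]
        if len(portion) < self.capacity[owner]:
            portion.append(tile)
            return "cache"
        self.spill_region.append((owner, tile))
        return "spill"


cache = PartitionedTileCache({"vm0": 2, "vm1": 1})
placements = [cache.store("vm0", t) for t in ("t0", "t1", "t2")]
placements.append(cache.store("vm1", "u0"))
```

In this model a full portion for `vm0` never evicts data belonging to `vm1`; the overflow goes to the spill region instead.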

102.

Granted invention patent
Guaranteed forward progress mechanism (in force)

Publication No.: US10860468B2

Publication date: 2020-12-08

Application No.: US16379137

Application date: 2019-04-09

Applicant: Intel Corporation

Inventor: Altug Koker , Joydeep Ray , Niranjan L. Cooray , Abhishek R. Appu

IPC: G06F12/00 , G06F12/0875 , G06F9/54 , G06T1/60 , G06F12/0811

Abstract: An apparatus to facilitate guaranteed forward progress for graphics data is disclosed. The apparatus includes a plurality of ports to receive and transmit streams of graphics data, one or more buffers associated with each of the plurality of ports to store the graphics data and switching logic to virtually partition each of the one or more buffers to allocate a dedicated buffer to receive each of a plurality of independent streams of graphics data.

103.

Granted invention patent
Compute optimizations for low precision machine learning operations (in force)

Publication No.: US10853906B2

Publication date: 2020-12-01

Application No.: US16197821

Application date: 2018-11-21

Applicant: Intel Corporation

Inventor: Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Anbang Yao , Kevin Nealis , Xiaoming Chen , Altug Koker , Abhishek R. Appu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Ben J. Ashbaugh , Barath Lakshmanan , Liwei Ma , Joydeep Ray , Ping T. Tang , Michael S. Strickland

IPC: G06T1/20 , G06F7/483 , G06N3/08 , G06F9/30 , G06N3/04 , G06N3/063 , G06F9/50 , G06F9/38 , G06N20/00 , G06F3/14 , G06T1/60 , G06T15/00

Abstract: One embodiment provides an accelerator module comprising a memory stack including multiple memory dies; a graphics processing unit (GPU) coupled with the memory stack via one or more memory controllers, the GPU including a plurality of multiprocessors having a single instruction, multiple thread (SIMT) architecture, the multiprocessors to execute at least one single instruction. The at least one single instruction is to cause at least a portion of the GPU to perform a floating point operation on input having differing precisions. The floating point operation is a two-dimensional matrix multiply and accumulate operation.
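
Illustrative note: a matrix multiply and accumulate on inputs of differing precisions can be sketched in plain Python. The abstract does not name the precisions involved, so the pairing below (half-precision multiplicands, single-precision accumulator) is an assumption, as are the helper names `to_fp16`, `to_fp32`, and `mixed_precision_mma`. Rounding is emulated with the standard `struct` module, so this models the numerics only, not the hardware instruction.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mixed_precision_mma(a, b, c):
    """Return D = A * B + C with A and B rounded to fp16 and accumulation in fp32."""
    rows, inner, cols = len(a), len(b), len(b[0])
    d = [[to_fp32(c[i][j]) for j in range(cols)] for i in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = d[i][j]
            for k in range(inner):
                # Low-precision operands, higher-precision running sum.
                acc = to_fp32(acc + to_fp16(a[i][k]) * to_fp16(b[k][j]))
            d[i][j] = acc
    return d


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[0.5, 0.0], [0.0, 0.5]]
c = [[1.0, 1.0], [1.0, 1.0]]
d = mixed_precision_mma(a, b, c)
```

All values in this example are exactly representable in half precision, so `d` equals the exact result; inputs such as 0.1 would be rounded before the multiply.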

104.

Granted invention patent
Power savings for neural network architecture with zero activations during inference (in force)

Publication No.: US10817042B2

Publication date: 2020-10-27

Application No.: US16144538

Application date: 2018-09-27

Applicant: Intel Corporation

Inventor: Kinchit Desai , Sanjeev Jahagirdar , Prasoonkumar Surti , Joydeep Ray

IPC: G06F1/3237 , G06N3/04 , G06N3/08 , G06F1/3234 , G06F1/3206

Abstract: Embodiments are generally directed to providing power savings for a neural network architecture with zero activations during inference. An embodiment of an apparatus includes one or more processors including one or more processor cores; and a memory to store data for processing including neural network processing, wherein the apparatus is to perform a fast clear operation to initialize activation buffers for a neural network by updating metadata to indicate zero values, the neural network including a plurality of layers, wherein the apparatus is to compare outputs for the neural network to the metadata values and to write an output to memory only if the output is non-zero.
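
Illustrative note: the fast clear and conditional write described in this abstract can be modelled roughly as follows. This Python sketch assumes one metadata flag per element and a hypothetical `ActivationBuffer` class with a `writes` counter; the granularity of the real metadata is not stated in the abstract, and the handling of a zero output that overwrites an earlier non-zero value is a design choice of this sketch.

```python
class ActivationBuffer:
    """Toy activation buffer with a per-element 'is zero' metadata flag."""

    def __init__(self, size):
        self.data = [None] * size      # backing memory, contents undefined at start
        self.is_zero = [False] * size  # metadata
        self.writes = 0                # number of writes that reached backing memory

    def fast_clear(self):
        """Initialise the buffer by touching metadata only, not the data."""
        self.is_zero = [True] * len(self.data)

    def store(self, index, value):
        """Write to backing memory only when the output is non-zero."""
        if value == 0:
            self.is_zero[index] = True  # metadata update, no data write
            return
        self.data[index] = value
        self.is_zero[index] = False
        self.writes += 1

    def load(self, index):
        return 0.0 if self.is_zero[index] else self.data[index]


buf = ActivationBuffer(4)
buf.fast_clear()
for i, out in enumerate([0.0, 2.5, 0.0, 1.0]):  # e.g. outputs after a ReLU layer
    buf.store(i, out)
values = [buf.load(i) for i in range(4)]
```

Two of the four outputs are zero, so only two writes reach the backing memory while every load still returns the correct value.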

105.

Invention patent application
ENGINE TO ENABLE HIGH SPEED CONTEXT SWITCHING VIA ON-DIE STORAGE (under examination, published)

Publication No.: US20200334200A1

Publication date: 2020-10-22

Application No.: US16869223

Application date: 2020-05-07

Applicant: Intel Corporation

Inventor: Altug Koker , Prasoonkumar Surti , David Puffer , Subramaniam Maiyuran , Guei-Yuan Lueh , Abhishek R. Appu , Joydeep Ray , Balaji Vembu , Tomer Bar-On , Andrew T. Lauritzen , Hugues Labbe , John G. Gierach , Gabor Liktor

IPC: G06F16/13 , G06F9/38 , G06F9/30 , G06F16/11 , G06F16/172 , G06F9/46

Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively coupled to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.

106.

Invention patent application
THREAD SCHEDULING OVER COMPUTE BLOCKS FOR POWER OPTIMIZATION (under examination, published)

Publication No.: US20200327635A1

Publication date: 2020-10-15

Application No.: US16714862

Application date: 2019-12-16

Applicant: Intel Corporation

Inventor: Altug Koker , Balaji Vembu , Joydeep Ray , James A. Valerio , Abhishek R. Appu

IPC: G06T1/20 , G06F9/50

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a processing array including multiple compute blocks, each compute block including multiple processing clusters and a thread dispatch unit to dispatch threads of a workload to the multiple compute blocks based on a parallelism metric, wherein the thread dispatch unit, based on the parallelism metric, is to perform one of a first operation and a second operation, the first operation to distribute threads across the multiple compute blocks and the second operation is to concentrate threads within one of the multiple compute blocks.
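
Illustrative note: the two dispatch operations in this abstract (spread threads across compute blocks, or concentrate them in one block) can be sketched as a simple policy function. The abstract does not define the parallelism metric or say which values select which operation, so the 0-to-1 metric, the `threshold` parameter, and the rule that a high metric selects spreading are all assumptions of this Python sketch. Block capacity limits are ignored.

```python
def dispatch_threads(num_threads, num_blocks, parallelism, threshold=0.5):
    """Return the number of threads assigned to each compute block.

    parallelism and threshold are hypothetical values in [0, 1].
    """
    if parallelism >= threshold:
        # First operation: distribute threads as evenly as possible across blocks.
        base, extra = divmod(num_threads, num_blocks)
        return [base + (1 if i < extra else 0) for i in range(num_blocks)]
    # Second operation: concentrate all threads within a single block,
    # leaving the other blocks idle.
    counts = [0] * num_blocks
    counts[0] = num_threads
    return counts


spread = dispatch_threads(10, 4, parallelism=0.9)
packed = dispatch_threads(10, 4, parallelism=0.1)
```

Here `spread` uses all four blocks, while `packed` leaves three blocks with no work.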

107.

Granted invention patent
Control surface access using flat memory mapping (in force)

Publication No.: US10802970B1

Publication date: 2020-10-13

Application No.: US16366266

Application date: 2019-03-27

Applicant: Intel Corporation

Inventor: Niranjan L. Cooray , Altug Koker , Vidhya Krishnan , Ronald W. Silvas , John H. Feit , Prasoonkumar Surti , Joydeep Ray , Abhishek R. Appu

IPC: G06F12/0837 , G06F9/38 , G06F16/907 , H04L9/06 , G06F12/0811

Abstract: Embodiments described herein provide an apparatus comprising a processor to allocate a first memory space for data for a graphics workload, the first memory space comprising a first plurality of addressable memory locations, allocate a second memory space for compression metadata relating to the data for the graphics workload, the second memory space comprising a second plurality of addressable memory locations and having an amount of memory corresponding to a predetermined ratio of the amount of memory allocated to the first memory space, and configure a direct memory mapping between the first plurality of addressable memory locations and the second plurality of addressable memory locations. Other embodiments may be described and claimed.
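
Illustrative note: a direct mapping between a data space and a metadata space sized at a fixed ratio reduces to simple address arithmetic. The Python sketch below assumes a ratio of 256 data bytes per metadata byte and byte-granular addresses; neither figure comes from the abstract, and the `FlatMapping` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlatMapping:
    data_base: int    # start of the first memory space (data)
    data_size: int    # bytes allocated for data
    meta_base: int    # start of the second memory space (compression metadata)
    ratio: int = 256  # assumed data bytes covered by one metadata byte

    @property
    def meta_size(self) -> int:
        """Metadata bytes needed for the data space, rounded up."""
        return -(-self.data_size // self.ratio)

    def meta_address(self, data_addr: int) -> int:
        """Map a data address to its metadata address with no table lookup."""
        offset = data_addr - self.data_base
        if not 0 <= offset < self.data_size:
            raise ValueError("address outside the data space")
        return self.meta_base + offset // self.ratio


m = FlatMapping(data_base=0x10000, data_size=0x40000, meta_base=0x80000)
first = m.meta_address(0x10000)
second_block = m.meta_address(0x10000 + 256)
last = m.meta_address(0x10000 + 0x3FFFF)
```

Because the mapping is a fixed linear function, any data address resolves to its metadata location with one subtraction, one division, and one addition.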

108.

Granted invention patent
Sector cache for compression (in force)

Publication No.: US10783084B2

Publication date: 2020-09-22

Application No.: US16702073

Application date: 2019-12-03

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , David Puffer , Prasoonkumar Surti , Lakshminarayanan Striramassarma , Vasanth Ranganathan , Kiran C. Veernapu , Balaji Vembu , Pattabhiraman K

IPC: G06F12/0877 , G06F12/0868 , G06F12/0846 , G06F12/0855 , G06F12/0802 , G06F12/0806 , G06F12/0893 , G06F12/126 , G06T1/60

Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.

109.

Invention patent application
ON CHIP DENSE MEMORY FOR TEMPORAL BUFFERING (under examination, published)

Publication No.: US20200294182A1

Publication date: 2020-09-17

Application No.: US16355573

Application date: 2019-03-15

Applicant: Intel Corporation

Inventor: Varghese George , Altug Koker , Aravindh Anantaraman , Subramaniam Maiyuran , SungYe Kim , Valentin Andrei , Elmoustapha Ould-Ahmed-Vall , Joydeep Ray , Abhishek R. Appu , Nicolas C. Galoppo von Borries , Prasoonkumar Surti , Mike Macpherson

IPC: G06T1/20 , G06T1/60 , G06N20/00 , G06F16/17

Abstract: Apparatuses including general-purpose graphics processing units having on chip dense memory for temporal buffering are disclosed. In one embodiment, a graphics multiprocessor includes a plurality of compute engines to perform first computations to generate a first set of data, a cache for storing data, and a high density memory that is integrated on chip with the plurality of compute engines and the cache. The high density memory is to receive the first set of data, to temporarily store the first set of data, and to provide the first set of data to the cache during a first time period that is prior to a second time period when the plurality of compute engines will use the first set of data for second computations.

110.

Invention patent application
SYSTEM AND METHOD TO SUPPORT MULTIPLE WALKERS PER COMMAND (under examination, published)

Publication No.: US20200286201A1

Publication date: 2020-09-10

Application No.: US16297129

Application date: 2019-03-08

Applicant: Intel Corporation

Inventor: James Valerio , Vasanth Ranganathan , Joydeep Ray , Abhishek R. Appu , Ben J. Ashbaugh , Brandon Fliflet , Jeffery S. Boles , Srinivasan Embar Raghukrishnan , Rahul Kulkarni

IPC: G06T1/20 , G06T1/60

Abstract: Embodiments described herein provide an apparatus comprising a processor to configure a plurality of contexts of a command engine to execute a graphics workload comprising a plurality of walkers, allocate, from a pool of execution units of a graphics processor, a subset of execution units to each walker in the plurality of walkers based at least in part on the predetermined number of walkers configured for the context, for each context in the plurality of contexts, dispatch one or more walkers of the plurality of walkers to the execution units, and upon dispatch of the one or more walkers of the plurality of walkers, write an opcode to a computer-readable memory indicating that the dispatch of the walker is complete, wherein the opcode comprises dependency data for the one or more walkers of the plurality of walkers. Other embodiments may be described and claimed.
