Patent search ap:("INTEL CORPORATION") AND inv:"Rajkishore Barik" Page 2

11.

发明申请
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING 有权

公开(公告)号：US20210124579A1

公开(公告)日：2021-04-29

申请号：US17115989

申请日：2020-12-09

Applicant: Intel Corporation

Inventor： Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06F9/30 , G09G5/393 , G06F9/38 , G06F7/483 , G06F7/544 , G06N3/04 , G06N3/063 , G06N3/08

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

12.

发明申请
PROGRAMMABLE COARSE GRAINED AND SPARSE MATRIX COMPUTE HARDWARE WITH ADVANCED SCHEDULING 有权

公开(公告)号：US20210035255A1

公开(公告)日：2021-02-04

申请号：US16928353

申请日：2020-07-14

Applicant: Intel Corporation

Inventor： Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao

IPC: G06T1/20 , G06N3/04 , G06N3/063 , G06F9/38 , G06F9/30 , G06N3/08

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.

13.

发明授权
Compute optimization mechanism for deep neural networks 有权

公开(公告)号：US10902547B2

公开(公告)日：2021-01-26

申请号：US15819093

申请日：2017-11-21

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06N3/04 , G06F9/455 , G06F9/50 , G06N3/063 , G06N3/08 , G06F8/41

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type.

14.

发明授权
Instructions and logic to perform floating-point and integer operations for machine learning 有权

公开(公告)号：US10474458B2

公开(公告)日：2019-11-12

申请号：US15787129

申请日：2017-10-18

Applicant: Intel Corporation

Inventor： Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G09G5/00 , G06F9/30 , G09G5/393 , G06F9/38 , G06F7/483 , G06F7/544 , G06N3/04 , G06N3/063 , G06N3/08 , G06T15/00 , G06N20/00

Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between integer data path and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.

15.

发明授权
Autonomous machines through cloud, error corrections, and predictions 有权

公开(公告)号：US10410115B2

公开(公告)日：2019-09-10

申请号：US15581133

申请日：2017-04-28

Applicant: Intel Corporation

Inventor： Brian T. Lewis , Feng Chen , Jeffrey R. Jackson , Justin E. Gottschlich , Rajkishore Barik , Xiaoming Chen , Prasoonkumar Surti , Mike B. Macpherson , Murali Sundaresan

IPC: G01C22/00 , G06N3/063 , B60W30/095 , G06N3/00 , G06N3/04 , G01C21/34

Abstract: A mechanism is described for facilitating smart collection of data and smart management of autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and combining a first computation directed to be performed locally at a local computing device with a second computation directed to be performed remotely at a remote computing device in communication with the local computing device over the one or more networks, where the first computation consumes low power, wherein the second computation consumes high power.

16.

发明授权
Machine learning sparse computation mechanism 有权

公开(公告)号：US10346944B2

公开(公告)日：2019-07-09

申请号：US15482791

申请日：2017-04-09

Applicant: Intel Corporation

Inventor： Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Nicolas C. Galoppo Von Borries

IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0888 , G06F9/48 , G06F17/16 , G06N3/04 , G06N3/08 , G06T1/60 , G06T15/00

Abstract: An apparatus to facilitate processing of a sparse matrix is disclosed. The apparatus includes a plurality of processing units each comprising one or more processing elements, including logic to read operands, a multiplication unit to multiply two or more operands and a scheduler to identify operands having a zero value and prevent scheduling of the operands having the zero value at the multiplication unit.

17.

发明申请
TRAINING WITH ADAPTIVE RUNTIME AND PRECISION PROFILING 审中-公开

公开(公告)号：US20180314935A1

公开(公告)日：2018-11-01

申请号：US15581031

申请日：2017-04-28

Applicant: Intel Corporation

Inventor： Brian T. Lewis , Rajkishore Barik , Murali Sundaresan , Leonard Truong

IPC: G06N3/08 , G06N3/04 , G06F7/483 , G06N3/063

Abstract: A mechanism is described for facilitating efficient training of neural networks at computing devices. A method of embodiments, as described herein, includes detecting one or more inputs for training of a neural network, and introducing randomness in floating point (FP) numbers to prevent overtraining of the neural network, where introducing randomness includes replacing less-significant low-order bits of operand and result values with new low-order bits during the training of the neural network.

18.

发明授权
Dynamic runtime task management 有权

公开(公告)号：US10073715B2

公开(公告)日：2018-09-11

申请号：US15383738

申请日：2016-12-19

Applicant: Intel Corporation

Inventor： Chunling Hu , Tatiana Shpeisman , Rajkishore Barik , Justin E. Gottschlich

IPC: G06F9/46 , G06F9/48 , G06F9/50

CPC classification number: G06F9/4881 , G06F9/5027

Abstract: A dynamic runtime scheduling system includes task manager circuitry capable of detecting a correspondence in at least a portion of the output arguments from one or more first tasks with at least a portion of the input arguments to one or more second tasks. Upon detecting the output arguments from the first task represents a superset of the second task input arguments, the task manager circuitry apportions the first task into a plurality of new subtasks. At least one of the new subtasks includes output arguments having a 1:1 correspondence to the second task input arguments. Upon detecting the output arguments from an first task represents a subset of the second task input arguments, the task manager circuitry may autonomously apportion the second task into a plurality of new subtasks. At least one of the new subtasks may include input arguments having a 1:1 correspondence to first task output arguments.

19.

发明授权
Method and apparatus to facilitate shared pointers in a heterogeneous platform 有权
Title translation: 促进异构平台中共享指针的方法和装置

公开(公告)号：US08862831B2

公开(公告)日：2014-10-14

申请号：US14020616

申请日：2013-09-06

Applicant: Intel Corporation

Inventor： Yang Ni , Rajkishore Barik , Ali-Reza Adl-Tabatabai , Tatiana Shpeisman , Jayanth N. Rao , Ben J. Ashbaugh , Tomasz Janczak

IPC: G06F12/00 , G06F13/00 , G06F13/28 , G06F15/167 , G06T1/60

CPC classification number: G06F12/0806 , G06F15/167 , G06T1/60

Abstract: A method and apparatus to facilitate shared pointers in a heterogeneous platform. In one embodiment of the invention, the heterogeneous or non-homogeneous platform includes, but is not limited to, a central processing core or unit, a graphics processing core or unit, a digital signal processor, an interface module, and any other form of processing cores. The heterogeneous platform has logic to facilitate sharing of pointers to a location of a memory shared by the CPU and the GPU. By sharing pointers in the heterogeneous platform, the data or information sharing between different cores in the heterogeneous platform can be simplified.

Abstract translation: 一种促进异构平台中的共享指针的方法和装置。在本发明的一个实施例中，异构或非均匀平台包括但不限于中央处理核心或单元，图形处理核心或单元，数字信号处理器，接口模块和任何其他形式的处理核心。异构平台具有促进共享指向CPU和GPU共享的存储器的位置的逻辑。通过在异构平台中共享指针，可以简化异构平台中不同核心之间的数据或信息共享。

20.

发明授权
Compute optimization mechanism for deep neural networks 有权

公开(公告)号：US12198221B2

公开(公告)日：2025-01-14

申请号：US18436494

申请日：2024-02-08

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F8/41 , G06F9/455 , G06F9/50 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084

Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification