Patent search ap:("Intel Corporation") AND inv:"Rajkishore Barik" Page 12

111.

发明授权
Instructions and logic to perform floating-point and integer operations for machine learning 有权

公开(公告)号：US11169799B2

公开(公告)日：2021-11-09

申请号：US16432402

申请日：2019-06-05

Applicant: Intel Corporation

Inventor： Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06F9/30 , G06F9/38 , G06F7/483 , G06F7/544 , G06N3/063 , G06N20/00 , G09G5/393 , G06N3/04 , G06N3/08 , G06T15/00

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

112.

发明申请
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS 有权

公开(公告)号：US20210241417A1

公开(公告)日：2021-08-05

申请号：US17145885

申请日：2021-01-11

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F9/455 , G06F9/50 , G06N3/04 , G06N3/063 , G06N3/08

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type.

113.

发明授权
Compute optimizations for neural networks using bipolar binary weight 有权

公开(公告)号：US11074072B2

公开(公告)日：2021-07-27

申请号：US16505012

申请日：2019-07-08

Applicant: Intel Corporation

Inventor： Kevin Nealis , Anbang Yao , Xiaoming Chen , Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha

IPC: G06F9/30 , G06F9/38 , G06N3/04 , G06N3/063 , G06N3/08 , G06T1/20

Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a bipolar binary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the bipolar binary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.

114.

发明申请
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING 有权

公开(公告)号：US20210182058A1

公开(公告)日：2021-06-17

申请号：US17169232

申请日：2021-02-05

Applicant: Intel Corporation

Inventor： Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar

IPC: G06F9/30 , G06N3/04 , G06F9/38 , G06F7/483 , G09G5/393 , G06F7/544 , G06N3/063 , G06N3/08

Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.

115.

发明授权
Specialized fixed function hardware for efficient convolution 有权

公开(公告)号：US10824938B2

公开(公告)日：2020-11-03

申请号：US15494723

申请日：2017-04-24

Applicant: Intel Corporation

Inventor： Rajkishore Barik , Elmoustapha Ould-Ahmed-Vall , Xiaoming Chen , Dhawal Srivastava , Anbang Yao , Kevin Nealis , Eriko Nurvitadhi , Sara S. Baghsorkhi , Balaji Vembu , Tatiana Shpeisman , Ping T. Tang

IPC: G06N3/06 , G06N3/063 , G06N3/04 , G06F9/30 , G06F9/38 , G06T1/20 , G06N3/08

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to perform one or more machine learning operations, wherein the decode unit, based on parameters of the one or more machine learning operations, is to request a scheduler to schedule the one or more machine learning operations to one of an array of programmable compute units and a fixed function compute unit.

116.

发明申请
EXTEND GPU/CPU COHERENCY TO MULTI-GPU CORES 审中-公开

公开(公告)号：US20200210338A1

公开(公告)日：2020-07-02

申请号：US16727127

申请日：2019-12-26

Applicant: Intel Corporation

Inventor： Chandrasekaran Sakthivel , Prasoonkumar Surti , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Abhishek R. Appu , Nicolas C. Galoppo Von Borries , Joydeep Ray , Narayan Srinivasa , Feng Chen , Ben J. Ashbaugh , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Eriko Nurvitadhi , Balaji Vembu , Altug Koker

IPC: G06F12/0837 , G06N3/08 , G06N20/00 , G06T1/20 , G06F12/0815 , G06N3/063 , G06N3/04

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

117.

发明申请
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS 审中-公开

公开(公告)号：US20200034946A1

公开(公告)日：2020-01-30

申请号：US16531763

申请日：2019-08-05

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F3/14 , G06N3/04 , G06N3/063 , G06N3/08 , G06T1/60 , G09G5/36 , G06F3/06

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a memory device including a first integrated circuit (IC) including a plurality of memory channels and a second IC including a plurality of processing units, each coupled to a memory channel in the plurality of memory channels.

118.

发明申请
PROGRAMMABLE COARSE GRAINED AND SPARSE MATRIX COMPUTE HARDWARE WITH ADVANCED SCHEDULING 审中-公开

公开(公告)号：US20180315158A1

公开(公告)日：2018-11-01

申请号：US15581182

申请日：2017-04-28

Applicant: Intel Corporation

Inventor： Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao

IPC: G06T1/20 , G06N3/08 , G06N3/04

CPC classification number: G06T1/20 , G06F9/3001 , G06F9/3017 , G06F9/3851 , G06F9/3887 , G06F9/3895 , G06N3/0445 , G06N3/0454 , G06N3/063 , G06N3/084

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.

119.

发明申请
TOOL FOR FACILITATING EFFICIENCY IN MACHINE LEARNING 审中-公开

公开(公告)号：US20180314936A1

公开(公告)日：2018-11-01

申请号：US15581152

申请日：2017-04-28

Applicant: Intel Corporation

Inventor： Rajkishore Barik , Brian T. Lewis , Murali Sundaresan , Jeffrey Jackson , Feng Chen , Xiaoming Chen , Mike Macpherson

IPC: G06N3/08 , G06N3/04 , G06N3/063

Abstract: A mechanism is described for facilitating smart distribution of resources for deep learning autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and introducing a library to a neural network application to determine optimal point at which to apply frequency scaling without degrading performance of the neural network application at a computing device.

120.

发明申请
METHOD AND APPARATUS TO FACILITATE SHARED POINTERS IN A HETEROGENEOUS PLATFORM 审中-公开
Title translation: 在异位平台上配置共享点的方法和装置

公开(公告)号：US20150186273A1

公开(公告)日：2015-07-02

申请号：US14513065

申请日：2014-10-13

Applicant: Intel Corporation

Inventor： Yang Ni , Rajkishore Barik , Ali-Reza Adl-Tabatabai , Tatiana Shpeisman , Jayanth N. Rao , Ben J. Ashbaugh , Tomasz Janczak

IPC: G06F12/08

CPC classification number: G06F12/0806 , G06F15/167 , G06T1/60

Abstract: A method and apparatus to facilitate shared pointers in a heterogeneous platform. In one embodiment of the invention, the heterogeneous or non-homogeneous platform includes, but is not limited to, a central processing core or unit, a graphics processing core or unit, a digital signal processor, an interface module, and any other form of processing cores. The heterogeneous platform has logic to facilitate sharing of pointers to a location of a memory shared by the CPU and the GPU. By sharing pointers in the heterogeneous platform, the data or information sharing between different cores in the heterogeneous platform can be simplified.

Abstract translation: 一种促进异构平台中的共享指针的方法和装置。在本发明的一个实施例中，异构或非均匀平台包括但不限于中央处理核心或单元，图形处理核心或单元，数字信号处理器，接口模块和任何其他形式的处理核心。异构平台具有促进共享指向CPU和GPU共享的存储器的位置的逻辑。通过在异构平台中共享指针，可以简化异构平台中不同核心之间的数据或信息共享。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification