Patent search ap:("INTEL CORPORATION") AND inv:"Kamal Sinha" Page 4

31.

Granted invention patent
Instructions and logic to perform floating-point and integer operations for machine learning (status: in force)

Publication number: US10353706B2

Publication date: 2019-07-16

Application number: US15819152

Application date: 2017-11-21

Applicant: Intel Corporation

Inventor: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06F9/30, G06F9/38, G06F7/483, G06F7/544, G06N3/04, G09G5/393, G06N3/08, G06N3/063, G06T15/00, G06N20/00

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
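
As an illustration of the arithmetic this abstract describes, the following is a minimal sketch of a two-dimensional multiply-and-accumulate over 16-bit operands with 32-bit intermediate products and 32-bit sums. It covers only the integer case; the function names and the wrap-around behaviour on overflow are assumptions made for the example and are not taken from the patent.

```python
def wrap32(x):
    """Wrap an integer to signed 32-bit two's complement (assumed overflow rule)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def matmul_acc_int16(a, b, c):
    """Return c + a @ b, with 16-bit integer operands and 32-bit accumulation."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [row[:] for row in c]
    for i in range(rows):
        for j in range(cols):
            acc = out[i][j]
            for k in range(inner):
                # Operands must fit in signed 16 bits.
                assert -32768 <= a[i][k] <= 32767
                assert -32768 <= b[k][j] <= 32767
                product = wrap32(a[i][k] * b[k][j])  # 32-bit intermediate product
                acc = wrap32(acc + product)          # 32-bit running sum
            out[i][j] = acc
    return out


result = matmul_acc_int16([[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 1], [1, 1]])
```

The product of two signed 16-bit values always fits in 32 bits, so only the accumulation step can overflow in this sketch.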

32.

Granted invention patent
Dynamic page sizing of page table entries (status: in force)

Publication number: US10319070B2

Publication date: 2019-06-11

Application number: US16120591

Application date: 2018-09-04

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Joydeep Ray, Altug Koker, Balaji Vembu, Prasoonkumar P. Surti, Kamal Sinha, Vasanth Ranganathan, Kiran C. Veernapu, Bhushan M. Borole, Wenyin Fu

IPC: G06F12/00, G06T1/60, G06T15/00, G06T1/20

Abstract: In accordance with one embodiment each page table entry maps a variable page size (per entry), if multiple continuous virtual pages map to contiguous physical pages. This may drastically reduce the number of translation lookaside buffer (TLB) entries needed since each entry can potentially map a larger chunk of memory, in some embodiments.
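
The idea in this abstract can be sketched as merging runs of pages that are adjacent in both the virtual and the physical address space into one variable-size entry. The code below is a hypothetical illustration only: the function names and the `(virtual_page, physical_page, count)` entry layout are assumptions, and real hardware would likely restrict entries to specific (for example power-of-two) sizes.

```python
def coalesce_mappings(mappings):
    """Merge (virtual_page, physical_page) pairs into (vpage, ppage, count) runs."""
    runs = []
    for vpage, ppage in sorted(mappings):
        if runs:
            v0, p0, n = runs[-1]
            # Extend the last run only if both address spaces continue without a gap.
            if vpage == v0 + n and ppage == p0 + n:
                runs[-1] = (v0, p0, n + 1)
                continue
        runs.append((vpage, ppage, 1))
    return runs


def translate(runs, vpage):
    """Look up a virtual page in the coalesced entries; None if it is unmapped."""
    for v0, p0, n in runs:
        if v0 <= vpage < v0 + n:
            return p0 + (vpage - v0)
    return None


mappings = [(0, 100), (1, 101), (2, 102), (3, 200), (5, 201)]
runs = coalesce_mappings(mappings)
```

Here five single-page mappings collapse into three entries, because the first three pages are contiguous on both sides; this is the kind of reduction in TLB entries the abstract refers to.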

33.

Invention patent application
Dynamic Page Sizing of Page Table Entries (status: under examination, published)

Publication number: US20190026856A1

Publication date: 2019-01-24

Application number: US16120591

Application date: 2018-09-04

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Joydeep Ray, Altug Koker, Balaji Vembu, Prasoonkumar P. Surti, Kamal Sinha, Vasanth Ranganathan, Kiran C. Veernapu, Bhushan M. Borole, Wenyin Fu

IPC: G06T1/60, G06T1/20, G06T15/00

Abstract: In accordance with one embodiment each page table entry maps a variable page size (per entry), if multiple continuous virtual pages map to contiguous physical pages. This may drastically reduce the number of translation lookaside buffer (TLB) entries needed since each entry can potentially map a larger chunk of memory, in some embodiments.

34.

Granted invention patent
Programmable coarse grained and sparse matrix compute hardware with advanced scheduling (status: in force)

Publication number: US10186011B2

Publication date: 2019-01-22

Application number: US15581182

Application date: 2017-04-28

Applicant: Intel Corporation

Inventor: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao

IPC: G06T1/20, G06N3/08, G06N3/04

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.

35.

Granted invention patent
Pulse triggered flip flop (status: in force)

Publication number: US10158346B2

Publication date: 2018-12-18

Application number: US15488628

Application date: 2017-04-17

Applicant: Intel Corporation

Inventor: Bhushan M. Borole, Anupama A. Thaploo, Altug Koker, Abhishek R. Appu, Kamal Sinha, Wenyin Fu

IPC: H03K3/012, H03K3/356, H03K19/21

Abstract: A pulse triggered flip flop circuit includes an exclusive OR clock generating stage that receives an input clock, data and produces an output clock pulse. The stage produces an output clock pulse that only goes away when the data is fully captured. The stage disables the output clock pulse only when the data is fully captured. Moreover, the circuit only toggles when the input data changes, reducing power consumption in some embodiments.

36.

Invention patent application
STORAGE MANAGEMENT FOR MACHINE LEARNING AT AUTONOMOUS MACHINES (status: under examination, published)

Publication number: US20180314249A1

Publication date: 2018-11-01

Application number: US15581124

Application date: 2017-04-28

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Altug Koker, Farshad Akhbari, Feng Chen, Dukhwan Kim, Narayan Srinivasa, Nadathur Rajagopalan Satish, Kamal Sinha, Joydeep Ray, Balaji Vembu, Mike B. Macpherson, Linda L. Hurd, Sanjeev Jahagirdar, Vasanth Ranganathan

IPC: G05D1/00, G06N3/08, G06N3/04, G06N3/063

CPC classification number: G06F9/5016, G06F9/5061

Abstract: A mechanism is described for facilitating storage management for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting one or more components associated with machine learning, where the one or more components include memory and a processor coupled to the memory, and where the processor includes a graphics processor. The method may further include allocating a storage portion of the memory and a hardware portion of the processor to a machine learning training set, where the storage and hardware portions are precise for implementation and processing of the training set.

37.

Invention patent application
COMPRESSION MECHANISM (status: under examination, published)

Publication number: US20180308256A1

Publication date: 2018-10-25

Application number: US15494812

Application date: 2017-04-24

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Nadathur Rajagopalan Satish, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Farshad Akhbari

IPC: G06T9/00, G06T1/20, G06T1/60

CPC classification number: G06T9/00, G06N3/0445, G06N3/0454, G06N3/0481, G06N3/063, G06N3/084

Abstract: An apparatus to facilitate compute compression is disclosed. The apparatus includes a graphics processing unit including mapping logic to map a first block of integer pixel data to a compression block and compression logic to compress the compression block.

38.

Invention patent application
Replacement Policies for a Hybrid Hierarchical Cache (status: under examination, published)

Publication number: US20180300260A1

Publication date: 2018-10-18

Application number: US15488840

Application date: 2017-04-17

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar P. Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha

IPC: G06F12/128, G06F12/0811, G06F13/40, G06T1/20

Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low level cache (e.g., L1) evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.

39.

Invention patent application
EXTEND GPU/CPU COHERENCY TO MULTI-GPU CORES (status: under examination, published)

Publication number: US20180300246A1

Publication date: 2018-10-18

Application number: US15489149

Application date: 2017-04-17

Applicant: Intel Corporation

Inventor: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker

IPC: G06F12/0837, G06N3/08, G06N99/00

CPC classification number: G06F12/0837, G06F2212/62, G06N3/08, G06N99/005, G06T1/20

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

40.

Invention patent application
Frequent Data Value Compression for Graphics Processing Units (status: under examination, published)

Publication number: US20180293695A1

Publication date: 2018-10-11

Application number: US15483236

Application date: 2017-04-10

Applicant: Intel Corporation

Inventor: Saurabh Sharma, Abhishek Venkatesh, Travis T. Schluessler, Prasoonkumar Surti, Altug Koker, Aravindh V. Anantaraman, Pattabhiraman P. K., Abhishek R. Appu, Joydeep Ray, Kamal Sinha, Vasanth Ranganathan, Bhushan M. Borole, Wenyin Fu, Eric J. Hoekstra, Linda L. Hurd

IPC: G06T1/20, G06T1/60, G06T15/00

CPC classification number: G06T1/20, G06T1/60, G06T15/005

Abstract: A control surface tracks an individual cacheline in the original surface for frequent data values. If so, control surface bits are set. When reading a cacheline from memory, first the control surface bits are read. If they happen to be set, then the original memory read is skipped altogether and instead the bits from the control surface provide the value for the entire cacheline.
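
The read path described in this abstract can be sketched as follows. This is an illustrative model only: the `Surface` class, the 64-byte cacheline size, and the control-bit encoding in `FREQUENT_VALUES` are all assumptions for the example and do not come from the application.

```python
CACHELINE_BYTES = 64  # assumed cacheline size

# Hypothetical control-bit encoding: 0 means "no frequent value, read memory".
FREQUENT_VALUES = {1: 0x00, 2: 0xFF}


class Surface:
    """Original surface plus a control surface holding a few bits per cacheline."""

    def __init__(self, num_lines):
        self.memory = [bytes(CACHELINE_BYTES)] * num_lines
        self.control = [0] * num_lines
        self.memory_reads = 0  # counts reads that actually touch the original surface

    def write_line(self, index, data):
        assert len(data) == CACHELINE_BYTES
        for code, value in FREQUENT_VALUES.items():
            if data == bytes([value]) * CACHELINE_BYTES:
                self.control[index] = code  # the control bits alone describe the line
                return
        self.control[index] = 0
        self.memory[index] = data

    def read_line(self, index):
        code = self.control[index]  # the control surface is consulted first
        if code:
            # Control bits set: skip the memory read and rebuild the whole line.
            return bytes([FREQUENT_VALUES[code]]) * CACHELINE_BYTES
        self.memory_reads += 1
        return self.memory[index]


surface = Surface(2)
surface.write_line(0, b"\xff" * CACHELINE_BYTES)
surface.write_line(1, bytes(range(CACHELINE_BYTES)))
line0 = surface.read_line(0)
reads_after_line0 = surface.memory_reads
line1 = surface.read_line(1)
```

In this sketch, reading line 0 never touches the original surface, while line 1 falls back to an ordinary memory read.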
