Patent search ap:("INTEL CORPORATION") AND inv:"Kamal Sinha" Page 12

111.

发明授权
Data prefetching for graphics data processing 有权

公开(公告)号：US10909039B2

公开(公告)日：2021-02-02

申请号：US16355015

申请日：2019-03-15

Applicant: Intel Corporation

Inventor： Vikranth Vemulapalli , Lakshminarayanan Striramassarma , Mike MacPherson , Aravindh Anantaraman , Ben Ashbaugh , Murali Ramadoss , William B. Sadler , Jonathan Pearce , Scott Janus , Brent Insko , Vasanth Ranganathan , Kamal Sinha , Arthur Hunter, Jr. , Prasoonkumar Surti , Nicolas Galoppo von Borries , Joydeep Ray , Abhishek R. Appu , ElMoustapha Ould-Ahmed-Vall , Altug Koker , Sungye Kim , Subramaniam Maiyuran , Valentin Andrei

IPC: G09G5/36 , G06F12/0862 , G06T1/20 , G06T1/60

Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.

112.

发明申请
MACHINE LEARNING SPARSE COMPUTATION MECHANISM 审中-公开

公开(公告)号：US20200349670A1

公开(公告)日：2020-11-05

申请号：US16930841

申请日：2020-07-16

Applicant: Intel Corporation

Inventor： Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Nicolas C. Galoppo Von Borries

IPC: G06T1/20 , G06F17/16 , G06T1/60 , G06F9/30 , G06K9/62 , G06F12/0888 , G06F12/0815 , H03M7/30 , G06F9/48 , G06T15/00 , G06N3/04 , G06F9/38 , G06F12/0831 , G06F12/0811 , G06N3/08

Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.

113.

发明申请
Replacement Policies for a Hybrid Hierarchical Cache 审中-公开

公开(公告)号：US20200349091A1

公开(公告)日：2020-11-05

申请号：US16881271

申请日：2020-05-22

Applicant: Intel Corporation

Inventor： Abhishek R. Appu , Joydeep Ray , James A. Valerio , Altug Koker , Prasoonkumar Surti , Balaji Vembu , Wenyin Fu , Bhushan M. Borole , Kamal Sinha

IPC: G06F12/128 , G06F12/0811 , G06F13/40 , G06F12/12 , G06T1/60 , G06F12/0897 , G06F12/084

Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low level cache (e.g., L1) evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.

114.

发明申请
Regional Adjustment of Render Rate 审中-公开

公开(公告)号：US20200348897A1

公开(公告)日：2020-11-05

申请号：US16881262

申请日：2020-05-22

Applicant: Intel Corporation

Inventor： Eric J. Asperheim , Subramaniam M. Maiyuran , Kiran C. Veernapu , Sanjeev S. Jahagirdar , Balaji Vembu , Devan Burke , Philip R. Laws , Kamal Sinha , Abhishek R. Appu , Elmoustapha Ould-Ahmed-Vall , Peter L. Doyle , Joydeep Ray , Travis T. Schluessler , John H. Feit , Nikos Kaburlasos , Jacek Kwiatkowski , Altug Koker

IPC: G06F3/14 , G06F3/01 , G09G5/391 , G06F3/0484

Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past and/or what it is predicted that the user will look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.

115.

发明申请
DATA PREFETCHING FOR GRAPHICS DATA PROCESSING 审中-公开

公开(公告)号：US20200293450A1

公开(公告)日：2020-09-17

申请号：US16355015

申请日：2019-03-15

Applicant: Intel Corporation

Inventor： Vikranth Vemulapalli , Lakshminarayanan Striramassarma , Mike MacPherson , Aravindh Anantaraman , Ben Ashbaugh , Murali Ramadoss , William B. Sadler , Jonathan Pearce , Scott Janus , Brent Insko , Vasanth Ranganathan , Kamal Sinha , Arthur Hunter, JR. , Prasoonkumar Surti , Nicolas Galoppo von Borries , Joydeep Ray , Abhishek R. Appu , ElMoustapha Ould-Ahmed-Vall , Altug Koker , Sungye Kim , Subramaniam Maiyuran , Valentin Andrei

IPC: G06F12/0862 , G06T1/60 , G06T1/20

Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.

116.

发明授权
Programmable coarse grained and sparse matrix compute hardware with advanced scheduling 有权

公开(公告)号：US10769748B2

公开(公告)日：2020-09-08

申请号：US16197783

申请日：2018-11-21

Applicant: Intel Corporation

Inventor： Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao

IPC: G06T1/20 , G06N3/04 , G06N3/063 , G06F9/38 , G06F9/30 , G06N3/08

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.

117.

发明授权
Avoid cache lookup for cold cache 有权

公开(公告)号：US10769072B2

公开(公告)日：2020-09-08

申请号：US16277114

申请日：2019-02-15

Applicant: Intel Corporation

Inventor： Abhishek R. Appu , Altug Koker , Joydeep Ray , Prasoonkumar Surti , Kamal Sinha , Kiran C. Veernapu , Balaji Vembu

IPC: G06F12/08 , G06F12/0888 , G06F13/42 , G06F13/40 , G06T1/60 , G06F12/0895 , G06T1/20

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive, in a read/modify/write (RMW) pipeline, a cache access request from a requestor, wherein the cache request comprises a cache set identifier associated with requested data in the cache set, determine whether the cache set associated with the cache set identifier is in an inaccessible invalid state, and in response to a determination that the cache set is in an inaccessible state or an invalid state, to terminate the cache access request. Other embodiments are also disclosed and claimed.

118.

发明申请
MACHINE LEARNING SPARSE COMPUTATION MECHANISM 审中-公开

公开(公告)号：US20200279349A1

公开(公告)日：2020-09-03

申请号：US16880338

申请日：2020-05-21

Applicant: Intel Corporation

Inventor： Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajikshore Barik , Nicolas C. Galoppo Von Borries

IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06F12/0811 , G06F12/0815 , G06F12/0831 , G06F12/0888 , H03M7/30 , G06K9/62 , G06F9/48 , G06F17/16 , G06N3/04 , G06N3/08 , G06T1/60 , G06T15/00

Abstract: An apparatus to facilitate processing of a sparse matrix is disclosed. The apparatus includes a plurality of processing units each comprising one or more processing elements, including logic to read operands, a multiplication unit to multiply two or more operands and a scheduler to identify operands having a zero value and prevent scheduling of the operands having the zero value at the multiplication unit.

119.

发明授权
Reducing aging of register file keeper circuits 有权

公开(公告)号：US10754809B2

公开(公告)日：2020-08-25

申请号：US16354312

申请日：2019-03-15

Applicant: Intel Corporation

Inventor： Anupama A. Thaploo , Bhushan M. Borole , Bee Ngo , Iqbal R. Rajwani , Altug Koker , Abhishek R. Appu , Kamal Sinha , Wenyin Fu

IPC: G06F13/40 , G11C17/18 , G11C17/16 , G11C11/4094 , G06F9/30 , G11C7/12

Abstract: By shutting off keeper transistors during pre-charge, the aging on these devices may be reduced. This means that a relatively weaker keeper may be used for noise compared to an overdesigned stronger keeper. Using a relatively weaker keeper circuit results in a faster evaluation stage and improved minimum read voltage in some embodiments.

120.

发明授权
Frequent data value compression for graphics processing units 有权

公开(公告)号：US10748238B2

公开(公告)日：2020-08-18

申请号：US16279270

申请日：2019-02-19

Applicant: Intel Corporation

Inventor： Saurabh Sharma , Abhishek Venkatesh , Travis T. Schluessler , Prasoonkumar Surti , Altug Koker , Aravindh V. Anantaraman , Pattabhiraman P. K. , Abhishek R. Appu , Joydeep Ray , Kamal Sinha , Vasanth Ranganathan , Bhushan M. Borole , Wenyin Fu , Eric J. Hoekstra , Linda L. Hurd

IPC: G06T1/20 , G06T1/60 , G06T15/00

Abstract: A control surface tracks an individual cacheline in the original surface for frequent data values. If so, control surface bits are set. When reading a cacheline from memory, first the control surface bits are read. If they happen to be set, then the original memory read is skipped altogether and instead the bits from the control surface provide the value for the entire cacheline.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification