Patent search ap:("INTEL CORPORATION") AND inv:"Mike Macpherson" Page 6

51.

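Invention application
SYSTOLIC ARRAY HAVING SUPPORT FOR OUTPUT SPARSITY [Valid]

Publication (announcement) number: US20220413803A1

Publication (announcement) date: 2022-12-29

Application number: US17304803

Application date: 2021-06-25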

Applicant: Intel Corporation

Inventor: Jorge Parra , Fangwen Fu , Subramaniam Maiyuran , Varghese George , Mike Macpherson , Supratim Pal , Chandra Gurram , Sabareesh Ganapathy , Sasikanth Avancha , Dharma Teja Vooturi , Naveen Mellempudi , Dipankar Das

IPC: G06F7/544 , G06F7/523 , G06F15/80 , G06F17/16

Abstract: A processing apparatus is described herein that includes a general-purpose parallel processing engine comprising a matrix accelerator including one or more systolic arrays, at least one of the one or more systolic arrays comprising multiple pipeline stages, each pipeline stage of the multiple pipeline stages including multiple processing elements, the multiple processing elements configured to perform processing operations on input matrix elements based on output sparsity metadata. The output sparsity metadata indicates to the multiple processing elements to bypass multiplication for a first row of elements of a second matrix and multiply a second row of elements of the second matrix with a column of matrix elements of a first matrix.
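
Illustrative sketch (not taken from the patent): the row-bypass behaviour described in this abstract can be modelled in software roughly as follows. The function name `masked_matmul`, the one-flag-per-row `row_mask` encoding, and the choice to emit zeros for a bypassed row are all assumptions made for illustration; the abstract does not say how the metadata is encoded or what a bypassed output holds.

```python
def masked_matmul(first, second, row_mask):
    """Compute second @ first, bypassing rows of `second` whose flag is 0.

    first    -- first matrix, as a list of rows
    second   -- second matrix, as a list of rows
    row_mask -- hypothetical output sparsity metadata, one flag per row of `second`
    """
    n_cols = len(first[0])
    out = []
    for row, keep in zip(second, row_mask):
        if not keep:
            # Bypass: no multiplications are issued for this row (assumed zero output).
            out.append([0] * n_cols)
            continue
        # Multiply this row of `second` with each column of `first`.
        out.append([
            sum(value * first[k][j] for k, value in enumerate(row))
            for j in range(n_cols)
        ])
    return out


first = [[1, 2], [3, 4]]
second = [[1, 0], [0, 1]]
result = masked_matmul(first, second, row_mask=[0, 1])
```

Here the first row of `second` is bypassed and only the second row is multiplied against the columns of `first`.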

52.

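Invention application
SYSTEMS AND METHODS FOR IMPROVING CACHE EFFICIENCY AND UTILIZATION [Valid]

Publication (announcement) number: US20220261347A1

Publication (announcement) date: 2022-08-18

Application number: US17732308

Application date: 2022-04-28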

Applicant: Intel Corporation

Inventor: Altug Koker , Joydeep Ray , Ben Ashbaugh , Jonathan Pearce , Abhishek Appu , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Elmoustapha Ould-Ahmed-Vall , Aravindh Anantaraman , Valentin Andrei , Nicolas Galoppo Von Borries , Varghese George , Yoav Harel , Arthur Hunter, Jr. , Brent Insko , Scott Janus , Pattabhiraman K , Mike Macpherson , Subramaniam Maiyuran , Marian Alin Petre , Murali Ramadoss , Shailesh Shah , Kamal Sinha , Prasoonkumar Surti , Vikranth Vemulapalli

IPC: G06F12/0802 , G06F9/30

Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.

53.

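Invention application
MEMORY PREFETCHING IN MULTIPLE GPU ENVIRONMENT [Valid]

Publication (announcement) number: US20220222767A1

Publication (announcement) date: 2022-07-14

Application number: US17580352

Application date: 2022-01-20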

Applicant: Intel Corporation

Inventor: Joydeep Ray , Aravindh Anantaraman , Valentin Andrei , Abhishek R. Appu , Nicolas Galoppo von Borries , Varghese George , Altug Koker , Elmoustapha Ould-Ahmed-Vall , Mike Macpherson , Subramaniam Maiyuran

IPC: G06T1/20 , G06F9/38 , G06T1/60 , G06T15/00

Abstract: Embodiments are generally directed to memory prefetching in multiple GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.
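
Illustrative sketch (not taken from the patent): the ownership rule in this abstract, where a GPU's prefetcher may only prefetch pages owned by that GPU or by the host processor, could be modelled as below. The `Prefetcher` class, the `page_owner` table, and the `HOST` identifier are hypothetical names, and treating a page with no recorded owner as not prefetchable is an added assumption.

```python
from dataclasses import dataclass, field

HOST = "host"  # hypothetical identifier for the host processor


@dataclass
class Prefetcher:
    gpu_id: str
    page_owner: dict  # hypothetical table: page number -> owner id
    cache: set = field(default_factory=set)

    def may_prefetch(self, page):
        # Allowed only when the page is owned by this GPU or by the host.
        return self.page_owner.get(page) in (self.gpu_id, HOST)

    def prefetch(self, page):
        # Returns True if the page was brought into this GPU's cache.
        if not self.may_prefetch(page):
            return False
        self.cache.add(page)
        return True


owners = {0: "gpu0", 1: "gpu1", 2: HOST}
gpu0 = Prefetcher("gpu0", owners)
attempts = [gpu0.prefetch(page) for page in (0, 1, 2)]
```

In this model the attempt on page 1, which is owned by another GPU, is refused, while pages 0 and 2 are cached.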

54.

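Invention application
SYSTEMS AND METHODS FOR UPDATING MEMORY SIDE CACHES IN A MULTI-GPU CONFIGURATION [Valid]

Publication (announcement) number: US20220180467A1

Publication (announcement) date: 2022-06-09

Application number: US17428534

Application date: 2020-03-14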

Applicant: Intel Corporation

Inventor: Altug Koker , Joydeep Ray , Aravindh Anantaraman , Valentin Andrei , Abhishek Appu , Sean Coleman , Nicolas Galoppo Von Borries , Varghese George , Pattabhiraman K , SungYe Kim , Mike Macpherson , Subramaniam Maiyuran , Elmoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , James Valerio

IPC: G06T1/20 , G06F12/0804 , G06F12/0811 , G06T1/60

Abstract: Systems and methods for updating remote memory side caches in a multi-GPU configuration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a first memory, a first memory side cache memory, a first communication fabric, and a first memory management unit (MMU). The graphics processor includes a second graphics processing unit (GPU) having a second memory, a second memory side cache memory, a second memory management unit (MMU), and a second communication fabric that is communicatively coupled to the first communication fabric. The first MMU is configured to control memory requests for the first memory, to update content in the first memory, to update content in the first memory side cache memory, and to determine whether to update the content in the second memory side cache memory.

55.

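Invention application
SYSTEMS AND METHODS FOR IMPROVING CACHE EFFICIENCY AND UTILIZATION [Valid]

Publication (announcement) number: US20220179787A1

Publication (announcement) date: 2022-06-09

Application number: US17428530

Application date: 2020-03-14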

Applicant: Intel Corporation

Inventor: Altug Koker , Joydeep Ray , Ben Ashbaugh , Jonathan Pearce , Abhishek Appu , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Elmoustapha Ould-Ahmed-Vall , Aravindh Anantaraman , Valentin Andrei , Nicolas Galoppo Von Borries , Varghese George , Yoav Harel , Arthur Hunter, Jr. , Brent Insko , Scott Janus , Pattabhiraman K , Mike Macpherson , Subramaniam Maiyuran , Marian Alin Petre , Murali Ramadoss , Shailesh Shah , Kamal Sinha , Prasoonkumar Surti , Vikranth Vemulapalli

IPC: G06F12/0802 , G06F9/30

Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.

56.

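Invention application
MEMORY CONTROLLER MANAGEMENT TECHNIQUES [Valid]

Publication (announcement) number: US20220138101A1

Publication (announcement) date: 2022-05-05

Application number: US17430611

Application date: 2020-03-14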

Applicant: Intel Corporation

Inventor: Abhishek R. Appu , Aravindh Anantaraman , Elmoustapha Ould-Ahmed-Vall , Valentin Andrei , Nicolas Galoppo Von Borries , Varghese George , Altug Koker , Mike Macpherson , Subramaniam Maiyuran , Joydeep Ray , Lakshminarayana Pappu , Guadalupe Garcia

IPC: G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0875 , G06F12/02 , G06F9/30 , G06T15/06

Abstract: Methods and apparatus relating to memory controller techniques. In an example, an apparatus comprises a cache memory, a high-bandwidth memory, and a processor communicatively coupled to the cache memory and the high-bandwidth memory, the processor to manage data transfer between the cache memory and the high-bandwidth memory for memory access operations directed to the high-bandwidth memory. Other embodiments are also disclosed and claimed.

57.

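Invention application
SYSTOLIC DISAGGREGATION WITHIN A MATRIX ACCELERATOR ARCHITECTURE [Valid]

Publication (announcement) number: US20220129521A1

Publication (announcement) date: 2022-04-28

Application number: US17428233

Application date: 2020-03-14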

Applicant: INTEL CORPORATION

Inventor: Prasoonkumar Surti , Subramaniam Maiyuran , Valentin Andrei , Abhishek Appu , Varghese George , Altug Koker , Mike Macpherson , Elmoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , Joydeep Ray , Lakshminarayanan Striramassarma , SungYe Kim

IPC: G06F17/16 , G06F15/80

Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides techniques to optimize training and inference on a systolic array when using sparse data. One embodiment provides techniques to use decompression information when performing sparse compute operations. One embodiment enables the disaggregation of special function compute arrays via a shared reg file. One embodiment enables packed data compress and expand operations on a GPGPU. One embodiment provides techniques to exploit block sparsity within the cache hierarchy of a GPGPU.

58.

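Granted invention patent
Local memory sharing between kernels [Valid]

Publication (announcement) number: US11119820B2

Publication (announcement) date: 2021-09-14

Application number: US16354957

Application date: 2019-03-15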

Applicant: Intel Corporation

Inventor: Valentin Andrei , Aravindh Anantaraman , Abhishek R. Appu , Nicolas C. Galoppo von Borries , Altug Koker , SungYe Kim , Elmoustapha Ould-Ahmed-Vall , Mike Macpherson , Subramaniam Maiyuran , Vasanth Ranganathan , Joydeep Ray , Varghese George

IPC: G06F9/44 , G06F9/48 , G06F13/16 , G06F13/42 , G06N3/08 , G06T1/20

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a set of processing elements to execute one or more thread groups of a second kernel to be executed by the general-purpose graphics processor, an on-chip memory coupled to the set of processing elements, and a scheduler coupled with the set of processing elements, the scheduler to schedule the thread groups of the kernel to the set of processing elements, wherein the scheduler is to schedule a thread group of the second kernel to execute subsequent to a thread group of a first kernel, the thread group of the second kernel configured to access a region of the on-chip memory that contains data written by the thread group of the first kernel in response to a determination that the second kernel is dependent upon the first kernel.

59.

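Granted invention patent
Sparse optimizations for a matrix accelerator architecture [Valid]

Publication (announcement) number: US11113784B2

Publication (announcement) date: 2021-09-07

Application number: US17064427

Application date: 2020-10-06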

Applicant: Intel Corporation

Inventor: Joydeep Ray , Scott Janus , Varghese George , Subramaniam Maiyuran , Altug Koker , Abhishek Appu , Prasoonkumar Surti , Vasanth Ranganathan , Andrei Valentin , Ashutosh Garg , Yoav Harel , Arthur Hunter, Jr. , SungYe Kim , Mike Macpherson , Elmoustapha Ould-Ahmed-Vall , William Sadler , Lakshminarayanan Striramassarma , Vikranth Vemulapalli

IPC: G06T1/20 , G06F9/50 , G06F15/80 , G06F12/0806 , G06F17/16 , G06N3/04 , G06N3/08 , G06F7/544

Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to skip computational operations for zero-filled matrices and sub-matrices. Embodiments additionally provide techniques to maintain data compression through to a processing unit. Embodiments additionally provide an architecture for a sparse-aware logic unit.
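
Illustrative sketch (not taken from the patent): skipping computation for all-zero sub-matrices can be modelled in software with a tiled matrix multiply that checks each tile before using it. The helper names (`is_zero_tile`, `tile_of`, `blocked_matmul`), the square tile size, and the scan-based zero check are assumptions for illustration; the abstract does not describe how zero sub-matrices are detected in hardware.

```python
def is_zero_tile(tile):
    """Return True when every element of the tile is zero."""
    return all(value == 0 for row in tile for value in row)


def tile_of(matrix, row0, col0, size):
    """Extract a tile of at most size x size elements starting at (row0, col0)."""
    return [row[col0:col0 + size] for row in matrix[row0:row0 + size]]


def blocked_matmul(a, b, size):
    """Compute a @ b tile by tile, skipping any tile that is entirely zero.

    Returns the product and the number of zero tiles that were skipped.
    """
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]
    skipped_tiles = 0
    for i0 in range(0, n, size):
        for k0 in range(0, k, size):
            a_tile = tile_of(a, i0, k0, size)
            if is_zero_tile(a_tile):
                skipped_tiles += 1  # contributes nothing to the output
                continue
            for j0 in range(0, m, size):
                b_tile = tile_of(b, k0, j0, size)
                if is_zero_tile(b_tile):
                    skipped_tiles += 1
                    continue
                # Accumulate the partial product of the two non-zero tiles.
                for i, a_row in enumerate(a_tile):
                    for j in range(len(b_tile[0])):
                        out[i0 + i][j0 + j] += sum(
                            a_row[x] * b_tile[x][j] for x in range(len(b_tile))
                        )
    return out, skipped_tiles


a = [[1, 2, 0, 0], [3, 4, 0, 0]]
b = [[1, 0], [0, 1], [5, 5], [5, 5]]
product, skipped = blocked_matmul(a, b, size=2)
```

In this example the right-hand 2x2 tile of `a` is all zeros, so the work that would have involved the bottom half of `b` is never issued.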

60.

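Granted invention patent
Scalar core integration [Valid]

Publication (announcement) number: US11016929B2

Publication (announcement) date: 2021-05-25

Application number: US16354782

Application date: 2019-03-15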

Applicant: Intel Corporation

Inventor: Joydeep Ray , Aravindh Anantaraman , Abhishek R. Appu , Altug Koker , Elmoustapha Ould-Ahmed-Vall , Valentin Andrei , Subramaniam Maiyuran , Nicolas Galappo Von Borries , Varghese George , Mike Macpherson , Ben Ashbaugh , Murali Ramadoss , Vikranth Vemulapalli , William Sadler , Jonathan Pearce , Sungye Kim

IPC: G06T1/00 , G06F15/80 , G06F9/30 , G06F9/38 , G06T15/00

Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
