-
1.
Publication No.: US12014265B2
Publication Date: 2024-06-18
Application No.: US18302889
Filing Date: 2023-04-19
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Amit Bleiweiss , Deborah Marr , Eugene Wang , Saritha Dwarakapuram , Sabareesh Ganapathy
IPC: G06T1/20 , G06F7/52 , G06F16/901 , G06F17/16 , G06F18/214 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/08 , G06N3/084 , G06N20/00 , G06T15/00 , G06N3/047
CPC classification number: G06N3/063 , G06F7/52 , G06F16/9024 , G06F17/16 , G06F18/214 , G06N3/04 , G06N3/044 , G06N3/045 , G06N3/08 , G06N3/084 , G06N20/00 , G06T1/20 , G06T15/005 , G06N3/047
Abstract: An apparatus to facilitate processing of a sparse matrix for arbitrary graph data is disclosed. The apparatus includes a graphics processing unit having a data management unit (DMU) that includes a scheduler for scheduling matrix operations, active logic for tracking active input operands, and skip logic for tracking unimportant input operands to be skipped by the scheduler. Processing circuitry is coupled to the DMU. The processing circuitry comprises a plurality of processing elements including logic to read operands, a multiplication unit to multiply two or more operands for the arbitrary graph data, and customizable circuitry to provide custom functions.
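The skip mechanism the abstract describes can be sketched roughly in software. In this minimal Python model, the sparse format (per-row lists of (column, value) pairs) and the magnitude-threshold criterion for "unimportant" operands are illustrative assumptions, not details taken from the patent.

```python
def sparse_matvec_with_skip(rows, x, threshold=0.0):
    """Multiply a sparse matrix (given as per-row lists of (col, val)
    pairs) by a dense vector x, skipping operands deemed unimportant.

    The threshold test stands in for the DMU's skip logic: operand
    pairs it flags are never handed to the multiplication unit.
    """
    y = [0.0] * len(rows)
    for i, row in enumerate(rows):
        for col, val in row:
            # Skip logic: unimportant operands are not scheduled.
            if abs(val) <= threshold or abs(x[col]) <= threshold:
                continue
            y[i] += val * x[col]  # multiplication unit + accumulate
    return y
```

In hardware the skip decision happens before scheduling, so the multiplier never sees the skipped pair; the early `continue` above models that ordering.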
-
2.
Publication No.: US20230359695A1
Publication Date: 2023-11-09
Application No.: US18222989
Filing Date: 2023-07-17
Applicant: Intel Corporation
Inventor: Jack Z. Yinger , Andrew Ling , Tomasz Czajkowski , Davor Capalija , Eriko Nurvitadhi , Deborah Marr
CPC classification number: G06F17/16 , G06F7/5443 , G06F2207/3892
Abstract: Matrix multiplication systolic array feed methods and related processing element (PE) microarchitectures for efficiently implementing a systolic array generic matrix multiplier (SGEMM) in integrated circuits are provided. A systolic array architecture may include a processing element array, a column feeder array, and a row feeder array. The bandwidth of external memory may be reduced by a reduction factor determined by the interleaving of the matrix data via the feeding pattern of the column feeder array and the row feeder array.
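The feeder arrangement can be illustrated with a small software model of a systolic matrix multiply. The output-stationary dataflow and the cycle-by-cycle skewing below are modeling assumptions made for clarity; the patent's feeder microarchitecture is more general.

```python
def systolic_matmul(A, B):
    """Simulate an output-stationary systolic array computing C = A @ B.

    Each PE (i, j) holds one element of C. A row feeder streams row i
    of A in from the left edge and a column feeder streams column j of
    B in from the top edge, skewed so that PE (i, j) sees A[i][t] and
    B[t][j] on the same simulated cycle t.
    """
    n, k, m = len(A), len(A[0]), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for t in range(k):               # one step of the shared dimension
        for i in range(n):
            a = A[i][t]              # value delivered by the row feeder
            for j in range(m):
                b = B[t][j]          # value delivered by the column feeder
                C[i][j] += a * b     # PE multiply-accumulate
    return C
```

Because each A value is reused across a whole row of PEs (and each B value across a column), each matrix element is fetched from external memory once per pass rather than once per multiply, which is the source of the bandwidth reduction the abstract refers to.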
-
3.
Publication No.: US11416248B2
Publication Date: 2022-08-16
Application No.: US16833597
Filing Date: 2020-03-28
Applicant: Intel Corporation
Inventor: Jaewoong Sim , Alaa Alameldeen , Eriko Nurvitadhi , Deborah Marr
Abstract: An apparatus and method for compressing floating-point values. For example, one embodiment of a processor comprises: instruction fetch circuitry to fetch instructions from a memory, the instructions including floating-point instructions; execution circuitry to execute the floating-point instructions, each floating-point instruction having one or more floating-point operands, each floating-point operand comprising an exponent value and a significand value; floating-point compression circuitry to compress a plurality of the exponent values associated with a corresponding plurality of the floating-point operands, the floating-point compression circuitry comprising: base generation circuitry to evaluate the plurality of the exponent values to generate a first base value; and delta generation circuitry to determine a difference between the plurality of exponent values and the first base value and to generate a corresponding first plurality of delta values, wherein the floating-point compression circuitry is to store the first base value and the corresponding first plurality of delta values as a plurality of compressed exponent values.
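The base/delta exponent scheme is concrete enough to model in a few lines of Python. Choosing the minimum exponent as the base is an assumption made here so that the deltas stay non-negative and small; the patent's base generation circuitry may select the base differently.

```python
def compress_exponents(exponents):
    """Base-delta compression of a block of floating-point exponents.

    Stores one base value plus a small delta per exponent, instead of
    the full exponent field for each value. Assumption: the base is
    the minimum exponent in the block.
    """
    base = min(exponents)                 # base generation
    deltas = [e - base for e in exponents]  # delta generation
    return base, deltas

def decompress_exponents(base, deltas):
    """Reconstruct the original exponents from base + deltas."""
    return [base + d for d in deltas]
```

The win comes from locality: exponents of nearby values tend to cluster, so each delta fits in far fewer bits than a full 8- or 11-bit exponent field, and only one full-width base is stored per block.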
-
4.
Publication No.: US20190012295A1
Publication Date: 2019-01-10
Application No.: US15644526
Filing Date: 2017-07-07
Applicant: Intel Corporation
Inventor: Jack Z. Yinger , Andrew Ling , Tomasz Czajkowski , Davor Capalija , Eriko Nurvitadhi , Deborah Marr
Abstract: Matrix multiplication systolic array feed methods and related processing element (PE) microarchitectures for efficiently implementing a systolic array generic matrix multiplier (SGEMM) in integrated circuits are provided. A systolic array architecture may include a processing element array, a column feeder array, and a row feeder array. The bandwidth of external memory may be reduced by a reduction factor determined by the interleaving of the matrix data via the feeding pattern of the column feeder array and the row feeder array.
-
5.
Publication No.: US12039435B2
Publication Date: 2024-07-16
Application No.: US17845794
Filing Date: 2022-06-21
Applicant: Intel Corporation
Inventor: Amit Bleiweiss , Anavai Ramesh , Asit Mishra , Deborah Marr , Jeffrey Cook , Srinivas Sridharan , Eriko Nurvitadhi , Elmoustapha Ould-Ahmed-Vall , Dheevatsa Mudigere , Mohammad Ashraf Bhuiyan , Md Faijul Amin , Wei Wang , Dhawal Srivastava , Niharika Maheshwari
CPC classification number: G06N3/063 , G06F7/78 , G06F9/00 , G06N3/084 , G06N20/00 , G06F2207/4824 , G06T1/20
Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network, and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.
-
6.

Publication No.: US20230064381A1
Publication Date: 2023-03-02
Application No.: US17740057
Filing Date: 2022-05-09
Applicant: Intel Corporation
Inventor: Jack Z. Yinger , Andrew Ling , Tomasz Czajkowski , Davor Capalija , Eriko Nurvitadhi , Deborah Marr
Abstract: Matrix multiplication systolic array feed methods and related processing element (PE) microarchitectures for efficiently implementing a systolic array generic matrix multiplier (SGEMM) in integrated circuits are provided. A systolic array architecture may include a processing element array, a column feeder array, and a row feeder array. The bandwidth of external memory may be reduced by a reduction factor determined by the interleaving of the matrix data via the feeding pattern of the column feeder array and the row feeder array.
-
7.
Publication No.: US10915328B2
Publication Date: 2021-02-09
Application No.: US16220528
Filing Date: 2018-12-14
Applicant: Intel Corporation
Inventor: Jonathan Pearce , David Sheffield , Srikanth Srinivasan , Jeffrey Cook , Deborah Marr
Abstract: An apparatus and method for offloading iterative, parallel work to a data parallel cluster. For example, one embodiment of a processor comprises: a host processor to execute a primary thread; a data parallel cluster coupled to the host processor over a high speed interconnect, the data parallel cluster comprising a plurality of execution lanes to perform parallel execution of one or more secondary threads related to the primary thread; and a data parallel cluster controller integral to the host processor to offload processing of the one or more secondary threads to the data parallel cluster in response to one of the cores executing a parallel processing call instruction from the primary thread.
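The offload flow can be approximated in software, with a thread pool standing in for the data parallel cluster and its execution lanes. All names here (`run_with_offload`, `lanes`) are hypothetical; this sketches only the control pattern, not the hardware interconnect.

```python
from concurrent.futures import ThreadPoolExecutor

def run_with_offload(primary_work, secondary_work, items, lanes=4):
    """Model of the offload pattern: the primary thread runs its
    sequential portion, then hands an iterative, parallel region to a
    pool of execution lanes (the software stand-in for the data
    parallel cluster) and collects the results.
    """
    state = primary_work()  # sequential work on the 'host processor'
    with ThreadPoolExecutor(max_workers=lanes) as cluster:
        # The 'parallel processing call': one secondary task per item,
        # fanned out across the lanes.
        results = list(cluster.map(lambda x: secondary_work(state, x), items))
    return results
```

In the patented design this handoff is a single instruction observed by a controller integral to the host processor; the executor here merely mimics the fan-out/fan-in shape of that offload.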
-
8.

Publication No.: US10146738B2
Publication Date: 2018-12-04
Application No.: US15396511
Filing Date: 2016-12-31
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Deborah Marr
Abstract: An accelerator architecture for processing very-sparse and hyper-sparse matrix data is disclosed. A hardware accelerator comprises one or more tiles, each including a plurality of processing elements (PEs) and a data management unit (DMU). The PEs are to perform matrix operations involving very- or hyper-sparse matrices that are stored in a memory. The DMU is to provide the plurality of PEs access to the memory via an interface that is optimized to provide low-latency, parallel, random accesses to the memory. The PEs, via the DMU, perform the matrix operations by issuing random access read requests for values of the one or more matrices, issuing random access read requests for values of one or more vectors serving as a second operand, and issuing random access write requests for values of one or more vectors serving as a result.
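The three kinds of memory requests the abstract enumerates map naturally onto a sparse matrix-vector multiply. The CSR layout used below is an assumption for illustration; the patent targets very- and hyper-sparse data more generally, and the point of the DMU is that the `x[...]` and `y[...]` accesses here are scattered, random addresses rather than streams.

```python
def spmv_random_access(rowptr, col, val, x):
    """Sparse matrix-vector multiply y = M @ x over a CSR matrix,
    organized around the three request types a PE issues via the DMU:
      1. reads of matrix values (val/col, sequential per row),
      2. random-access reads of the operand vector x,
      3. random-access writes of the result vector y.
    """
    y = [0.0] * (len(rowptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(rowptr[i], rowptr[i + 1]):
            acc += val[k] * x[col[k]]  # random read of operand vector
        y[i] = acc                     # random write of result vector
    return y
```

For very- and hyper-sparse matrices, most rows are empty or nearly so, so performance is dominated by these scattered single-element accesses, which is why the DMU interface prioritizes low-latency parallel random access over bulk streaming bandwidth.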
-
9.
Publication No.: US20240403620A1
Publication Date: 2024-12-05
Application No.: US18679802
Filing Date: 2024-05-31
Applicant: Intel Corporation
Inventor: Amit Bleiweiss , Anavai Ramesh , Asit Mishra , Deborah Marr , Jeffrey Cook , Srinivas Sridharan , Eriko Nurvitadhi , Elmoustapha Ould-Ahmed-Vall , Dheevatsa Mudigere , Mohammad Ashraf Bhuiyan , Md Faijul Amin , Wei Wang , Dhawal Srivastava , Niharika Maheshwari
Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network, and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.
-
10.
Publication No.: US20230053289A1
Publication Date: 2023-02-16
Application No.: US17845794
Filing Date: 2022-06-21
Applicant: Intel Corporation
Inventor: Amit Bleiweiss , Anavai Ramesh , Asit Mishra , Deborah Marr , Jeffrey Cook , Srinivas Sridharan , Eriko Nurvitadhi , Elmoustapha Ould-Ahmed-Vall , Dheevatsa Mudigere , Mohammad Ashraf Bhuiyan , Md Faijul Amin , Wei Wang , Dhawal Srivastava , Niharika Maheshwari
Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network, and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.
-