-
Publication No.: WO2022271228A1
Publication date: 2022-12-29
Application No.: PCT/US2022/020553
Filing date: 2022-03-16
Applicant: INTEL CORPORATION
Inventor: GURRAM, Chandra , CHEN, Wei-yu , FU, Fangwen , GANAPATHY, Sabareesh , GEORGE, Varghese , LUEH, Guei-Yuan , MAIYURAN, Subramaniam , MACPHERSON, Mike , PAL, Supratim , PARRA, Jorge
IPC: G06F9/30 , G06F17/16 , G06F7/483 , G06F9/30036 , G06F9/3012 , G06F9/3013 , G06F9/3891
Abstract: A processing apparatus includes a general-purpose parallel processing engine including a set of multiple processing elements including a single precision floating-point unit, a double precision floating point unit, and an integer unit; a matrix accelerator including one or more systolic arrays; a first register file coupled with a first read control circuit, wherein the first read control circuit couples with the set of multiple processing elements and the matrix accelerator to arbitrate read requests to the first register file from the set of multiple processing elements and the matrix accelerator; and a second register file coupled with a second read control circuit, wherein the second read control circuit couples with the matrix accelerator to arbitrate read requests to the second register file from the matrix accelerator and limit access to the second register file by the set of multiple processing elements.
-
Publication No.: WO2022271227A1
Publication date: 2022-12-29
Application No.: PCT/US2022/020532
Filing date: 2022-03-16
Applicant: INTEL CORPORATION
Inventor: PARRA, Jorge , CHEN, Jiasheng , PAL, Supratim , FU, Fangwen , GANAPATHY, Sabareesh , GURRAM, Chandra , MEI, Chunhui , QI, Yue
IPC: G06F9/30 , G06F9/38 , G06F17/16 , G06F15/80 , G06F15/8046 , G06F9/3001 , G06F9/30036 , G06F9/30145 , G06F9/3802 , G06F9/382 , G06F9/3828 , G06F9/3893
Abstract: A processing apparatus described herein includes a general-purpose parallel processing engine comprising a systolic array having multiple pipelines, each of the multiple pipelines including multiple pipeline stages, wherein the multiple pipelines include a first pipeline, a second pipeline, and a common input shared between the first pipeline and the second pipeline.
-
Publication No.: WO2022271226A1
Publication date: 2022-12-29
Application No.: PCT/US2022/020408
Filing date: 2022-03-15
Applicant: INTEL CORPORATION
Inventor: PARRA, Jorge , FU, Fangwen , MAIYURAN, Subramaniam , GEORGE, Varghese , MACPHERSON, Mike , PAL, Supratim , GURRAM, Chandra , GANAPATHY, Sabareesh , AVANCHA, Sasikanth , VOOTURI, Dharma Teja , MELLEMPUDI, Naveen , DAS, Dipankar
IPC: G06F17/16 , G06F9/00 , G06F15/8046 , G06F7/523 , G06F7/5443 , G06F9/3001 , G06F9/30036
Abstract: A processing apparatus is described herein that includes a general-purpose parallel processing engine comprising a matrix accelerator including one or more systolic arrays, at least one of the one or more systolic arrays comprising multiple pipeline stages, each pipeline stage of the multiple pipeline stages including multiple processing elements, the multiple processing elements configured to perform processing operations on input matrix elements based on output sparsity metadata. The output sparsity metadata indicates to the multiple processing elements to bypass multiplication for a first row of elements of a second matrix and multiply a second row of elements of the second matrix with a column of matrix elements of a first matrix.
-
Publication No.: EP4443377A3
Publication date: 2024-12-18
Application No.: EP24193067.6
Filing date: 2018-11-29
Applicant: INTEL Corporation
Inventor: NURVITADHI, Eriko , BLEIWEISS, Amit , MARR, Deborah , WANG, Eugene , DWARAKAPURAM, Saritha , GANAPATHY, Sabareesh
IPC: G06T1/20
Abstract: An apparatus to facilitate processing of a sparse matrix for arbitrary graph data is disclosed. The apparatus includes a graphics processing unit having a data management unit (DMU) that includes a scheduler for scheduling matrix operations, an active logic for tracking active input operands, and a skip logic for tracking unimportant input operands to be skipped by the scheduler. Processing circuitry is coupled to the DMU. The processing circuitry comprises a plurality of processing elements including logic to read operands and a multiplication unit to multiply two or more operands for the arbitrary graph data.
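The skip logic described in the abstract above can be illustrated in software. The following is only a minimal sketch, not the patented hardware: it assumes that "unimportant" operands are exactly the zero-valued ones, and the function name `sparse_matvec` and the skip counter are hypothetical names introduced here for illustration.

```python
def sparse_matvec(matrix, vector):
    """Multiply a matrix by a vector, skipping zero operands.

    Assumption: an operand counts as "unimportant" when it is zero,
    so its product cannot change the accumulated result.
    """
    result = [0.0] * len(matrix)
    skipped = 0  # number of multiplications that were never scheduled
    for i, row in enumerate(matrix):
        for j, a in enumerate(row):
            x = vector[j]
            if a == 0 or x == 0:
                # Skip path: the operand pair is not sent to the multiplier.
                skipped += 1
                continue
            # Active path: both operands are non-zero, so multiply and accumulate.
            result[i] += a * x
    return result, skipped


if __name__ == "__main__":
    out, skipped = sparse_matvec([[1, 0, 2], [0, 0, 3]], [4, 5, 0])
    print(out, skipped)
```

In this sketch the saving is only counted; in hardware the point of skipping would be to avoid occupying multiplier cycles for operand pairs that contribute nothing.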
-
Publication No.: EP4359919A1
Publication date: 2024-05-01
Application No.: EP22714700.6
Filing date: 2022-03-16
Applicant: Intel Corporation
Inventor: PARRA, Jorge , CHEN, Jiasheng , PAL, Supratim , FU, Fangwen , GANAPATHY, Sabareesh , GURRAM, Chandra , MEI, Chunhui , QI, Yue
CPC classification number: G06F9/30145 , G06F9/3001 , G06F9/30036 , G06F9/3893 , G06F9/3828 , G06F17/16 , G06F15/8046
-
Publication No.: EP3920029A1
Publication date: 2021-12-08
Application No.: EP20208365.5
Filing date: 2020-11-18
Applicant: INTEL Corporation
Inventor: JIANG, Hong , GANAPATHY, Sabareesh , TIAN, Xinmin , FU, Fangwen , VALERIO, James
IPC: G06F9/52
Abstract: Examples described herein relate to a graphics processing apparatus that includes a memory device and a graphics processing unit (GPU) coupled to the memory device, the GPU can be configured to: execute an instruction thread; determine if a signal barrier is associated with the instruction thread; for a signal barrier associated with the instruction thread, determine if the signal barrier is cleared; and based on the signal barrier being cleared, permit any waiting instruction thread associated with the signal barrier identifier to commence with execution but not permit any waiting thread that is not associated with the signal barrier identifier to commence with execution. In some examples, the signal barrier includes a signal barrier identifier. In some examples, the signal barrier identifier is one of a plurality of values. In some examples, a gateway is used to receive indications of a signal barrier identifier and to selectively clear a signal barrier for a waiting instruction thread associated with the signal barrier identifier based on clearance conditions associated with the signal barrier being met.
-
Publication No.: EP4155900A1
Publication date: 2023-03-29
Application No.: EP22184044.0
Filing date: 2022-07-11
Applicant: INTEL Corporation
Inventor: CHEN, Jiasheng , RHEE, Changwon , GANAPATHY, Sabareesh , HENRY, Gregory , FU, Fangwen
Abstract: Emulating floating point calculation using lower precision format calculations is described. An example of a processor includes a floating point unit (FPU) to provide a native floating point operation in a first precision format; and systolic array hardware including multiple data processing units, wherein the processor is to receive data for performance of a matrix multiplication operation in the first precision format; enable an emulated floating point multiplication operation using one or more values with a second precision format, the second precision format having a lower precision than the first precision format, the emulated floating point multiplication including operation of the systolic array hardware; and generate an emulated result for the matrix multiplication operation.
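The idea of emulating a higher-precision multiplication with lower-precision values can be sketched for a single scalar product. This is an illustration under stated assumptions, not the method claimed in the patent: the abstract does not name the formats, so float32 is assumed as the first precision format and float16 as the second, and the helper names `to_f16`, `to_f32`, `split_f16`, and `emulated_mul` are hypothetical. Each operand is split into a high and a low float16 part, and the partial products are accumulated in float32.

```python
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def split_f16(x):
    """Split x into a float16 high part and a float16 residual (low) part."""
    hi = to_f16(x)
    lo = to_f16(x - hi)
    return hi, lo


def emulated_mul(a, b):
    """Approximate a * b using only float16 operands and a float32 accumulator.

    The smallest cross term (a_lo * b_lo) is dropped in this sketch.
    """
    a_hi, a_lo = split_f16(a)
    b_hi, b_lo = split_f16(b)
    acc = to_f32(a_hi * b_hi)
    acc = to_f32(acc + to_f32(a_hi * b_lo))
    acc = to_f32(acc + to_f32(a_lo * b_hi))
    return acc


if __name__ == "__main__":
    a, b = to_f32(1.2345678), to_f32(3.1415927)
    naive = to_f16(to_f16(a) * to_f16(b))  # plain float16 multiply
    print(a * b, emulated_mul(a, b), naive)
```

For values in float16's normal range, the split recovers most of the precision that a plain float16 multiply loses. A matrix version would apply the same decomposition to whole operand matrices and sum the partial matrix products.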
-
Publication No.: EP3518176A1
Publication date: 2019-07-31
Application No.: EP18209316.1
Filing date: 2018-11-29
Applicant: INTEL Corporation
Inventor: NURVITADHI, Eriko , BLEIWEISS, Amit , MARR, Deborah , WANG, Eugene , DWARAKAPURAM, Saritha , GANAPATHY, Sabareesh
IPC: G06T1/20
Abstract: An apparatus to facilitate processing of a sparse matrix for arbitrary graph data is disclosed. The apparatus includes a graphics processing unit having a data management unit (DMU) that includes a scheduler for scheduling matrix operations, an active logic for tracking active input operands, and a skip logic for tracking unimportant input operands to be skipped by the scheduler. Processing circuitry is coupled to the DMU. The processing circuitry comprises a plurality of processing elements including logic to read operands and a multiplication unit to multiply two or more operands for the arbitrary graph data.
-
Publication No.: EP4443377A2
Publication date: 2024-10-09
Application No.: EP24193067.6
Filing date: 2018-11-29
Applicant: INTEL Corporation
Inventor: NURVITADHI, Eriko , BLEIWEISS, Amit , MARR, Deborah , WANG, Eugene , DWARAKAPURAM, Saritha , GANAPATHY, Sabareesh
IPC: G06T1/20
CPC classification number: G06N3/063 , G06N3/084 , G06T1/20 , G06F7/483 , G06N3/047 , G06N3/044 , G06N3/045
Abstract: An apparatus to facilitate processing of a sparse matrix for arbitrary graph data is disclosed. The apparatus includes a graphics processing unit having a data management unit (DMU) that includes a scheduler for scheduling matrix operations, an active logic for tracking active input operands, and a skip logic for tracking unimportant input operands to be skipped by the scheduler. Processing circuitry is coupled to the DMU. The processing circuitry comprises a plurality of processing elements including logic to read operands and a multiplication unit to multiply two or more operands for the arbitrary graph data.
-
Publication No.: EP4359967A1
Publication date: 2024-05-01
Application No.: EP22717046.1
Filing date: 2022-03-15
Applicant: Intel Corporation
Inventor: PARRA, Jorge , FU, Fangwen , MAIYURAN, Subramaniam , GEORGE, Varghese , MACPHERSON, Mike , PAL, Supratim , GURRAM, Chandra , GANAPATHY, Sabareesh , AVANCHA, Sasikanth , VOOTURI, Dharma Teja , MELLEMPUDI, Naveen , DAS, Dipankar
CPC classification number: G06F17/16 , G06F15/8046 , G06F9/3001 , G06F9/30036