-
Publication No.: US20250037347A1
Publication Date: 2025-01-30
Application No.: US18358297
Application Date: 2023-07-25
Applicant: Intel Corporation
Inventor: Jiasheng Chen , Supratim Pal , Kevin Hurd , Jorge E. Parra Osorio , Christopher Spencer , Takashi Nakagawa , Guei-Yuan Lueh , Pradeep K. Golconda , James Valerio , Mukundan Swaminathan , Nicholas Murphy , Clifford Gibson , Li-An Tang , Fangwen Fu , Kaiyu Chen , Buqi Cheng
Abstract: Described herein is a graphics processor comprising an instruction cache and a plurality of processing elements coupled with the instruction cache. The plurality of processing elements include functional units configured to provide an integer pipeline to execute instructions to perform operations on integer data elements. The integer pipeline includes a first multiplier and a second multiplier, the first multiplier and the second multiplier configured to execute operations for a single instruction.
-
Publication No.: US12198222B2
Publication Date: 2025-01-14
Application No.: US18532245
Application Date: 2023-12-07
Applicant: Intel Corporation
Inventor: Abhishek Appu , Subramaniam Maiyuran , Mike Macpherson , Fangwen Fu , Jiasheng Chen , Varghese George , Vasanth Ranganathan , Ashutosh Garg , Joydeep Ray
IPC: G06F17/16 , G06F7/544 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/0806 , G06F15/80 , G06N3/048 , G06N3/08 , G06N3/084 , G06T1/20
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides for data aware sparsity via compressed bitstreams. One embodiment provides for block sparse dot product instructions. One embodiment provides for a depth-wise adapter for a systolic array.
-
Publication No.: US20240220254A1
Publication Date: 2024-07-04
Application No.: US18148997
Application Date: 2022-12-30
Applicant: Intel Corporation
Inventor: Chunhui Mei , Yongsheng Liu , John A. Wiegert , Vasanth Ranganathan , Ben J. Ashbaugh , Fangwen Fu , Hong Jiang , Guei-Yuan Lueh , James Valerio , Alan M. Curtis , Maxim Kazakov
CPC classification number: G06F9/30087 , G06F9/3877 , G06F9/5072 , G06F9/544
Abstract: Data multicast in compute core clusters is described. An example of an apparatus includes one or more processors including at least a first processor, the first processor including one or more clusters of cores and a memory, wherein each cluster of cores includes multiple cores, each core including one or more processing resources, shared memory, and broadcast circuitry; and wherein a first core in a first cluster of cores is to request a data element, determine whether any additional cores in the first cluster require the data element, and, upon determining that one or more additional cores in the first cluster require the data element, broadcast the data element to the one or more additional cores via interconnects between the broadcast circuitry of the cores of the first core cluster.
-
Publication No.: US20240161227A1
Publication Date: 2024-05-16
Application No.: US18532245
Application Date: 2023-12-07
Applicant: Intel Corporation
Inventor: Abhishek Appu , Subramaniam Maiyuran , Mike Macpherson , Fangwen Fu , Jiasheng Chen , Varghese George , Vasanth Ranganathan , Ashutosh Garg , Joydeep Ray
IPC: G06T1/20 , G06F7/544 , G06F9/50 , G06F12/0806 , G06F15/80 , G06F17/16 , G06N3/048 , G06N3/08 , G06N3/084
CPC classification number: G06T1/20 , G06F7/5443 , G06F9/5027 , G06F12/0806 , G06F15/8046 , G06F17/16 , G06N3/048 , G06N3/08 , G06N3/084
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides for data aware sparsity via compressed bitstreams. One embodiment provides for block sparse dot product instructions. One embodiment provides for a depth-wise adapter for a systolic array.
-
Publication No.: US20240111826A1
Publication Date: 2024-04-04
Application No.: US17937252
Application Date: 2022-09-30
Applicant: Intel Corporation
Inventor: Jiasheng Chen , Kevin Hurd , Changwon Rhee , Jorge Parra , Fangwen Fu , Theo Drane , William Zorn , Peter Caday , Gregory Henry , Guei-Yuan Lueh , Farzad Chehrazi , Amit Karande , Turbo Majumder , Xinmin Tian , Milind Girkar , Hong Jiang
CPC classification number: G06F17/16 , G06F7/5443 , G06T1/20
Abstract: An apparatus to facilitate hardware enhancements for double precision systolic support is disclosed. The apparatus includes matrix acceleration hardware having double-precision (DP) matrix multiplication circuitry including multiplier circuits to multiply pairs of input source operands in a DP floating-point format; adders to receive multiplier outputs from the multiplier circuits and accumulate the multiplier outputs in a high precision intermediate format; an accumulator circuit to accumulate adder outputs from the adders with at least one of a third global source operand on a first pass of the DP matrix multiplication circuitry or an intermediate result from the first pass on a second pass of the DP matrix multiplication circuitry, wherein the accumulator circuit is to generate an accumulator output in the high precision intermediate format; and a down conversion and rounding circuit to down convert and round an output of the second pass as a final result in the DP floating-point format.
-
Publication No.: US20240104025A1
Publication Date: 2024-03-28
Application No.: US17951914
Application Date: 2022-09-23
Applicant: Intel Corporation
Inventor: Biju George , Zamshed I. Chowdhury , Prathamesh Raghunath Shinde , Chunhui Mei , Fangwen Fu
IPC: G06F12/123 , G06F12/0862
CPC classification number: G06F12/123 , G06F12/0862 , G06F2212/1021
Abstract: A prefetch-aware LRU cache replacement policy is described. An example of an apparatus includes one or more processors including a graphics processor, the graphics processor including a load store cache having multiple cache lines (CLs), each including bits for a cache line level (CL level) and one or more sectors for data storage; wherein the graphics processor is to receive one or more data elements for storage in the cache; set a CL level to track each CL receiving data, including setting CL level 1 for a CL receiving data in response to a miss in the cache and setting a CL level 2 for a CL receiving prefetched data in response to a prefetch request; and, upon determining that space is required in the cache to store data, apply a cache replacement policy, the policy being based at least in part on set CL levels for the CLs.
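Illustrative sketch (not taken from the patent): the abstract says only that a CL level is set to 1 on a miss fill and to 2 on a prefetch fill, and that replacement depends "at least in part" on those levels. The toy Python model below shows one way such a policy could behave. The class name `LevelAwareLRUCache`, the fully associative layout, the choice to evict level-2 lines first, and the promotion of a prefetched line to level 1 on a demand hit are all assumptions made for illustration.

```python
from collections import OrderedDict

MISS_LEVEL = 1      # CL filled in response to a demand miss (from the abstract)
PREFETCH_LEVEL = 2  # CL filled in response to a prefetch request (from the abstract)


class LevelAwareLRUCache:
    """Toy fully associative cache that tracks a CL level per line (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # address -> CL level; dict order is recency order, oldest first.
        self.lines = OrderedDict()

    def _evict(self):
        # ASSUMPTION: evict the least recently used prefetched (level 2) line
        # if one exists; otherwise fall back to plain LRU.
        victim = next(
            (addr for addr, level in self.lines.items() if level == PREFETCH_LEVEL),
            None,
        )
        if victim is None:
            victim, _ = self.lines.popitem(last=False)
        else:
            del self.lines[victim]
        return victim

    def _fill(self, addr, level):
        evicted = self._evict() if len(self.lines) >= self.capacity else None
        self.lines[addr] = level
        return evicted

    def access(self, addr):
        """Demand access. Returns (hit, evicted_address_or_None)."""
        if addr in self.lines:
            # ASSUMPTION: a demand hit promotes the line to level 1.
            self.lines[addr] = MISS_LEVEL
            self.lines.move_to_end(addr)
            return True, None
        return False, self._fill(addr, MISS_LEVEL)

    def prefetch(self, addr):
        """Prefetch request. Returns the evicted address, or None."""
        if addr in self.lines:
            return None
        return self._fill(addr, PREFETCH_LEVEL)


if __name__ == "__main__":
    cache = LevelAwareLRUCache(capacity=2)
    cache.access(0xA)          # demand miss, filled at level 1
    cache.prefetch(0xB)        # prefetch fill at level 2
    print(cache.access(0xC))   # the unused prefetched line 0xB is the victim
```

Under these assumptions, a prefetched line that is never used is replaced before any demand-fetched line, which limits the damage from inaccurate prefetches. A real design could just as well weight the levels differently.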
-
Publication No.: US20220414054A1
Publication Date: 2022-12-29
Application No.: US17304797
Application Date: 2021-06-25
Applicant: Intel Corporation
Inventor: Jorge Parra , Jiasheng Chen , Supratim Pal , Fangwen Fu , Sabareesh Ganapathy , Chandra Gurram , Chunhui Mei , Yue Qi
Abstract: A processing apparatus described herein includes a general-purpose parallel processing engine comprising a systolic array having multiple pipelines, each of the multiple pipelines including multiple pipeline stages, wherein the multiple pipelines include a first pipeline, a second pipeline, and a common input shared between the first pipeline and the second pipeline.
-
Publication No.: US20220318013A1
Publication Date: 2022-10-06
Application No.: US17212588
Application Date: 2021-03-25
Applicant: Intel Corporation
Inventor: Naveen Mellempudi , Subramaniam Maiyuran , Varghese George , Fangwen Fu , Shuai Mu , Supratim Pal , Wei Xiong
Abstract: An apparatus to facilitate supporting 8-bit floating point format operands in a computing architecture is disclosed. The apparatus includes a processor comprising: a decoder to decode an instruction fetched for execution into a decoded instruction, wherein the decoded instruction is a matrix instruction that operates on 8-bit floating point operands to cause the processor to perform a parallel dot product operation; a controller to schedule the decoded instruction and provide input data for the 8-bit floating point operands in accordance with an 8-bit floating point data format indicated by the decoded instruction; and systolic dot product circuitry to execute the decoded instruction using systolic layers, each systolic layer comprising one or more sets of interconnected multipliers, shifters, and adders, each set of multipliers, shifters, and adders to generate a dot product of the 8-bit floating point operands.
-
Publication No.: US20210374896A1
Publication Date: 2021-12-02
Application No.: US17159708
Application Date: 2021-01-27
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Stanley J. Baran , Sang-Hee Lee , Atthar H. Mohammed , Jong Dae Oh , Hiu-Fai R. Chan , Jill M. Boyce , Fangwen Fu , Satya N. Yedidi , Sumit Mohan , James M. Holland , Keith W. Rowe , Altug Koker
IPC: G06T1/20 , G06T1/60 , G09G5/00 , H04N19/156 , G06F1/3206 , G06F1/3234
Abstract: An embodiment of an electronic processing system may include an application processor, persistent storage media communicatively coupled to the application processor, a graphics subsystem communicatively coupled to the application processor, a power budget analyzer to identify a power budget for one or more of the application processor, the persistent storage media, and the graphics subsystem, a target analyzer communicatively coupled to the graphics subsystem to identify a target for the graphics subsystem, and a parameter adjuster to adjust one or more parameters of the graphics subsystem based on one or more of the identified power budget and the identified target.
-
Publication No.: US11051038B2
Publication Date: 2021-06-29
Application No.: US16707485
Application Date: 2019-12-09
Applicant: Intel Corporation
Inventor: Jill M. Boyce , Sumit Mohan , James M. Holland , Sang-Hee Lee , Abhishek R. Appu , Wen-Fu Kao , Joydeep Ray , Ya-Ti Peng , Keith W. Rowe , Fangwen Fu , Satya N. Yedidi
IPC: H04N19/593 , H04N19/597 , G06T11/00 , H04N19/52 , H04N19/176 , H04N19/105 , H04N19/436 , H04N19/46 , H04N19/136
Abstract: An embodiment of an electronic processing system may include a 2D frame which corresponds to a projection of a 360 video space, and a component predictor to predict an encode component for a first block of a 2D frame based on encode information from a neighboring block which is neighboring to the first block of the 2D frame only in the 360 video space, a prioritizer to prioritize transmission for a second block of the 2D frame based on an identified region of interest, and/or a format detector to detect a 360 video format of the 2D frame based on image content. A 360 video capture device may include a contextual tagger to tag 360 video content with contextual information which is contemporaneous with the captured 360 video content. Other embodiments are disclosed and claimed.