-
Publication number: US20250110733A1
Publication date: 2025-04-03
Application number: US18477865
Filing date: 2023-09-29
Applicant: Intel Corporation
Inventor: Jorge Eduardo Parra Osorio, Fangwen Fu, Guei-Yuan Lueh, Jiasheng Chen, Naveen K. Mellempudi, Kevin Hurd, Alexandre Hadj-Chaib, Elliot Taylor, Marius Cornea-Hasegan
Abstract: An apparatus to facilitate conversion operations and special value use cases supporting 8-bit floating point format in a graphics architecture is disclosed. The apparatus includes a processor comprising a decoder to decode an instruction fetched for execution into a decoded instruction, wherein the decoded instruction is to cause the processor to perform a conversion operation corresponding to an 8-bit floating point format operand; a scheduler to schedule the decoded instruction and provide input data for an input operand of the conversion operation indicated by the decoded instruction; and conversion circuitry to execute the decoded instruction to perform the conversion operation to convert the input operand to an output operand in accordance with the 8-bit floating point format operand, the conversion circuitry comprising hardware circuitry to rescale, normalize, and convert the input operand to the output operand.
-
Publication number: US12189571B2
Publication date: 2025-01-07
Application number: US17304797
Filing date: 2021-06-25
Applicant: Intel Corporation
Inventor: Jorge Parra, Jiasheng Chen, Supratim Pal, Fangwen Fu, Sabareesh Ganapathy, Chandra Gurram, Chunhui Mei, Yue Qi
Abstract: A processing apparatus described herein includes a general-purpose parallel processing engine comprising a systolic array having multiple pipelines, each of the multiple pipelines including multiple pipeline stages, wherein the multiple pipelines include a first pipeline, a second pipeline, and a common input shared between the first pipeline and the second pipeline.
-
Publication number: US20240169021A1
Publication date: 2024-05-23
Application number: US18056930
Filing date: 2022-11-18
Applicant: Intel Corporation
Inventor: Jorge Eduardo Parra Osorio, Supratim Pal, Fangwen Fu, Guei-Yuan Lueh, Po-Yu Chen, Jiasheng Chen
CPC classification number: G06F17/16, G06F7/5443
Abstract: An apparatus to facilitate enhancements for accumulator usage and instruction forwarding in matrix multiply pipeline in graphics environment is disclosed. The apparatus includes matrix acceleration hardware comprising a plurality of data processing units, wherein the respective plurality of data processing units comprise: multiply-accumulate hardware to generate intermediate results of a matrix multiplication operation; intermediate accumulation hardware to store the intermediate results of the matrix multiplication operation and accumulate with other intermediate results generated by the multiply-accumulate hardware; a bypass data structure to cause a source operand to bypass the multiply-accumulate hardware; and an adder circuit to add an output from the multiply-accumulate hardware with at least one of the source operand or an output of the intermediate accumulation hardware to generate a final output.
-
Publication number: US20240134719A1
Publication date: 2024-04-25
Application number: US17973234
Filing date: 2022-10-24
Applicant: Intel Corporation
Inventor: Fangwen Fu, Chunhui Mei, John A. Wiegert, Yongsheng Liu, Ben J. Ashbaugh
CPC classification number: G06F9/522, G06F9/4881
Abstract: Embodiments described herein provide a technique to facilitate the synchronization of workgroups executed on multiple graphics cores of a graphics core cluster. One embodiment provides a graphics core including a cache memory and a graphics core coupled with the cache memory. The graphics core includes execution resources to execute an instruction via a plurality of hardware threads and barrier circuitry to synchronize execution of the plurality of hardware threads, wherein the barrier circuitry is configured to provide a plurality of re-usable named barriers.
-
Publication number: US20230289399A1
Publication date: 2023-09-14
Application number: US18163418
Filing date: 2023-02-02
Applicant: Intel Corporation
Inventor: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data; one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, and retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters; and a register file including a plurality of registers, wherein the one or more blocks of the matrix data are stored within a first set of the plurality of registers.
-
Publication number: US20210103550A1
Publication date: 2021-04-08
Application number: US17122905
Filing date: 2020-12-15
Applicant: Intel Corporation
Inventor: Abhishek Appu, Subramaniam Maiyuran, Mike Macpherson, Fangwen Fu, Jiasheng Chen, Varghese George, Vasanth Ranganathan, Ashutosh Garg, Joydeep Ray
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides for data aware sparsity via compressed bitstreams. One embodiment provides for block sparse dot product instructions. One embodiment provides for a depth-wise adapter for a systolic array.
-
Publication number: US20200288152A1
Publication date: 2020-09-10
Application number: US16647998
Filing date: 2017-12-05
Applicant: INTEL CORPORATION
Inventor: James Holland, Hiu-Fai Chan, Fangwen Fu, Qian Xu, Sang-Hee Lee, Vidhya Krishnan
IPC: H04N19/182, H04N19/423
Abstract: A lossless pixel compressor may include technology to detect a format of a pixel memory region, and compress the pixel memory region together with embedded control information which indicates the detected format of the pixel memory region. Other embodiments are disclosed and claimed.
-
Publication number: US10715818B2
Publication date: 2020-07-14
Application number: US15483146
Filing date: 2017-04-10
Applicant: INTEL CORPORATION
Inventor: James M. Holland, Fangwen Fu, Satya N. Yedidi, Srinivasan Embar Raghukrishnan
IPC: H04N19/176, H04N19/523, H04N19/146, H04N19/103, H04N19/43
Abstract: An apparatus for video encoding is described herein. The apparatus includes an encoder and a hardware bit packing unit. The encoder comprises at least fixed function dual hierarchical motion estimation search units, dual integer motion estimation search units, and a fractional motion estimation search unit. Moreover, the hardware bit packing unit is to pack bits as coded according to the final macroblock coding decision into a data format.
-
Publication number: US10554977B2
Publication date: 2020-02-04
Application number: US15859100
Filing date: 2017-12-29
Applicant: INTEL CORPORATION
Inventor: Fangwen Fu, Iole Moccagatta
IPC: H04N19/13, H04N19/176, H04N19/182
Abstract: A method, system, and articles of high throughput arithmetic entropy coding for video coding uses a non-framewidth raster order or non-raster order to form spatial neighbor probability contexts for entropy coding.
-
Publication number: US10542279B2
Publication date: 2020-01-21
Application number: US15714808
Filing date: 2017-09-25
Applicant: Intel Corporation
Inventor: Fangwen Fu, Jill M. Boyce
IPC: H04N19/52, H04N19/105, H04N19/70, H04N19/82
Abstract: Temporal motion vector prediction control is described in video coding. In one example, a method includes receiving a plurality of frames representing encoded video, parsing an uncompressed header for each frame, determining whether a temporal motion vector prediction command is included within the parsed uncompressed header of a first frame, selecting a reference frame from a reference list of frames, retrieving motion vector information from the selected reference frame, performing temporal motion vector prediction on the first frame corresponding to the parsed uncompressed header if a temporal motion vector prediction command is included within the parsed header to form a motion predicted frame, applying a loop filter to the motion predicted frame, and rendering the frame as decoded video.