-
Publication No.: US11221848B2
Publication Date: 2022-01-11
Application No.: US16582406
Filing Date: 2019-09-25
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Varghese George , Joydeep Ray , Ashutosh Garg , Jorge Parra , Shubh Shah , Shubra Marwaha
Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a shared local memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive an instruction to initiate a matrix multiplication operation, write a first set of matrix data into a first set of registers, and share the first set of matrix data between the first processing resource and the second processing resource for use in the matrix multiplication operation. Other embodiments may be described and claimed.
-
Publication No.: US11221762B2
Publication Date: 2022-01-11
Application No.: US16274866
Filing Date: 2019-02-13
Applicant: Intel Corporation
Inventor: Joydeep Ray , Varghese George , Inder M. Sodhi , Jeffrey R. Wilcox
IPC: G06F3/06 , G06F12/02 , G06F12/06 , G11C5/04 , G06F1/3287 , G06F12/0811 , G06F15/78
Abstract: A processor includes a first memory interface to be coupled to a plurality of memory module sockets located off-package, a second memory interface to be coupled to a non-volatile memory (NVM) socket located off-package, and a multi-level memory controller (MLMC). The MLMC is to: control the memory modules disposed in the plurality of memory module sockets as main memory in a one-level memory (1LM) configuration; detect a switch from a 1LM mode of operation to a two-level memory (2LM) mode of operation in response to a basic input/output system (BIOS) detection of a low-power memory module disposed in one of the memory module sockets and a NVM device disposed in the NVM socket in a 2LM configuration; and control the low-power memory module as cache in the 2LM configuration in response to detection of the switch from the 1LM mode of operation to the 2LM mode of operation.
-
Publication No.: US11204977B2
Publication Date: 2021-12-21
Application No.: US16913800
Filing Date: 2020-06-26
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Jorge Parra , Supratim Pal , Ashutosh Garg , Shubra Marwaha , Chandra Gurram , Darin Starkey , Durgesh Borkar , Varghese George
Abstract: Described herein is an accelerator device including a host interface, a fabric interconnect coupled with the host interface, and one or more hardware tiles coupled with the fabric interconnect, the one or more hardware tiles including sparse matrix multiply acceleration hardware including a systolic array with feedback inputs.
-
Publication No.: US11036545B2
Publication Date: 2021-06-15
Application No.: US16355565
Filing Date: 2019-03-15
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Varghese George , Altug Koker , Aravindh Anantaraman , SungYe Kim , Valentin Andrei , Joydeep Ray
IPC: G06F9/38 , G06F9/30 , G06F9/50 , G06F9/48 , G06F12/0837
Abstract: Accelerated synchronization operations using fine grain dependency check are disclosed. A graphics multiprocessor includes a plurality of execution units and synchronization circuitry that is configured to determine availability of at least one execution unit. The synchronization circuitry is to perform a fine grain dependency check of availability of dependent data or operands in shared local memory or cache when at least one execution unit is available.
-
Publication No.: US20210103550A1
Publication Date: 2021-04-08
Application No.: US17122905
Filing Date: 2020-12-15
Applicant: Intel Corporation
Inventor: Abhishek Appu , Subramaniam Maiyuran , Mike Macpherson , Fangwen Fu , Jiasheng Chen , Varghese George , Vasanth Ranganathan , Ashutosh Garg , Joydeep Ray
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides for data aware sparsity via compressed bitstreams. One embodiment provides for block sparse dot product instructions. One embodiment provides for a depth-wise adapter for a systolic array.
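As a rough illustration of arithmetic on sparse data, the sketch below compresses a vector into a bitmask plus its nonzero values and computes a dot product that skips the zero positions. The bitmask encoding and the names `compress` and `sparse_dot` are assumptions made for this example; the abstract does not specify the bitstream format or the instruction semantics.

```python
def compress(values):
    """Encode a vector as (bitmask, nonzero values); bit i is set when values[i] != 0.

    Hypothetical encoding chosen for illustration only.
    """
    mask = 0
    nonzero = []
    for i, v in enumerate(values):
        if v != 0:
            mask |= 1 << i
            nonzero.append(v)
    return mask, nonzero


def sparse_dot(mask, nonzero, dense):
    """Dot product of a compressed sparse vector with a dense vector.

    Only positions flagged in the mask contribute a multiply-accumulate.
    """
    total = 0
    k = 0  # index into the packed nonzero values
    for i in range(len(dense)):
        if (mask >> i) & 1:
            total += nonzero[k] * dense[i]
            k += 1
    return total


mask, packed = compress([0, 3, 0, 2])
result = sparse_dot(mask, packed, [5, 6, 7, 8])
```

In this sketch, two of the four multiplications are skipped because the mask marks those positions as zero.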
-
Publication No.: US10970808B2
Publication Date: 2021-04-06
Application No.: US16449545
Filing Date: 2019-06-24
Applicant: Intel Corporation
Inventor: Joydeep Ray , Subramaniam Maiyuran , Varghese George , Vivek Kumar Ilanchelian
Abstract: A general-purpose graphics processor comprising a first set of compute units, a second set of compute units, and a memory coupled with the first set of compute units and the second set of compute units is described. The memory is configured to merge a first read request to an address block of the memory with a second read request to the address block of the memory to reduce a number of memory accesses to a memory bank associated with the address block. The graphics processor can also include a memory arbiter that can multicast merged reads to the compute units associated with the merged reads.
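The read-merging idea can be sketched in software as follows. This is a minimal model, assuming a 64-byte address block and string identifiers for compute units; `merge_reads` and both of those choices are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def merge_reads(requests, block_size=64):
    """Group read requests by address block.

    requests: list of (compute_unit_id, byte_address) pairs.
    Returns a dict mapping block index -> list of requesting compute units.
    Each key stands for one memory access whose result would be
    multicast to every unit in the list.
    """
    by_block = defaultdict(list)
    for unit, addr in requests:
        by_block[addr // block_size].append(unit)
    return dict(by_block)


requests = [("cu0", 0), ("cu1", 32), ("cu2", 64), ("cu0", 100)]
merged = merge_reads(requests)
```

Here four read requests collapse into two block accesses: block 0 is delivered to `cu0` and `cu1`, and block 1 to `cu2` and `cu0`.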
-
Publication No.: US20210072955A1
Publication Date: 2021-03-11
Application No.: US16562979
Filing Date: 2019-09-06
Applicant: Intel Corporation
Inventor: Naveen MELLEMPUDI , Dipankar DAS , Chunhui MEI , Kristopher WONG , Dhiraj D. KALAMKAR , Hong H. JIANG , Subramaniam Maiyuran , Varghese George
Abstract: An apparatus to facilitate a computer number format conversion is disclosed. The apparatus comprises a control unit to receive data format information indicating a first precision data format in which input data is to be received, and converter hardware to receive the input data and convert the first precision data format to a second precision data format based on the data format information.
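A software analogue of format-driven conversion might look like the sketch below, which uses Python's standard `struct` module to repack IEEE 754 values from one precision to another. The format names in `FORMATS` and the function `convert` are assumptions for illustration; the abstract does not say which formats the hardware supports.

```python
import struct

# Hypothetical format names mapped to struct type codes (IEEE 754 binary16/32/64).
FORMATS = {"fp16": "e", "fp32": "f", "fp64": "d"}


def convert(raw: bytes, src: str, dst: str) -> bytes:
    """Reinterpret little-endian `raw` as `src`-format values and repack them as `dst`.

    `src` plays the role of the data format information: it tells the
    converter how to decode the incoming bytes.
    """
    src_code, dst_code = FORMATS[src], FORMATS[dst]
    count = len(raw) // struct.calcsize("<" + src_code)
    values = struct.unpack(f"<{count}{src_code}", raw)
    return struct.pack(f"<{count}{dst_code}", *values)


fp32_bytes = struct.pack("<2f", 1.5, -2.0)
fp16_bytes = convert(fp32_bytes, "fp32", "fp16")
```

Both sample values are exactly representable in half precision, so the conversion is lossless here; in general, narrowing loses precision.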
-
Publication No.: US10861225B2
Publication Date: 2020-12-08
Application No.: US16234463
Filing Date: 2018-12-27
Applicant: Intel Corporation
Inventor: Jill Boyce , Soethiha Soe , Selvakumar Panneer , Adam Lake , Nilesh Jain , Deepak Vembar , Glen J. Anderson , Varghese George , Carl Marshall , Scott Janus , Saurabh Tangri , Karthik Veeramani , Prasoonkumar Surti
Abstract: Embodiments are directed to neural network processing for multi-object three-dimensional (3D) modeling. An embodiment of a computer-readable storage medium includes executable computer program instructions for obtaining data from multiple cameras, the data including multiple images, and generating a 3D model for 3D imaging based at least in part on the data from the cameras, wherein generating the 3D model includes one or more of performing processing with a first neural network to determine temporal direction based at least in part on motion of one or more objects identified in an image of the multiple images or performing processing with a second neural network to determine semantic content information for an image of the multiple images.
-
Publication No.: US20200294181A1
Publication Date: 2020-09-17
Application No.: US16355377
Filing Date: 2019-03-15
Applicant: Intel Corporation
Inventor: Naveen Matam , Lance Cheney , Eric Finley , Varghese George , Sanjeev Jahagirdar , Altug Koker , Josh Mastronarde , Iqbal Rajwani , Lakshminarayanan Striramassarma , Melaku Teshome , Vikranth Vemulapalli , Binoj Xavier
Abstract: Embodiments described herein provide techniques to disaggregate an architecture of a system on a chip integrated circuit into multiple distinct chiplets that can be packaged onto a common chassis. In one embodiment, a graphics processing unit or parallel processor is composed from diverse silicon chiplets that are separately manufactured. A chiplet is an at least partially packaged integrated circuit that includes distinct units of logic that can be assembled with other chiplets into a larger package. A diverse set of chiplets with different IP core logic can be assembled into a single device.
-
Publication No.: US20250156222A1
Publication Date: 2025-05-15
Application No.: US19009093
Filing Date: 2025-01-03
Applicant: Intel Corporation
Inventor: Valentin Andrei , Subramaniam Maiyuran , SungYe Kim , Varghese George , Altug Koker , Aravindh Anantaraman
Abstract: Apparatuses to synchronize lanes that diverge or threads that drift are disclosed. In one embodiment, a graphics multiprocessor includes a queue having an initial state of groups with a first group having threads of first and second instruction types and a second group having threads of the first and second instruction types. A regroup engine (or regroup circuitry) regroups threads into a third group having threads of the first instruction type and a fourth group having threads of the second instruction type.
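The regrouping step described above can be modeled as a simple partition by instruction type. In this sketch, threads are `(thread_id, instruction_type)` tuples and `regroup` is a hypothetical helper; the actual queue layout and regroup circuitry are not specified in the abstract.

```python
def regroup(groups):
    """Regroup threads so each output group holds a single instruction type.

    groups: list of groups, each a list of (thread_id, instruction_type).
    Returns a dict mapping instruction type -> list of thread ids,
    preserving the order in which threads appear in the queue.
    """
    by_type = {}
    for group in groups:
        for thread_id, instr_type in group:
            by_type.setdefault(instr_type, []).append(thread_id)
    return by_type


# Initial state: two groups, each mixing instruction types "A" and "B".
queue = [[(0, "A"), (1, "B")], [(2, "A"), (3, "B")]]
regrouped = regroup(queue)
```

After regrouping, one group holds only type "A" threads (0 and 2) and the other only type "B" threads (1 and 3), so each group can execute without divergence.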