-
Publication No.: WO2020190805A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022843
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: APPU, Abhishek R. , KOKER, Altug , ANANTARAMAN, Aravindh , OULD-AHMED-VALL, ElMoustapha , ANDREI, Valentin , GALOPPO VON BORRIES, Nicolas , GEORGE, Varghese , MACPHERSON, Mike , MAIYURAN, Subramaniam , RAY, Joydeep , STRIRAMASSARMA, Lakshminarayanan , JANUS, Scott , INSKO, Brent , RANGANATHAN, Vasanth , SINHA, Kamal , HUNTER, Arthur , SURTI, Prasoonkumar , PUFFER, David , VALERIO, James , SHAH, Ankur N.
IPC: G06F12/0811 , G06F12/0875 , G06F16/27 , G06F12/0866 , G06F16/2453
Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, an apparatus comprises a cache memory, a high-bandwidth memory, a shader core communicatively coupled to the cache memory and comprising a processing element to decompress a first data element extracted from an in-memory database in the cache memory and having a first bit length to generate a second data element having a second bit length, greater than the first bit length, and an arithmetic logic unit (ALU) to compare the data element to a target value provided in a query of the in-memory database. Other embodiments are also disclosed and claimed.
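The decompress-then-compare step in this abstract can be illustrated with a minimal software sketch. It is not the claimed hardware: the LSB-first packing layout and the names `pack`, `unpack`, and `scan_equal` are assumptions introduced here for illustration only.

```python
# Illustrative sketch only: values stored at a narrow bit length are expanded
# to full-width integers, then compared with a query's target value.
# The LSB-first packing layout and all function names are assumptions.

def pack(values, bit_length):
    """Pack integers into bytes using `bit_length` bits per value."""
    mask = (1 << bit_length) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * bit_length)
    nbytes = (len(values) * bit_length + 7) // 8
    return word.to_bytes(nbytes, "little")

def unpack(packed, bit_length, count):
    """Decompress `count` narrow values into ordinary (wider) integers."""
    word = int.from_bytes(packed, "little")
    mask = (1 << bit_length) - 1
    return [(word >> (i * bit_length)) & mask for i in range(count)]

def scan_equal(packed, bit_length, count, target):
    """Return the positions whose decompressed value equals `target`."""
    return [i for i, v in enumerate(unpack(packed, bit_length, count))
            if v == target]

column = pack([3, 7, 1, 7, 0], bit_length=4)
matches = scan_equal(column, bit_length=4, count=5, target=7)
```

Here `matches` holds the row positions that satisfy the equality predicate of the hypothetical query.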
-
Publication No.: WO2020190810A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022848
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: KOKER, Altug , ASHBAUGH, Ben , JANUS, Scott , ANANTARAMAN, Aravindh , APPU, Abhishek R. , COORAY, Niran , GEORGE, Varghese , HUNTER, Arthur , INSKO, Brent , OULD-AHMED-VALL, ElMoustapha , PANNEER, Selvakumar , RANGANATHAN, Vasanth , RAY, Joydeep , SINHA, Kamal , STRIRAMASSARMA, Lakshminarayanan , SURTI, Prasoonkumar , TANGRI, Saurabh
IPC: G06F12/0804 , G06F12/0893 , G06F15/173
Abstract: Embodiments are generally directed to a multi-tile architecture for graphics operations. An embodiment of an apparatus includes a multi-tile architecture for graphics operations including a multi-tile graphics processor, the multi-tile processor including one or more dies; multiple processor tiles installed on the one or more dies; and a structure to interconnect the processor tiles on the one or more dies, wherein the structure is to enable communications between the processor tiles.
-
Publication No.: WO2020190799A2
Publication Date: 2020-09-24
Application No.: PCT/US2020/022837
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: KOKER, Altug , RAY, Joydeep , ASHBAUGH, Ben , PEARCE, Jonathan , APPU, Abhishek , RANGANATHAN, Vasanth , STRIRAMASSARMA, Lakshminarayanan , OULD-AHMED-VALL, Elmoustapha , ANANTARAMAN, Aravindh , ANDREI, Valentin , GALOPPO VON BORRIES, Nicolas , GEORGE, Varghese , HAREL, Yoav , HUNTER, Arthur Jr. , INSKO, Brent , JANUS, Scott , K, Pattabhiraman , MACPHERSON, Mike , MAIYURAN, Subramaniam , PETRE, Marian Alin , RAMADOSS, Murali , SHAH, Shailesh , SINHA, Kamal , SURTI, Prasoonkumar , VEMULAPALLI, Vikranth
IPC: G06F9/38 , G06F12/0862 , G06F9/30 , G06F12/02 , G06F12/06 , G06F12/0804 , G06F12/0893 , G06F12/12 , G06F12/128 , G06F15/173 , G06F9/50
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
-
Publication No.: WO2020190798A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022836
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: STRIRAMASSARMA, Lakshminarayanan , SURTI, Prasoonkumar , GEORGE, Varghese , ASHBAUGH, Ben , ANANTARAMAN, Aravindh , ANDREI, Valentin , APPU, Abhishek , GALOPPO VON BORRIES, Nicolas , KOKER, Altug , MACPHERSON, Mike , MAIYURAN, Subramaniam , MISTRY, Nilay , OULD-AHMED-VALL, Elmoustapha , PANNEER, Selvakumar , RANGANATHAN, Vasanth , RAY, Joydeep , SHAH, Ankur , TANGRI, Saurabh
IPC: G06F9/38 , G06F12/0862 , G06F9/30
Abstract: Multi-tile Memory Management for Detecting Cross Tile Access, Providing Multi-Tile Inference Scaling with multicasting of data via copy operation, and Providing Page Migration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) having a memory and a cross-GPU fabric to communicatively couple the first and second GPUs. The memory controller is configured to determine whether frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU in the multi-GPU configuration and to send a message to initiate a data transfer mechanism when frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU.
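The detection of frequent cross-tile accesses might be modeled in software as follows. This is a rough sketch under stated assumptions: the abstract does not define "frequent", so the per-page counter and fixed `threshold` are assumptions, and the class name `CrossTileMonitor` and the message tuple format are hypothetical.

```python
# Illustrative sketch only: count accesses from a local tile to pages owned by
# a remote tile and emit one "migrate" message when a page becomes hot.
# The fixed threshold and the message format are assumptions.
from collections import Counter

class CrossTileMonitor:
    def __init__(self, local_tile, threshold):
        self.local_tile = local_tile
        self.threshold = threshold
        self.remote_hits = Counter()   # page -> number of cross-tile accesses
        self.messages = []             # messages that would start a transfer

    def record_access(self, page, owner_tile):
        """Record one access; queue a message once the threshold is reached."""
        if owner_tile == self.local_tile:
            return  # local access, nothing to track
        self.remote_hits[page] += 1
        if self.remote_hits[page] == self.threshold:
            self.messages.append(
                ("migrate", page, owner_tile, self.local_tile))

monitor = CrossTileMonitor(local_tile=0, threshold=3)
monitor.record_access(page=5, owner_tile=0)      # local, ignored
for _ in range(4):
    monitor.record_access(page=7, owner_tile=1)  # remote, counted
```

Comparing with `==` rather than `>=` means each hot page produces a single message, even if accesses continue while the transfer is pending.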
-
Publication No.: WO2020190369A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/014766
Filing Date: 2020-01-23
Applicant: INTEL CORPORATION
Inventor: MATAM, Naveen , CHENEY, Lance , FINLEY, Eric , GEORGE, Varghese , JAHAGIRDAR, Sanjeev , KOKER, Altug , MASTRONARDE, Josh , RAJWANI, Iqbal , STRIRAMASSARMA, Lakshminarayanan , TESHOME, Melaku , VEMULAPALLI, Vikranth , XAVIER, Binoj
Abstract: Embodiments described herein provide techniques to disaggregate an architecture of a system on a chip integrated circuit into multiple distinct chiplets that can be packaged onto a common chassis. In one embodiment, a graphics processing unit or parallel processor is composed from diverse silicon chiplets that are separately manufactured. A chiplet is an at least partially packaged integrated circuit that includes distinct units of logic that can be assembled with other chiplets into a larger package. A diverse set of chiplets with different IP core logic can be assembled into a single device.
-
Publication No.: WO2020190812A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022850
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: RANGANATHAN, Vasanth , APPU, Abhishek R. , ASHBAUGH, Ben , DOYLE, Peter , FLIFLET, Brandon , HUNTER, Arthur , INSKO, Brent , JANUS, Scott , KOKER, Altug , NAVALE, Aditya , RAY, Joydeep , SINHA, Kamal , STRIRAMASSARMA, Lakshminarayanan , SURTI, Prasoonkumar , VALERIO, James
IPC: G06F9/50
Abstract: Embodiments are generally directed to compute optimization in graphics processing. An embodiment of an apparatus includes one or more processors including a multi-tile graphics processing unit (GPU) to process data, the multi-tile GPU including multiple processor tiles; and a memory for storage of data for processing, wherein the apparatus is to receive compute work for processing by the GPU, partition the compute work into multiple work units, assign each of multiple work units to one of the processor tiles, and process the compute work using the processor tiles assigned to the work units.
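The partition, assign, and process sequence described in this abstract can be sketched as below. The fixed-size work units, the round-robin assignment policy, and the function names are assumptions for illustration; the abstract does not specify how work is split or mapped to tiles.

```python
# Illustrative sketch only: split compute work into units, assign each unit to
# a tile, and process the units per tile. Fixed-size units and round-robin
# assignment are assumptions, not details taken from the abstract.

def partition(work, unit_size):
    """Split a list of work items into consecutive units of `unit_size`."""
    return [work[i:i + unit_size] for i in range(0, len(work), unit_size)]

def assign(units, num_tiles):
    """Assign work units to tiles in round-robin order."""
    assignment = {tile: [] for tile in range(num_tiles)}
    for index, unit in enumerate(units):
        assignment[index % num_tiles].append(unit)
    return assignment

def process(assignment, kernel):
    """Apply `kernel` to every unit; returns per-tile lists of results."""
    return {tile: [kernel(unit) for unit in units]
            for tile, units in assignment.items()}

units = partition(list(range(10)), unit_size=3)
assignment = assign(units, num_tiles=2)
results = process(assignment, kernel=sum)
```

In this sequential model each tile's units run one after another; on real hardware the tiles would run concurrently.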
-
Publication No.: WO2020190808A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022846
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: RAY, Joydeep , JANUS, Scott , GEORGE, Varghese , MAIYURAN, Subramaniam , KOKER, Altug , APPU, Abhishek , SURTI, Prasoonkumar , RANGANATHAN, Vasanth , ANDREI, Valentin , GARG, Ashutosh , HAREL, Yoav , HUNTER, JR., Arthur , KIM, SungYe , MACPHERSON, Mike , OULD-AHMED-VALL, Elmoustapha , SADLER, William , STRIRAMASSARMA, Lakshminarayanan , VEMULAPALLI, Vikranth
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to skip computational operations for zero-filled matrices and sub-matrices. Embodiments additionally provide techniques to maintain data compression through to a processing unit. Embodiments additionally provide an architecture for a sparse-aware logic unit.
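Skipping computation for zero-filled sub-matrices can be shown with a small blocked matrix multiply. This is a software sketch, not the systolic hardware: it assumes square matrices whose size is a multiple of the block size, and the names `is_zero_block` and `blocked_matmul` are hypothetical.

```python
# Illustrative sketch only: blocked matrix multiply that skips any block
# product in which either operand block is entirely zero. Assumes square
# n x n matrices with n divisible by the block size `bs`.

def is_zero_block(m, r0, c0, bs):
    """True if the bs x bs block of `m` starting at (r0, c0) is all zeros."""
    return all(m[r][c] == 0
               for r in range(r0, r0 + bs) for c in range(c0, c0 + bs))

def blocked_matmul(a, b, bs):
    """Return (a @ b, number of block products skipped)."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    skipped = 0
    for i in range(0, n, bs):
        for k in range(0, n, bs):
            if is_zero_block(a, i, k, bs):
                skipped += n // bs  # every product using this block is zero
                continue
            for j in range(0, n, bs):
                if is_zero_block(b, k, j, bs):
                    skipped += 1
                    continue
                for r in range(bs):
                    for c in range(bs):
                        out[i + r][j + c] += sum(
                            a[i + r][k + t] * b[k + t][j + c]
                            for t in range(bs))
    return out, skipped

a = [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
product, skipped = blocked_matmul(a, identity, bs=2)
```

With these inputs only one of the eight 2x2 block products is actually computed; the rest are skipped because an operand block is zero.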
-
Publication No.: WO2020190776A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022763
Filing Date: 2020-03-13
Applicant: INTEL CORPORATION
Inventor: JANUS, Scott , KRISHNAN, Vidhya , NEMIROFF, Daniel , KUMAR, Gaurav , HUNTER, Arthur , INSKO, Brent , RANGANATHAN, Vasanth , SINHA, Kamal , STRIRAMASSARMA, Lakshminarayanan , SURTI, Prasoonkumar
IPC: G06F12/0811 , G06F12/0875 , G06F12/14 , G06F21/10 , H04L29/06
Abstract: An apparatus to facilitate synchronizing encrypted workloads across multiple graphics processing units is disclosed. The apparatus includes a memory and one or more processors of the plurality of GPUs, the one or more processors communicably coupled to the memory. The one or more processors are to receive a license associated with the encrypted workload, the license comprising a private content key corresponding to a secure compute application generating the encrypted workload, encrypt the private content key with a first key to generate a session key, the first key shared among graphic security controllers (GSCs) of the plurality of GPUs, and inject the session key into a region of the memory that is shared among the plurality of GPUs.
-
Publication No.: WO2020190370A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/014770
Filing Date: 2020-01-23
Applicant: INTEL CORPORATION
Inventor: KOKER, Altug , CHENEY, Lance , FINLEY, Eric , GEORGE, Varghese , JAHAGIRDAR, Sanjeev , MASTRONARDE, Josh , MATAM, Naveen , RAJWANI, Iqbal , STRIRAMASSARMA, Lakshminarayanan , TESHOME, Melaku , VEMULAPALLI, Vikranth , XAVIER, Binoj
IPC: G06T1/20
Abstract: A disaggregated processor package can be configured to accept interchangeable chiplets. Interchangeability is enabled by specifying a standard physical interconnect for chiplets that can enable the chiplet to interface with a fabric or bridge interconnect. Chiplets from different IP designers can conform to the common interconnect, enabling such chiplets to be interchangeable during assembly. The fabric and bridge interconnect logic on the chiplet can then be configured to conform with the actual interconnect layout of the onboard logic of the chiplet. Additionally, data from chiplets can be transmitted across an inter-chiplet fabric using encapsulation, such that the actual data being transferred is opaque to the fabric, further enabling interchangeability of the individual chiplets. With such an interchangeable design, higher or lower density memory can be inserted into memory chiplet slots, while compute or graphics chiplets with a higher or lower core count can be inserted into logic chiplet slots.
-
Publication No.: WO2020190806A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022844
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: KOKER, Altug , GEORGE, Varghese , ANANTARAMAN, Aravindh , ANDREI, Valentin , APPU, Abhishek R. , COORAY, Niran , GALOPPO VON BORRIES, Nicolas , MACPHERSON, Mike , MAIYURAN, Subramaniam , OULD-AHMED-VALL, ElMoustapha , PUFFER, David , RANGANATHAN, Vasanth , RAY, Joydeep , SHAH, Ankur N. , STRIRAMASSARMA, Lakshminarayanan , SURTI, Prasoonkumar , TANGRI, Saurabh
IPC: G06F9/38 , G06F12/0862 , G06F9/30
Abstract: Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
-