-
Publication No.: WO2020190797A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022835
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: KOKER, Altug , RAY, Joydeep , ANANTARAMAN, Aravindh , ANDREI, Valentin , APPU, Abhishek , COLEMAN, Sean , GALOPPO VON BORRIES, Nicolas , GEORGE, Varghese , K, Pattabhiraman , KIM, SungYe , MACPHERSON, Mike , MAIYURAN, Subramaniam , OULD-AHMED-VALL, Elmoustapha , RANGANATHAN, Vasanth , VALERIO, James
IPC: G06F12/0811 , G06F12/0875
Abstract: Systems and methods for updating remote memory side caches in a multi-GPU configuration are disclosed herein. A graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) (2810) having a first memory (2870-1), a first memory side cache memory (2880-1), a first communication fabric (2860-1), and a first memory management unit (MMU) (2855-1). The graphics processor includes a second GPU (2820) having a second memory (2870-2), a second memory side cache memory (2880-2), a second MMU (2855-2), and a second communication fabric (2860-2) that is communicatively coupled to the first communication fabric. The first MMU is configured to control memory requests for the first memory, to update content in the first memory, to update content in the first memory side cache memory, and to determine whether to update the content in the second memory side cache memory.
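The write path described in this abstract can be illustrated with a minimal sketch. The `Tile` class, the `write` function, and in particular the policy "update the remote memory side cache only if it already holds a copy of the address" are assumptions made for illustration; the abstract says only that the first MMU determines whether to update the second memory side cache, not how.

```python
class Tile:
    """Hypothetical stand-in for one GPU: local memory plus a memory side cache."""

    def __init__(self):
        self.memory = {}      # address -> value
        self.side_cache = {}  # address -> cached value


def write(local, remote, addr, value):
    """Update local memory and local side cache, then decide on the remote cache.

    Returns True if the remote memory side cache was updated as well.
    """
    local.memory[addr] = value
    local.side_cache[addr] = value
    # Assumed policy: touch the remote cache only when it holds this address,
    # so that it never serves a stale copy.
    if addr in remote.side_cache:
        remote.side_cache[addr] = value
        return True
    return False
```

In this sketch a write to an address that the remote tile has never cached stays local, which avoids traffic across the communication fabric.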
-
Publication No.: WO2020190812A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022850
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: RANGANATHAN, Vasanth , APPU, Abhishek R. , ASHBAUGH, Ben , DOYLE, Peter , FLIFLET, Brandon , HUNTER, Arthur , INSKO, Brent , JANUS, Scott , KOKER, Altug , NAVALE, Aditya , RAY, Joydeep , SINHA, Kamal , STRIRAMASSARMA, Lakshminarayanan , SURTI, Prasoonkumar , VALERIO, James
IPC: G06F9/50
Abstract: Embodiments are generally directed to compute optimization in graphics processing. An embodiment of an apparatus includes one or more processors including a multi-tile graphics processing unit (GPU) to process data, the multi-tile GPU including multiple processor tiles; and a memory for storage of data for processing, wherein the apparatus is to receive compute work for processing by the GPU, partition the compute work into multiple work units, assign each of multiple work units to one of the processor tiles, and process the compute work using the processor tiles assigned to the work units.
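The partition, assign, and process steps named in this abstract can be sketched as follows. The function names, the fixed work-unit size, the round-robin assignment, and the sum-of-squares workload are all assumptions for illustration; the abstract does not specify a partitioning or assignment policy.

```python
from typing import Dict, List, Sequence


def partition_work(items: Sequence[int], unit_size: int) -> List[List[int]]:
    """Split the compute work into contiguous work units of at most unit_size items."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    return [list(items[i:i + unit_size]) for i in range(0, len(items), unit_size)]


def assign_units(num_units: int, num_tiles: int) -> Dict[int, List[int]]:
    """Map each tile id to the work-unit indices it owns (round-robin, an assumption)."""
    if num_tiles <= 0:
        raise ValueError("num_tiles must be positive")
    assignment: Dict[int, List[int]] = {tile: [] for tile in range(num_tiles)}
    for unit in range(num_units):
        assignment[unit % num_tiles].append(unit)
    return assignment


def process(items: Sequence[int], unit_size: int, num_tiles: int) -> int:
    """Each simulated tile sums the squares of its own units; partial sums are combined."""
    units = partition_work(items, unit_size)
    assignment = assign_units(len(units), num_tiles)
    partials = {
        tile: sum(x * x for u in unit_ids for x in units[u])
        for tile, unit_ids in assignment.items()
    }
    return sum(partials.values())
```

The tiles are simulated sequentially here; the result is independent of the number of tiles, which is the property a real partitioning scheme would also need to preserve.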
-
Publication No.: WO2020190805A1
Publication Date: 2020-09-24
Application No.: PCT/US2020/022843
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: APPU, Abhishek R. , KOKER, Altug , ANANTARAMAN, Aravindh , OULD-AHMED-VALL, ElMoustapha , ANDREI, Valentin , GALOPPO VON BORRIES, Nicolas , GEORGE, Varghese , MACPHERSON, Mike , MAIYURAN, Subramaniam , RAY, Joydeep , STRIRAMASSARMA, Lakshminarayanan , JANUS, Scott , INSKO, Brent , RANGANATHAN, Vasanth , SINHA, Kamal , HUNTER, Arthur , SURTI, Prasoonkumar , PUFFER, David , VALERIO, James , SHAH, Ankur N.
IPC: G06F12/0811 , G06F12/0875 , G06F16/27 , G06F12/0866 , G06F16/2453
Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, an apparatus comprises a cache memory, a high-bandwidth memory, a shader core communicatively coupled to the cache memory and comprising a processing element to decompress a first data element extracted from an in-memory database in the cache memory and having a first bit length to generate a second data element having a second bit length, greater than the first bit length, and an arithmetic logic unit (ALU) to compare the data element to a target value provided in a query of the in-memory database. Other embodiments are also disclosed and claimed.
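The decompress-then-compare step in this abstract resembles a scan over a bit-packed column. The sketch below assumes a simple fixed-width packing (least significant element first) and uses plain Python integers to stand in for the wider second bit length; the names `pack`, `unpack`, and `scan_equal` are hypothetical and not taken from the patent.

```python
from typing import List, Sequence


def pack(values: Sequence[int], bits: int) -> int:
    """Pack non-negative ints, each fitting in `bits` bits, into one integer."""
    word = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << bits):
            raise ValueError(f"value {v} does not fit in {bits} bits")
        word |= v << (i * bits)
    return word


def unpack(word: int, bits: int, count: int) -> List[int]:
    """Decompress `count` elements of `bits` bits each into full-width integers."""
    mask = (1 << bits) - 1
    return [(word >> (i * bits)) & mask for i in range(count)]


def scan_equal(word: int, bits: int, count: int, target: int) -> List[int]:
    """Return the positions whose decompressed value equals the query's target value."""
    return [i for i, v in enumerate(unpack(word, bits, count)) if v == target]
```

For example, packing the column `[3, 7, 1, 7, 0]` at 3 bits per element and scanning for the target `7` yields positions `[1, 3]`.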
-
Publication No.: EP3920029A1
Publication Date: 2021-12-08
Application No.: EP20208365.5
Filing Date: 2020-11-18
Applicant: INTEL Corporation
Inventor: JIANG, Hong , GANAPATHY, Sabareesh , TIAN, Xinmin , FU, Fangwen , VALERIO, James
IPC: G06F9/52
Abstract: Examples described herein relate to a graphics processing apparatus that includes a memory device and a graphics processing unit (GPU) coupled to the memory device, the GPU can be configured to: execute an instruction thread; determine if a signal barrier is associated with the instruction thread; for a signal barrier associated with the instruction thread, determine if the signal barrier is cleared; and based on the signal barrier being cleared, permit any waiting instruction thread associated with the signal barrier identifier to commence with execution but not permit any waiting thread that is not associated with the signal barrier identifier to commence with execution. In some examples, the signal barrier includes a signal barrier identifier. In some examples, the signal barrier identifier is one of a plurality of values. In some examples, a gateway is used to receive indications of a signal barrier identifier and to selectively clear a signal barrier for a waiting instruction thread associated with the signal barrier identifier based on clearance conditions associated with the signal barrier being met.
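The selective release behaviour described in this abstract can be sketched with a small single-threaded simulation. The `BarrierGateway` class and its clearance condition (a barrier clears once a fixed number of signals has arrived) are assumptions for illustration; the abstract does not state what the clearance conditions are.

```python
from collections import defaultdict
from typing import Dict, List


class BarrierGateway:
    """Hypothetical gateway tracking signal barriers by identifier."""

    def __init__(self, expected: Dict[int, int]):
        # barrier id -> number of signals needed to clear it (assumed condition)
        self._expected = dict(expected)
        self._signals: Dict[int, int] = defaultdict(int)
        self._waiting: Dict[int, List[str]] = defaultdict(list)

    def wait(self, barrier_id: int, thread: str) -> None:
        """Register a thread as waiting on the barrier with this identifier."""
        self._waiting[barrier_id].append(thread)

    def signal(self, barrier_id: int) -> List[str]:
        """Record one signal; return the threads released if the barrier clears."""
        self._signals[barrier_id] += 1
        if self._signals[barrier_id] < self._expected[barrier_id]:
            return []
        self._signals[barrier_id] = 0
        # Only threads waiting on this identifier are released.
        return self._waiting.pop(barrier_id, [])

    def waiting(self, barrier_id: int) -> List[str]:
        """Return the threads still waiting on the given barrier identifier."""
        return list(self._waiting.get(barrier_id, []))
```

Clearing barrier 0 in this sketch leaves any thread waiting on barrier 1 blocked, which mirrors the "but not permit any waiting thread that is not associated with the signal barrier identifier" clause.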
-
Publication No.: EP3938915A1
Publication Date: 2022-01-19
Application No.: EP20719251.9
Filing Date: 2020-03-14
Applicant: INTEL Corporation
Inventor: APPU, Abhishek R. , KOKER, Altug , ANANTARAMAN, Aravindh , OULD-AHMED-VALL, ElMoustapha , ANDREI, Valentin , GALOPPO VON BORRIES, Nicolas , GEORGE, Varghese , MACPHERSON, Mike , MAIYURAN, Subramaniam , RAY, Joydeep , STRIRAMASSARMA, Lakshminarayanan , JANUS, Scott , INSKO, Brent , RANGANATHAN, Vasanth , SINHA, Kamal , HUNTER, Arthur , SURTI, Prasoonkumar , PUFFER, David , VALERIO, James , SHAH, Ankur N.
IPC: G06F12/0811 , G06F12/0875 , G06F16/27 , G06F12/0866 , G06F16/2453
-
Publication No.: EP3591519A1
Publication Date: 2020-01-08
Application No.: EP19180125.7
Filing Date: 2019-06-13
Applicant: Intel Corporation
Inventor: APODACA, Michael , SHAH, Ankur , ASHBAUGH, Ben , FLIFLET, Brandon , NALLURI, Hema , K, Pattabhiraman , DOYLE, Peter , KOSTON, Joseph , VALERIO, James , RAMADOSS, Murali , KOKER, Altug , NAVALE, Aditya , SURTI, Prasoonkumar , VEMBU, Balaji
IPC: G06F9/38
Abstract: Apparatus and method for simultaneous command streamers. For example, one embodiment of an apparatus comprises: a plurality of work element queues to store work elements for a plurality of thread contexts, each work element associated with a context descriptor identifying a context storage region in memory; a plurality of command streamers, each command streamer associated with one of the plurality of work element queues, the command streamers to independently submit instructions for execution as specified by the work elements; a thread dispatcher to evaluate the thread contexts including priority values, to tag each instruction with an execution identifier (ID), and to responsively dispatch each instruction including the execution ID in accordance with the thread context; and a plurality of graphics functional units to independently execute each instruction dispatched by the thread dispatcher and to associate each instruction with a thread context based on its execution ID.
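The tagging and dispatch flow in this abstract can be sketched as below. The `dispatch` function, the rule that a lower priority value is dispatched first, and the use of one execution ID per thread context are assumptions for illustration; the abstract does not define the priority ordering or the ID format.

```python
import heapq
from typing import Dict, List, Tuple


def dispatch(
    queues: Dict[str, List[str]], priorities: Dict[str, int]
) -> Tuple[List[Tuple[int, str]], Dict[int, str]]:
    """Tag each instruction with its context's execution ID and order by priority.

    queues: context name -> instructions submitted by that context's command streamer.
    priorities: context name -> priority value (lower dispatches first, an assumption).
    Returns the tagged instructions in dispatch order and the ID -> context table
    a functional unit could use to recover the context.
    """
    exec_id_of = {ctx: i for i, ctx in enumerate(queues)}
    heap: List[Tuple[int, int, int, str]] = []
    for ctx, instrs in queues.items():
        for order, instr in enumerate(instrs):
            # `order` keeps each context's instructions in submission order.
            heapq.heappush(heap, (priorities[ctx], exec_id_of[ctx], order, instr))
    tagged = []
    while heap:
        _, exec_id, _, instr = heapq.heappop(heap)
        tagged.append((exec_id, instr))
    context_of = {i: ctx for ctx, i in exec_id_of.items()}
    return tagged, context_of
```

A functional unit in this sketch would look up `context_of[exec_id]` for each dispatched instruction, which corresponds to associating an instruction with a thread context based on its execution ID.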
-