-
公开(公告)号:US20220350751A1
公开(公告)日:2022-11-03
申请号:US17862739
申请日:2022-07-12
Applicant: Intel Corporation
Inventor: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian
IPC: G06F12/123 , G06F12/0875 , G06F12/0891 , G06T1/60
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
-
公开(公告)号:US20220138104A1
公开(公告)日:2022-05-05
申请号:US17429291
申请日:2020-03-14
Applicant: Intel Corporation
Inventor: Altug Koker , Lakshminarayanan Striramassarma , Aravindh Anantaraman , Valentin Andrei , Abhishek R. Appu , Sean Coleman , Varghese Georgr , K. Pattabhiraman , Mike MacPherson , Subramaniam Maiyuran , ElMoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , Joydeep Ray , S. Jayakrishna P , Prasoonkumar Surti
IPC: G06F12/0862 , G06F12/0897 , G06F12/0875 , G06F12/0866 , G06F9/30 , G06T15/06
Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data is to be prefetched extending to the current overfetch boundary.
-
公开(公告)号:US20220137967A1
公开(公告)日:2022-05-05
申请号:US17429277
申请日:2020-03-14
Applicant: Intel Corporation
Inventor: Altug Koker , Varghese George , Aravindh Anantaraman , Valentin Andrel , Abhishek R. Appu , Niranjan Cooray , Nicolas Galoppo Von Borries , Mike MacPherson , Subramaniam Maiyuran , ElMoustapha Ould-Ahmed-Vall , David Puffer , Vasanth Ranganathan , Joydeep Ray , Ankur N. Shah , Lakshminarayanan Striramassarma , Prasoonkumar Surti , Saurabh Tangri
Abstract: Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
-
公开(公告)号:US20220114108A1
公开(公告)日:2022-04-14
申请号:US17428529
申请日:2020-03-14
Applicant: Intel Corporation
Inventor: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K. , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian
IPC: G06F12/123 , G06F12/0891 , G06F12/0875 , G06T1/60
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received.
-
公开(公告)号:US20220114096A1
公开(公告)日:2022-04-14
申请号:US17428527
申请日:2020-03-14
Applicant: Intel Corporation
Inventor: Lakshminarayanan Striramassarma , Prasoonkumar Surti , Varghese George , Ben Ashbaugh , Aravindh Anantaraman , Valentin Andrei , Abhishek Appu , Nicolas Galoppo Von Borries , Altug Koker , Mike Macpherson , Subramaniam Maiyuran , Nilay Mistry , Elmoustapha Ould-Ahmed-Vall , Selvakumar Panneer , Vasanth Ranganathan , Joydeep Ray , Ankur Shah , Saurabh Tangri
IPC: G06F12/0802
Abstract: Multi-tile Memory Management for Detecting Cross Tile Access, Providing Multi-Tile Inference Scaling with multicasting of data via copy operation, and Providing Page Migration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) having a memory and a cross-GPU fabric to communicatively couple the first and second GPUs. The memory controller is configured to determine whether frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU in the multi-GPU configuration and to send a message to initiate a data transfer mechanism when frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU.
-
公开(公告)号:US20210374897A1
公开(公告)日:2021-12-02
申请号:US17303654
申请日:2021-06-03
Applicant: Intel Corporation
Inventor: Joydeep Ray , Scott Janus , Varghese George , Subramaniam Maiyuran , Altug Koker , Abhishek Appu , Prasoonkumar Surti , Vasanth Ranganathan , Andrei Valentin , Ashutosh Garg , Yoav Harel , Arthur Hunter, JR. , SungYe Kim , Mike Macpherson , Elmoustapha Ould-Ahmed-Vall , William Sadler , Lakshminarayanan Striramassarma , Vikranth Vemulapalli
Abstract: Embodiments described herein include, software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiment described herein provided techniques to skip computational operations for zero filled matrices and sub-matrices. Embodiments additionally provide techniques to maintain data compression through to a processing unit. Embodiments additionally provide an architecture for a sparse aware logic unit.
-
公开(公告)号:US11145105B2
公开(公告)日:2021-10-12
申请号:US16355364
申请日:2019-03-15
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Arthur Hunter, Jr. , Kamal Sinha , Scott Janus , Brent Insko , Vasanth Ranganathan , Lakshminarayanan Striramassarma
Abstract: Embodiments are generally directed to multi-tile graphics processor rendering. An embodiment of an apparatus includes a memory for storage of data; and one or more processors including a graphics processing unit (GPU) to process data, wherein the GPU includes a plurality of GPU tiles, wherein, upon geometric data being assigned to each of a plurality of screen tiles, the apparatus is to transfer the geometric data to the plurality of GPU tiles.
-
公开(公告)号:US20210191872A1
公开(公告)日:2021-06-24
申请号:US17191473
申请日:2021-03-03
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , David Puffer , Prasoonkumar Surti , Lakshminarayanan Striramassarma , Vasanth Ranganathan , Kiran C. Veernapu , Balaji Vembu , Pattabhiraman K
IPC: G06F12/0877 , G06F12/0802 , G06F12/0855 , G06F12/0806 , G06F12/0846 , G06F12/0868 , G06T1/60 , G06F12/126
Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20210056033A1
公开(公告)日:2021-02-25
申请号:US17026264
申请日:2020-09-20
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , David Puffer , Prasoonkumar Surti , Lakshminarayanan Striramassarma , Vasanth Ranganathan , Kiran C. Veernapu , Balaji Vembu , Pattabhiraman K
IPC: G06F12/0877 , G06F12/0802 , G06F12/0855 , G06F12/0806 , G06F12/0846 , G06F12/0868 , G06T1/60 , G06F12/126
Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20210056028A1
公开(公告)日:2021-02-25
申请号:US17068754
申请日:2020-10-12
Applicant: Intel Corporation
Inventor: JOYDEEP RAY , James Valerio , Ben Ashbaugh , Lakshminarayanan Striramassarma
IPC: G06F12/0811 , G06F3/06 , G06F9/38 , G06F9/54
Abstract: Embodiments described herein provide a general purpose graphics processor comprising a plurality of tiles, each tile of the plurality of tiles comprising at least one execution unit, a local cache, and a cache control unit, and a high bandwidth memory communicatively coupled to the plurality of tiles, wherein the high bandwidth memory is shared between the plurality of tiles. The cache control unit is to implement a partial write management protocol to receive a partial write operation directed to a cache line in the local cache, the partial write operation comprising write data, write the data associated with the partial write operation to the local cache when the cache line is in a modified state, and forward the write data associated with the partial write operation to the high bandwidth memory when the partial write operation triggers a cache miss or when the cache line is in an exclusive state or a shared state. Other embodiments may be described and claimed.
-
-
-
-
-
-
-
-
-