-
公开(公告)号:US12204487B2
公开(公告)日:2025-01-21
申请号:US18415052
申请日:2024-01-17
Applicant: Intel Corporation
Inventor: Altug Koker , Varghese George , Aravindh Anantaraman , Valentin Andrei , Abhishek R. Appu , Niranjan Cooray , Nicolas Galoppo Von Borries , Mike MacPherson , Subramaniam Maiyuran , ElMoustapha Ould-Ahmed-Vall , David Puffer , Vasanth Ranganathan , Joydeep Ray , Ankur N. Shah , Lakshminarayanan Striramassarma , Prasoonkumar Surti , Saurabh Tangri
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06N3/08 , G06T15/06
Abstract: Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
-
公开(公告)号:US12153541B2
公开(公告)日:2024-11-26
申请号:US17674703
申请日:2022-02-17
Applicant: Intel Corporation
Inventor: Altug Koker , Lakshminarayanan Striramassarma , Aravindh Anantaraman , Valentin Andrei , Abhishek R. Appu , Sean Coleman , Varghese George , Pattabhiraman K , Mike MacPherson , Subramaniam Maiyuran , Elmoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , Joydeep Ray , Jayakrishna P S , Prasoonkumar Surti
IPC: G06F12/00 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/78 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06N3/08 , G06T15/06
Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data is to be prefetched extending to the current overfetch boundary.
-
公开(公告)号:US12066975B2
公开(公告)日:2024-08-20
申请号:US17429291
申请日:2020-03-14
Applicant: Intel Corporation
Inventor: Altug Koker , Lakshminarayanan Striramassarma , Aravindh Anantaraman , Valentin Andrei , Abhishek R. Appu , Sean Coleman , Varghese George , K Pattabhiraman , Mike MacPherson , Subramaniam Maiyuran , ElMoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , Joydeep Ray , S Jayakrishna P , Prasoonkumar Surti
IPC: G06F12/00 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/78 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06N3/08 , G06T15/06
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06
Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data is to be prefetched extending to the current overfetch boundary.
-
公开(公告)号:US11954062B2
公开(公告)日:2024-04-09
申请号:US17310540
申请日:2020-03-14
Applicant: INTEL CORPORATION
Inventor: Joydeep Ray , Niranjan Cooray , Subramaniam Maiyuran , Altug Koker , Prasoonkumar Surti , Varghese George , Valentin Andrei , Abhishek Appu , Guadalupe Garcia , Pattabhiraman K , Sungye Kim , Sanjay Kumar , Pratik Marolia , Elmoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , William Sadler , Lakshminarayanan Striramassarma
IPC: G06F12/00 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/78 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06N3/08 , G06T15/06
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06
Abstract: Embodiments described herein provide techniques to enable the dynamic reconfiguration of memory on a general-purpose graphics processing unit. One embodiment described herein enables dynamic reconfiguration of cache memory bank assignments based on hardware statistics. One embodiment enables for virtual memory address translation using mixed four kilobyte and sixty-four kilobyte pages within the same page table hierarchy and under the same page directory. One embodiment provides for a graphics processor and associated heterogenous processing system having near and far regions of the same level of a cache hierarchy.
-
公开(公告)号:US20240086357A1
公开(公告)日:2024-03-14
申请号:US18516716
申请日:2023-11-21
Applicant: Intel Corporation
Inventor: Altug Koker , Joydeep Ray , Aravindh Anantaraman , Valentin Andrei , Abhishek Appu , Sean Coleman , Nicolas Galoppo Von Borries , Varghese George , Pattabhiraman K , SungYe Kim , Mike Macpherson , Subramaniam Maiyuran , Elmoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , James Valerio
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06T15/06
Abstract: Systems and methods for updating remote memory side caches in a multi-GPU configuration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a first memory, a first memory side cache memory, a first communication fabric, and a first memory management unit (MMU). The graphics processor includes a second graphics processing unit (GPU) having a second memory, a second memory side cache memory, a second memory management unit (MMU), and a second communication fabric that is communicatively coupled to the first communication fabric. The first MMU is configured to control memory requests for the first memory, to update content in the first memory, to update content in the first memory side cache memory, and to determine whether to update the content in the second memory side cache memory.
-
公开(公告)号:US11869113B2
公开(公告)日:2024-01-09
申请号:US17544826
申请日:2021-12-07
Applicant: Intel Corporation
Inventor: Aravindh Anantaraman , Altug Koker , Varghese George , Subramaniam Maiyuran , SungYe Kim , Valentin Andrei
CPC classification number: G06T1/20 , G06F9/3802 , G06F9/3877 , G06T1/60 , G06T15/005
Abstract: Apparatuses including general-purpose graphics processing units and graphics multiprocessors that exploit queues or transitional buffers for improved low-latency high-bandwidth on-die data retrieval are disclosed. In one embodiment, a graphics multiprocessor includes at least one compute engine to provide a request, a queue or transitional buffer, and logic coupled to the queue or transitional buffer. The logic is configured to cause a request to be transferred to a queue or transitional buffer for temporary storage without processing the request and to determine whether the queue or transitional buffer has a predetermined amount of storage capacity.
-
公开(公告)号:US11561828B2
公开(公告)日:2023-01-24
申请号:US17317387
申请日:2021-05-11
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Varghese George , Altug Koker , Aravindh Anantaraman , SungYe Kim , Valentin Andrei , Joydeep Ray
IPC: G06F9/48 , G06F9/38 , G06F12/0837 , G06F9/30 , G06F9/50
Abstract: Accelerated synchronization operations using fine grain dependency check are disclosed. A graphics multiprocessor includes a plurality of execution units and synchronization circuitry that is configured to determine availability of at least one execution unit. The synchronization circuitry to perform a fine grain dependency check of availability of dependent data or operands in shared local memory or cache when at least one execution unit is available.
-
公开(公告)号:US20220350751A1
公开(公告)日:2022-11-03
申请号:US17862739
申请日:2022-07-12
Applicant: Intel Corporation
Inventor: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian
IPC: G06F12/123 , G06F12/0875 , G06F12/0891 , G06T1/60
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
-
公开(公告)号:US20220138104A1
公开(公告)日:2022-05-05
申请号:US17429291
申请日:2020-03-14
Applicant: Intel Corporation
Inventor: Altug Koker , Lakshminarayanan Striramassarma , Aravindh Anantaraman , Valentin Andrei , Abhishek R. Appu , Sean Coleman , Varghese Georgr , K. Pattabhiraman , Mike MacPherson , Subramaniam Maiyuran , ElMoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , Joydeep Ray , S. Jayakrishna P , Prasoonkumar Surti
IPC: G06F12/0862 , G06F12/0897 , G06F12/0875 , G06F12/0866 , G06F9/30 , G06T15/06
Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data is to be prefetched extending to the current overfetch boundary.
-
公开(公告)号:US20220129265A1
公开(公告)日:2022-04-28
申请号:US17430574
申请日:2020-03-14
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Aravindh Anantaraman , Elmoustapha Ould-Ahmed-Vall , Joydeep Ray , Mike Macpherson , Valentin Andrei , Nicolas Galoppo Von Borries , Varghese George , Subramaniam Maiyuran , Vasanth Ranganathan , Jayakrishna P S , K Pattabhiraman , Sudhakar Kamma
Abstract: Methods and apparatus relating to techniques for data compression. In an example, an apparatus comprises a processor receive a data compression instruction for a memory segment; and in response to the data compression instruction, compress a sequence of identical memory values in response to a determination that the sequence of identical memory values has a length which exceeds a threshold. Other embodiments are also disclosed and claimed.
-
-
-
-
-
-
-
-
-