-
Publication No.: US20220156202A1
Publication Date: 2022-05-19
Application No.: US17590362
Filing Date: 2022-02-01
Applicant: Intel Corporation
Inventor: Altug Koker , Joydeep Ray , Elmoustapha Ould-Ahmed-Vall , Abhishek Appu , Aravindh Anantaraman , Valentin Andrei , Durgaprasad Bilagi , Varghese George , Brent Insko , Sanjeev Jahagirdar , Scott Janus , Pattabhiraman K , SungYe Kim , Subramaniam Maiyuran , Vasanth Ranganathan , Lakshminarayanan Striramassarma , Xinmin Tian
IPC: G06F12/123 , G06F12/0891 , G06T1/60 , G06F12/0875
Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on the age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
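The aging behavior described in this abstract can be illustrated with a small model. The following is a minimal sketch, not the patented design: the class name `AgingCache`, the saturating age counter, the fully associative layout, and the `age_hint` parameter are all assumptions introduced for illustration. It shows an initial policy in which a touched line becomes youngest, and a hint that overrides the starting age so that hinted lines are evicted sooner.

```python
from dataclasses import dataclass, field


@dataclass
class AgingCache:
    """Toy fully associative cache with a per-line aging field (hypothetical model)."""
    capacity: int
    max_age: int = 3
    lines: dict = field(default_factory=dict)  # tag -> current age

    def access(self, tag, age_hint=None):
        """Touch a line and return the evicted tag, or None if nothing was evicted."""
        evicted = None
        if tag not in self.lines and len(self.lines) >= self.capacity:
            # Evict the oldest line; ties go to the earliest-inserted line.
            evicted = max(self.lines, key=self.lines.get)
            del self.lines[evicted]
        # Age every other resident line, saturating at max_age.
        for other in self.lines:
            if other != tag:
                self.lines[other] = min(self.lines[other] + 1, self.max_age)
        # Initial policy: a touched line becomes youngest (age 0).
        # A hint overrides that starting age, for example for streaming data.
        self.lines[tag] = 0 if age_hint is None else min(age_hint, self.max_age)
        return evicted


cache = AgingCache(capacity=2)
cache.access("a")
cache.access("b", age_hint=3)   # hinted as already old
evicted = cache.access("c")     # the hinted line is evicted before "a"
```

In this sketch a larger hint value means lower persistence; a real controller could equally encode the hint the other way around.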
-
Publication No.: US20220129521A1
Publication Date: 2022-04-28
Application No.: US17428233
Filing Date: 2020-03-14
Applicant: INTEL CORPORATION
Inventor: Prasoonkumar Surti , Subramaniam Maiyuran , Valentin Andrei , Abhishek Appu , Varghese George , Altug Koker , Mike Macpherson , Elmoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , Joydeep Ray , Lakshminarayanan Striramassarma , SungYe Kim
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides techniques to optimize training and inference on a systolic array when using sparse data. One embodiment provides techniques to use decompression information when performing sparse compute operations. One embodiment enables the disaggregation of special function compute arrays via a shared register file. One embodiment enables packed data compress and expand operations on a GPGPU. One embodiment provides techniques to exploit block sparsity within the cache hierarchy of a GPGPU.
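To make the idea of packed compress and expand operations concrete, here is a minimal sketch. The abstract does not specify a data format, so the mask-plus-packed-values layout and the function names `compress` and `expand` are assumptions chosen for illustration only.

```python
def compress(values):
    """Pack the nonzero elements and record their positions in a boolean mask."""
    mask = [v != 0 for v in values]
    packed = [v for v in values if v != 0]
    return packed, mask


def expand(packed, mask):
    """Rebuild the original vector: take the next packed value where the mask is set."""
    it = iter(packed)
    return [next(it) if keep else 0 for keep in mask]


packed, mask = compress([0, 7, 0, 0, 3])
restored = expand(packed, mask)
```

The mask plays the role of the decompression information: it is the only metadata needed to restore the packed values to their original positions.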
-
Publication No.: US11194722B2
Publication Date: 2021-12-07
Application No.: US15922809
Filing Date: 2018-03-15
Applicant: Intel Corporation
Inventor: Bharath Narasimha Swamy , Joydeep Ray , Rama Kishan Malladi , James Valerio , Abhishek Appu
IPC: G06F12/0842 , G06F12/0855
Abstract: Apparatus and method for improved cache utilization and efficiency on a many-core processor. An apparatus comprising: a plurality of execution units to generate cache access requests responsive to executing instructions; a pending request queue to store pending cache access requests generated by the execution units; pending queue management circuitry to compare a current cache access request with entries in the pending request queue to determine whether the current cache access request can be merged with an entry in the pending request queue and, if so, to merge the current cache access request with the entry.
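The merge check performed by the pending queue management circuitry can be sketched in software. This is an illustrative model under stated assumptions: the abstract does not say what makes two requests mergeable, so the sketch assumes requests merge when they target the same cache line, and the 64-byte `LINE_SIZE` and all class and method names are hypothetical.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # assumed cache-line size in bytes


@dataclass
class PendingEntry:
    line_addr: int
    requesters: list = field(default_factory=list)


class PendingRequestQueue:
    """Toy pending request queue that merges requests to the same cache line."""

    def __init__(self):
        self.entries = []

    def submit(self, addr, requester):
        """Return True if merged into an existing entry, False if a new entry was queued."""
        line_addr = addr // LINE_SIZE * LINE_SIZE
        # Compare the current request against every pending entry.
        for entry in self.entries:
            if entry.line_addr == line_addr:
                entry.requesters.append(requester)
                return True
        self.entries.append(PendingEntry(line_addr, [requester]))
        return False


queue = PendingRequestQueue()
queue.submit(0x100, "eu0")            # new entry for line 0x100
merged = queue.submit(0x120, "eu1")   # same 64-byte line, so it merges
```

Merging means a single fill from the next memory level can satisfy every requester recorded on the entry.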
-
Publication No.: US11175949B2
Publication Date: 2021-11-16
Application No.: US16506730
Filing Date: 2019-07-09
Applicant: Intel Corporation
Inventor: Kiran C. Veernapu , Kamlesh Pillai , James Valerio , Joydeep Ray , Abhishek Appu
Abstract: A mechanism is described to facilitate microcontroller-based flexible thread scheduling and launching in computing environments. An apparatus of embodiments, as described herein, includes a graphics processor hosting a microcontroller having a thread scheduling unit, and detection and observation logic to detect a scheduling algorithm associated with an application at the apparatus. The apparatus may further include reading and dispatching logic to facilitate the microcontroller to prepare a flexible dispatch routine based on the scheduling algorithm. The apparatus may further include scheduling and launching logic to facilitate the thread scheduling unit to dynamically schedule and launch threads based on the flexible dispatch routine, where the threads are hosted by the graphics processor.
-
Publication No.: US11113784B2
Publication Date: 2021-09-07
Application No.: US17064427
Filing Date: 2020-10-06
Applicant: Intel Corporation
Inventor: Joydeep Ray , Scott Janus , Varghese George , Subramaniam Maiyuran , Altug Koker , Abhishek Appu , Prasoonkumar Surti , Vasanth Ranganathan , Andrei Valentin , Ashutosh Garg , Yoav Harel , Arthur Hunter, Jr. , SungYe Kim , Mike Macpherson , Elmoustapha Ould-Ahmed-Vall , William Sadler , Lakshminarayanan Striramassarma , Vikranth Vemulapalli
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to skip computational operations for zero-filled matrices and sub-matrices. Embodiments additionally provide techniques to maintain data compression through to a processing unit. Embodiments additionally provide an architecture for a sparse-aware logic unit.
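Skipping computation for zero-filled sub-matrices can be shown with a blocked matrix multiply. The following is a software sketch only, assuming square matrices stored as nested lists and a fixed block size; the function name `blocked_matmul_skip_zero` is hypothetical, and the abstract describes hardware logic rather than this loop structure.

```python
def blocked_matmul_skip_zero(a, b, block=2):
    """Compute a @ b for square matrices, skipping all-zero blocks of a.

    Returns (product, number of blocks of a that were skipped).
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    skipped = 0
    for i0 in range(0, n, block):
        for k0 in range(0, n, block):
            rows = range(i0, min(i0 + block, n))
            ks = range(k0, min(k0 + block, n))
            # A zero-filled block contributes nothing, so skip its multiplies.
            if all(a[i][k] == 0 for i in rows for k in ks):
                skipped += 1
                continue
            for i in rows:
                for k in ks:
                    aik = a[i][k]
                    for j in range(n):
                        c[i][j] += aik * b[k][j]
    return c, skipped


a = [[1, 2, 0, 0],
     [3, 4, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 5, 6]]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
product, skipped = blocked_matmul_skip_zero(a, identity)
```

With this input, two of the four 2x2 blocks of `a` are entirely zero, so half of the block-level work is avoided while the result is unchanged.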
-