-
Publication No.: US11934934B2
Publication date: 2024-03-19
Application No.: US15488551
Application date: 2017-04-17
Applicant: Intel Corporation
Inventor: Liwei Ma , Elmoustapha Ould-Ahmed-Vall , Barath Lakshmanan , Ben J. Ashbaugh , Jingyi Jin , Jeremy Bottleson , Mike B. Macpherson , Kevin Nealis , Dhawal Srivastava , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman , Altug Koker , Abhishek R. Appu
Abstract: An apparatus to facilitate optimization of a convolutional neural network (CNN) is disclosed. The apparatus includes optimization logic to receive a CNN model having a list of instructions and including pruning logic to optimize the list of instructions by eliminating branches in the list of instructions that comprise a weight value of 0.
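The pruning idea in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each entry in the instruction list is a multiply-accumulate step carrying its own weight; the `MacInstruction` type and the `prune_zero_weight` and `execute` helpers are hypothetical names, not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MacInstruction:
    """Hypothetical instruction: acc += weight * inputs[input_index]."""
    input_index: int
    weight: float


def prune_zero_weight(instructions: List[MacInstruction]) -> List[MacInstruction]:
    """Drop every instruction whose weight is exactly 0 (it cannot change the result)."""
    return [ins for ins in instructions if ins.weight != 0]


def execute(instructions: List[MacInstruction], inputs: List[float]) -> float:
    """Run the instruction list as a chain of multiply-accumulate steps."""
    acc = 0.0
    for ins in instructions:
        acc += ins.weight * inputs[ins.input_index]
    return acc


program = [
    MacInstruction(0, 0.5),
    MacInstruction(1, 0.0),   # zero weight: contributes nothing
    MacInstruction(2, -2.0),
    MacInstruction(3, 0.0),   # zero weight: contributes nothing
]
inputs = [4.0, 7.0, 1.5, 9.0]

pruned = prune_zero_weight(program)
```

Under these assumptions the pruned list is half the length of the original and produces the same accumulated value, which is the property the optimization relies on.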
-
Publication No.: US20240086357A1
Publication date: 2024-03-14
Application No.: US18516716
Application date: 2023-11-21
Applicant: Intel Corporation
Inventor: Altug Koker , Joydeep Ray , Aravindh Anantaraman , Valentin Andrei , Abhishek Appu , Sean Coleman , Nicolas Galoppo Von Borries , Varghese George , Pattabhiraman K , SungYe Kim , Mike Macpherson , Subramaniam Maiyuran , Elmoustapha Ould-Ahmed-Vall , Vasanth Ranganathan , James Valerio
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06T15/06
Abstract: Systems and methods for updating remote memory side caches in a multi-GPU configuration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a first memory, a first memory side cache memory, a first communication fabric, and a first memory management unit (MMU). The graphics processor includes a second graphics processing unit (GPU) having a second memory, a second memory side cache memory, a second memory management unit (MMU), and a second communication fabric that is communicatively coupled to the first communication fabric. The first MMU is configured to control memory requests for the first memory, to update content in the first memory, to update content in the first memory side cache memory, and to determine whether to update the content in the second memory side cache memory.
-
Publication No.: US20240070926A1
Publication date: 2024-02-29
Application No.: US18466141
Application date: 2023-09-13
Applicant: Intel Corporation
Inventor: Joydeep Ray , Ben Ashbaugh , Prasoonkumar Surti , Pradeep Ramani , Rama Harihara , Jerin C. Justin , Jing Huang , Xiaoming Cui , Timothy B. Costa , Ting Gong , Elmoustapha Ould-Ahmed-Vall , Kumar Balasubramanian , Anil Thomas , Oguz H. Elibol , Jayaram Bobba , Guozhong Zhuang , Bhavani Subramanian , Gokce Keskin , Chandrasekaran Sakthivel , Rajesh Poornachandran
CPC classification number: G06T9/002 , G06F12/023 , G06T15/005 , G06F2212/302 , G06F2212/401
Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
-
Publication No.: US20240069914A1
Publication date: 2024-02-29
Application No.: US17893985
Application date: 2022-08-23
Applicant: Intel Corporation
Inventor: Biju George , Fangwen Fu , Joydeep Ray
CPC classification number: G06F9/30036 , G06F9/30043 , G06F9/3455 , G06F9/3877
Abstract: Embodiments described herein provide a system to enable access to an n-dimensional tensor in memory of a graphics processor via a batch of two-dimensional block access messages. One embodiment provides a graphics processor comprising general-purpose graphics execution resources coupled with the system interface, the general-purpose graphics execution resources including a matrix accelerator. The matrix accelerator is configured to perform a matrix operation on a plurality of tensors stored in a memory. Circuitry is included to facilitate access to the memory by the general-purpose graphics execution resources. The circuitry is configured to receive a request to access a tensor of the plurality of tensors and generate a batch of two-dimensional block access messages along a dimension of n>2 of the tensor. The batch of two-dimensional block access messages enables access to the tensor by the matrix accelerator.
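As a rough illustration of how an n-dimensional tensor might be covered by a batch of two-dimensional block accesses, the sketch below assumes a dense row-major layout in which the last two dimensions form each 2D block and the outer dimensions are iterated to build the batch. The `Block2DMessage` fields and both helper functions are hypothetical; the abstract does not specify a message format.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Sequence


@dataclass(frozen=True)
class Block2DMessage:
    """Hypothetical 2D block access descriptor (all values in elements)."""
    base_offset: int  # offset of the block's first element
    height: int       # number of rows in the block
    width: int        # number of elements per row
    pitch: int        # distance between the starts of consecutive rows


def batch_block_messages(shape: Sequence[int]) -> List[Block2DMessage]:
    """Emit one 2D block message per index of the outer (n > 2) dimensions.

    Assumes a dense row-major tensor whose last two dimensions are (height, width).
    """
    *outer, height, width = shape
    plane = height * width
    return [
        Block2DMessage(base_offset=i * plane, height=height, width=width, pitch=width)
        for i in range(prod(outer))
    ]


def read_block(memory: Sequence[int], msg: Block2DMessage) -> List[List[int]]:
    """Gather the rows addressed by one message from flat memory."""
    rows = []
    for r in range(msg.height):
        start = msg.base_offset + r * msg.pitch
        rows.append(list(memory[start:start + msg.width]))
    return rows


# A 2 x 3 x 4 x 5 tensor stored flat; element value equals its linear offset.
memory = list(range(2 * 3 * 4 * 5))
messages = batch_block_messages((2, 3, 4, 5))
last_block = read_block(memory, messages[-1])
```

Here a four-dimensional tensor yields six 4 x 5 block messages, one per combination of the two outer indices, and each message can be serviced independently.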
-
Publication No.: US11900665B2
Publication date: 2024-02-13
Application No.: US18358067
Application date: 2023-07-25
Applicant: Intel Corporation
Inventor: Barnan Das , Mayuresh M. Varerkar , Narayan Biswal , Stanley J. Baran , Gokcen Cilingir , Nilesh V. Shah , Archie Sharma , Sherine Abdelhak , Praneetha Kotha , Neelay Pandit , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Abhishek R. Appu , Altug Koker , Joydeep Ray
IPC: G06V10/82 , G06V40/10 , G06V10/94 , G06V10/764 , G06V40/20 , G06F16/783 , G06F16/583 , G06F18/2413 , G06V10/10
CPC classification number: G06V10/82 , G06F16/5838 , G06F16/784 , G06F18/24143 , G06V10/764 , G06V10/955 , G06V40/10 , G06V40/103 , G06V40/23
Abstract: A graphics processor can include a processing cluster array including a plurality of processing clusters coupled with the plurality of memory controllers, each processing cluster of the plurality of processing clusters including a plurality of streaming multiprocessors, the processing cluster array configured for partitioning into a plurality of partitions. The plurality of partitions include a first partition including a first plurality of streaming multiprocessors configured to perform operations for a first neural network. The operations for the first neural network are isolated to the first partition. The plurality of partitions also include a second partition including a second plurality of streaming multiprocessors configured to perform operations for a second neural network. The operations for the second neural network are isolated to the second partition and protected from operations performed for the first neural network.
-
Publication No.: US20230401668A1
Publication date: 2023-12-14
Application No.: US18456235
Application date: 2023-08-25
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Anbang Yao , Kevin Nealis , Xiaoming Chen , Altug Koker , Abhishek R. Appu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Ben J. Ashbaugh , Barath Lakshmanan , Liwei Ma , Joydeep Ray , Ping T. Tang , Michael S. Strickland
IPC: G06T1/20 , G06F7/483 , G06N3/084 , G06F9/30 , G06N3/063 , G06F9/50 , G06F9/38 , G06N3/044 , G06N3/045 , G06N20/00
CPC classification number: G06T1/20 , G06F7/483 , G06N3/084 , G06F9/30185 , G06F9/30014 , G06N3/063 , G06F9/5044 , G06F9/3863 , G06N3/044 , G06N3/045 , G06N20/00 , G06F3/14
Abstract: One embodiment provides a general-purpose graphics processing unit comprising a dynamic precision floating-point unit including a control unit having precision tracking hardware logic to track an available number of bits of precision for computed data relative to a target precision, wherein the dynamic precision floating-point unit includes computational logic to output data at multiple precisions.
-
Publication No.: US20230386417A1
Publication date: 2023-11-30
Application No.: US18322665
Application date: 2023-05-24
Applicant: Intel Corporation
Inventor: Arthur J. Runyan , Richmond Hicks , Nausheen Ansari , Narayan Biswal , Ya-Ti Peng , Abhishek R. Appu , Wen-Fu Kao , Sang-Hee Lee , Joydeep Ray , Changliang Wang , Satyanarayana Avadhanam , Scott Janus , Gary Smith , Nilesh V. Shah , Keith W. Rowe , Robert J. Johnston
IPC: G09G3/34 , G09G3/36 , B60R1/00 , G09G5/10 , G09G5/14 , G09G5/38 , G06F3/147 , G02B27/01 , G09G5/00 , B60R1/24
CPC classification number: G09G3/3406 , G09G3/3648 , B60R1/00 , G09G5/10 , G09G5/14 , G09G5/38 , G06F3/147 , G02B27/01 , G09G5/00 , B60R1/24 , G09G2360/144 , G09G2320/0626 , G09G2380/10 , B60R2300/205 , B60R2300/207 , B60R2300/302 , B60R2300/802 , B60K2370/21 , B60K2370/31 , B60K2370/34 , B60K2370/37 , B60K2370/52 , B60K2370/77 , B60K2370/152 , B60K2370/154 , B60K2370/155 , B60K2370/334 , G09G2340/0464 , G09G2354/00
Abstract: Often when there is a glare on a display screen the user may be able to mitigate the glare by tilting or otherwise moving the screen or changing their viewing position. However, when driving a car there are limited options for overcoming glares on the dashboard, especially when you are driving for a long distance in the same direction. Embodiments are directed to eliminating such glare. Other embodiments are related to mixed reality (MR) and filling in occluded areas.
-
Publication No.: US20230386130A1
Publication date: 2023-11-30
Application No.: US18305511
Application date: 2023-04-24
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Abhishek R. Appu , Subhajit Dasgupta , Srivallaba Mysore , Michael J. Norris , Vasanth Ranganathan , Joydeep Ray
CPC classification number: G06T15/80 , G06T1/20 , G06T1/60 , G06T15/005 , G06T2210/52
Abstract: One embodiment provides for a graphics processing unit comprising a processing cluster to perform multi-rate shading via coarse pixel shading and output shaded coarse pixels for processing by a post-shader pixel processing pipeline.
-
Publication No.: US11803934B2
Publication date: 2023-10-31
Application No.: US17591152
Application date: 2022-02-02
Applicant: Intel Corporation
Inventor: Balaji Vembu , Altug Koker , Joydeep Ray
CPC classification number: G06T1/20 , G06T15/005 , G06T2200/04
Abstract: One embodiment provides an apparatus comprising an interconnect fabric comprising one or more fabric switches, a plurality of memory interfaces coupled to the interconnect fabric to provide access to a plurality of memory devices, an input/output (IO) interface coupled to the interconnect fabric to provide access to IO devices, an array of multiprocessors coupled to the interconnect fabric, scheduling circuitry to distribute a plurality of thread groups across the array of multiprocessors, each thread group comprising a plurality of threads and each thread comprising a plurality of instructions to be executed by at least one of the multiprocessors, and a first multiprocessor of the array of multiprocessors to be assigned to process a first thread group comprising a first plurality of threads, the first multiprocessor comprising a plurality of parallel execution circuits.
-
Publication No.: US11748298B2
Publication date: 2023-09-05
Application No.: US17826674
Application date: 2022-05-27
Applicant: Intel Corporation
Inventor: Altug Koker , Farshad Akhbari , Feng Chen , Dukhwan Kim , Narayan Srinivasa , Nadathur Rajagopalan Satish , Liwei Ma , Jeremy Bottleson , Eriko Nurvitadhi , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu
IPC: G06F15/80 , G06F13/40 , G06T1/20 , G06F9/30 , G06F13/00 , G06N3/063 , G06N3/084 , G06N3/044 , G06N3/045 , G06N3/048
CPC classification number: G06F15/8007 , G06F9/3004 , G06F13/00 , G06F13/4027 , G06N3/044 , G06N3/045 , G06N3/048 , G06N3/063 , G06N3/084 , G06T1/20
Abstract: An integrated circuit (IC) package apparatus is disclosed. The IC package includes one or more processing units and a bridge, mounted below the one or more processing units, including one or more arithmetic logic units (ALUs) to perform atomic operations.
-