-
公开(公告)号:US10255656B2
公开(公告)日:2019-04-09
申请号:US15798574
申请日:2017-10-31
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Linda L. Hurd , Dukhwan Kim , Mike B. Macpherson , John C. Weast , Feng Chen , Farshad Akhbari , Narayan Srinivasa , Nadathur Rajagopalan Satish , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes sorting logic to sort processing threads into thread groups based on bit depth of floating point thread operations.
-
公开(公告)号:US10186011B2
公开(公告)日:2019-01-22
申请号:US15581182
申请日:2017-04-28
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Nicolas C. Galoppo Von Borries , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Nadathur Rajagopalan Satish , Jeremy Bottleson , Farshad Akhbari , Altug Koker , Narayan Srinivasa , Dukhwan Kim , Sara S. Baghsorkhi , Justin E. Gottschlich , Feng Chen , Elmoustapha Ould-Ahmed-Vall , Kevin Nealis , Xiaoming Chen , Anbang Yao
Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.
-
公开(公告)号:US20180314249A1
公开(公告)日:2018-11-01
申请号:US15581124
申请日:2017-04-28
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Altug Koker , Farshad Akhbari , Feng Chen , Dukhwan Kim , Narayan Srinivasa , Nadathur Rajagopalan Satish , Kamal Sinha , Joydeep Ray , Balaji Vembu , Mike B. Macpherson , Linda L. Hurd , Sanjeev Jahagirdar , Vasanth Ranganathan
CPC classification number: G06F9/5016 , G06F9/5061
Abstract: A mechanism is described for facilitating storage management for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting one or more components associated with machine learning, where the one or more components include memory and a processor coupled to the memory, and where the processor includes a graphics processor. The method may further include allocating a storage portion of the memory and a hardware portion of the processor to a machine learning training set, where the storage and hardware portions are precise for implementation and processing of the training set.
-
公开(公告)号:US20180308256A1
公开(公告)日:2018-10-25
申请号:US15494812
申请日:2017-04-24
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Balaji Vembu , Prasoonkumar Surti , Kamal Sinha , Nadathur Rajagopalan Satish , Narayan Srinivasa , Feng Chen , Dukhwan Kim , Farshad Akhbari
CPC classification number: G06T9/00 , G06N3/0445 , G06N3/0454 , G06N3/0481 , G06N3/063 , G06N3/084
Abstract: An apparatus to facilitate compute compression is disclosed. The apparatus includes a graphics processing unit including mapping logic to map a first block of integer pixel data to a compression block and compression logic to compress the compression block.
-
公开(公告)号:US20180307983A1
公开(公告)日:2018-10-25
申请号:US15494948
申请日:2017-04-24
Applicant: Intel Corporation
Inventor: Narayan Srinivasa , Joydeep Ray , Nicolas C. Galoppo Von Borries , Ben Ashbaugh , Prasoonkumar Surti , Feng Chen , Barath Lakshmanan , Elmoustapha Ould-Ahmed-Vall , Liwei Ma , Linda L. Hurd , Abhishek R. Appu , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Chandrasekaran Sakthivel , Farshad Akhbari , Dukhwan Kim , Altug Koker , Nadathur Rajagopalan Satish
CPC classification number: G06N3/08 , G06N3/04 , G06N3/0454 , G06N3/063 , G06N3/082
Abstract: An apparatus to facilitate optimization of a neural network (NN) is disclosed. The apparatus includes optimization logic to define a NN topology having one or more macro layers, adjust the one or more macro layers to adapt to input and output components of the NN and train the NN based on the one or more macro layers.
-
公开(公告)号:US20180300246A1
公开(公告)日:2018-10-18
申请号:US15489149
申请日:2017-04-17
Applicant: Intel Corporation
Inventor: Chandrasekaran Sakthivel , Prasoonkumar Surti , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Abhishek R. Appu , Nicolas C. Galoppo Von Borries , Joydeep Ray , Narayan Srinivasa , Feng Chen , Ben J. Ashbaugh , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Eriko Nurvitadhi , Balaji Vembu , Altug Koker
IPC: G06F12/0837 , G06N3/08 , G06N99/00
CPC classification number: G06F12/0837 , G06F2212/62 , G06N3/08 , G06N99/005 , G06T1/20
Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20170358129A1
公开(公告)日:2017-12-14
申请号:US15525023
申请日:2014-12-08
Applicant: Intel Corporation
Inventor: Feng Chen , Yi Yang , Xiaoming Chen
CPC classification number: G06T15/80 , G06F9/30014 , G06F9/3861 , G06T1/20 , G06T15/005
Abstract: One or more system, apparatus, method, and computer readable media is described below for automated data type precision control capable of improving rendering quality on a graphics processor. Perceptible rendering quality is dependent at least in part on number format precision (e.g., FP16 or FP32) employed for shader program variables. In accordance with embodiments, shader variables implemented in lower precision data formats are tracked during shader compile to identify those that might trigger a floating point overflow and/or underflow exception. For shaders including one or more such variable, resources are provided to automatically monitor overflow and/or underflow exceptions during shader execution. In further embodiments, shader code is automatically re-generated based, at least in part, upon occurrences of such exceptions, and an increased number format precision specified for one or more of the tracked shader variables.
-
公开(公告)号:US20170228170A1
公开(公告)日:2017-08-10
申请号:US15479619
申请日:2017-04-05
Applicant: Intel Corporation
Inventor: Feng Chen , Michael P. Mesnier
IPC: G06F3/06
CPC classification number: G06F3/0611 , G06F3/065 , G06F3/0656 , G06F3/0679 , G06F12/0238 , G06F12/0246 , G06F12/1009 , G06F12/14 , G06F12/1433 , G06F2212/7205
Abstract: Storage class memory may be used in an architecture to achieve high performance, high reliability, high compatibility. In some embodiments, reads may be handled in a conventional way used in a memory based model. However writes do not use a memory based model but instead correspond to a storage based model. The hybrid nature can be achieved by setting the storage class memory to be write protected so that all writes must go through a software based block device interface. In some embodiments, the software based block device interface prevents erroneous writes to the storage class memory.
-
公开(公告)号:US12056788B2
公开(公告)日:2024-08-06
申请号:US17684187
申请日:2022-03-01
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Linda L. Hurd , Dukhwan Kim , Mike B. Macpherson , John C. Weast , Feng Chen , Farshad Akhbari , Narayan Srinivasa , Nadathur Rajagopalan Satish , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman
IPC: G06T1/20 , G06F3/14 , G06F9/30 , G06F9/38 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T15/00 , G09G5/36 , G06T15/04
CPC classification number: G06T1/20 , G06F3/14 , G06F9/3001 , G06F9/30014 , G06F9/3017 , G06F9/3887 , G06F9/3895 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06T15/005 , G09G5/363 , G06F9/3851 , G06T15/04 , G09G2360/06 , G09G2360/08 , G09G2360/121
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a mixed precision core including mixed-precision execution circuitry to execute one or more of the mixed-precision instructions to perform a mixed-precision dot-product operation comprising to perform a set of multiply and accumulate operations.
-
公开(公告)号:US20240257294A1
公开(公告)日:2024-08-01
申请号:US18436494
申请日:2024-02-08
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
CPC classification number: G06T1/20 , G06F9/45533 , G06F9/5061 , G06F9/5094 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06F8/41 , G06F2009/45583
Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
-
-
-
-
-
-
-
-
-