-
81.
公开(公告)号:US20180315398A1
公开(公告)日:2018-11-01
申请号:US15787129
申请日:2017-10-18
Applicant: Intel Corporation
Inventor: Himanshu Kaul , Mark A. Anders , Sanu K. Mathew , Anbang Yao , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Tatiana Shpeisman , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Rajkishore Barik , Tsung-Han Lin , Vasanth Ranganathan , Sanjeev Jahagirdar
CPC classification number: G06F9/3001 , G06F7/483 , G06F7/5443 , G06F9/30014 , G06F9/30036 , G06F9/3851 , G06F2207/3824 , G06N3/0445 , G06N3/0454 , G06N3/063 , G06N3/08 , G06N20/00 , G06T15/005 , G09G5/393
Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between integer data path and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.
-
公开(公告)号:US20180308202A1
公开(公告)日:2018-10-25
申请号:US15495054
申请日:2017-04-24
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , John C. Weast , Mike B. Macpherson , Linda L. Hurd , Sara S. Baghsorkhi , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Liwei Ma , Elmoustapha Ould-Ahmed-Vall , Kamal Sinha , Joydeep Ray , Balaji Vembu , Sanjeev Jahagirdar , Vasanth Ranganathan , DUKHWAN Kim
Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.
-
公开(公告)号:US20180307984A1
公开(公告)日:2018-10-25
申请号:US15494971
申请日:2017-04-24
Applicant: Intel Corporation
Inventor: Altug Koker , Abhishek R. Appu , Kamal Sinha , Joydeep Ray , Balaji Vembu , Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Anbang Yao , Kevin Nealis , Xiaoming Chen , John C. Weast , Justin E. Gottschlich , Prasoonkumar Surti , Chandrasekaran Sakthivel , Farshad Akhbari , Nadathur Rajagopalan Satish , Liwei Ma , Jeremy Bottleson , Eriko Nurvitadhi , Travis T. Schluessler , Ankur N. Shah , Jonathan Kennedy , Vasanth Ranganathan , Sanjeev Jahagirdar
CPC classification number: G06N3/08 , G06F9/28 , G06F9/505 , G06N3/0445 , G06N3/0454 , G06N3/0481 , G06N3/063 , G06N99/005
Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20180300251A1
公开(公告)日:2018-10-18
申请号:US15488961
申请日:2017-04-17
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Joydeep Ray , Prasoonkumar Surti , Kamal Sinha , Kiran C. Veernapu , Balaji Vembu
IPC: G06F12/0888 , G06F13/42 , G06F13/40 , G06T1/20
CPC classification number: G06F12/0888 , G06F13/4022 , G06F13/4282 , G06F2212/1024 , G06F2212/6032 , G06F2213/0026 , G06T1/60
Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive, in a read/modify/write (RMW) pipeline, a cache access request from a requestor, wherein the cache request comprises a cache set identifier associated with requested data in the cache set, determine whether the cache set associated with the cache set identifier is in an inaccessible invalid state, and in response to a determination that the cache set is in an inaccessible state or an invalid state, to terminate the cache access request. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20180300074A1
公开(公告)日:2018-10-18
申请号:US15488723
申请日:2017-04-17
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Kamal Sinha , Bhushan M. Borole , Altug Koker , Joydeep Ray , Wenyin Fu
Abstract: Power for on-die heavily used local memories in general purpose graphics processing unit (GPGPU) applications may be reduced by using low latency read and high latency write operations. Power consumption in read heavy graphic operations can be reduced using a small memory footprint design with possible reduction of hot spotting in some embodiments.
-
公开(公告)号:US20180293102A1
公开(公告)日:2018-10-11
申请号:US15482801
申请日:2017-04-09
Applicant: Intel Corporation
Inventor: Joydeep Ray , Abhishek R. Appu , Altug Koker , Kamal Sinha , Balaji Vembu , Rajkishore Barik , Eriko Nurvitadhi , Nicolas C. Galoppo Von Borries , Tsung-Han Lin , Sanjeev Jahagirdar , Vasanth Ranganathan
Abstract: A mechanism is described for facilitating intelligent thread scheduling at autonomous machines. A method of embodiments, as described herein, includes detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor. The method may further include generating a tree of thread groups based on the dependency information, where each thread group includes multiple threads, and scheduling one or more of the thread groups associated a similar dependency to avoid dependency conflicts.
-
公开(公告)号:US20180285110A1
公开(公告)日:2018-10-04
申请号:US15477022
申请日:2017-04-01
Applicant: Intel Corporation
Inventor: Joydeep Ray , Altug Koker , Balaji Vembu , Abhishek R. Appu , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu
CPC classification number: G06F9/30123 , G06F9/5016 , G06F9/5027 , G06T1/20 , G06T1/60 , G09G5/363 , G09G5/393 , G09G2352/00 , G09G2360/08
Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system, allocate a second number of streaming multiprocessors (SMs) to the respective plurality of contexts, and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20180284868A1
公开(公告)日:2018-10-04
申请号:US15477029
申请日:2017-04-01
Applicant: Intel Corporation
Inventor: Altug Koker , Abhishek R. Appu , Kiran C. Veernapu , Joydeep Ray , Balaji Vembu , Prasoonkumar Surti , Kamal Sinha , Eric J. Hoekstra , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Travis T. Schluessler , Ankur N. Shah , Jonathan Kennedy
Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US20250117873A1
公开(公告)日:2025-04-10
申请号:US18906790
申请日:2024-10-04
Applicant: Intel Corporation
Inventor: Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Nicolas C. Galoppo Von Borries
Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.
-
公开(公告)号:US20250117356A1
公开(公告)日:2025-04-10
申请号:US18915492
申请日:2024-10-15
Applicant: Intel Corporation
Inventor: Abhishek R. Appu , Altug Koker , Aravindh Anantaraman , Elmoustapha Ould-Ahmed-Vall , Valentin Andrei , Nicolas Galoppo Von Borries , Varghese George , Mike Macpherson , Subramaniam Maiyuran , Joydeep Ray , Lakshminarayanan Striramassarma , Scott Janus , Brent Insko , Vasanth Ranganathan , Kamal Sinha , Arthur Hunter , Prasoonkumar Surti , David Puffer , James Valerio , Ankur N. Shah
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06N3/08 , G06T1/20 , G06T1/60 , G06T15/06 , H03M7/46
Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, a graphics processor includes an interposer, a first chiplet coupled with the interposer, the first chiplet including a graphics processing resource and an interconnect network coupled with the graphics processing resource, cache circuitry coupled with the graphics processing resource via the interconnect network, and a second chiplet coupled with the first chiplet via the interposer, the second chiplet including a memory-side cache and a memory controller coupled with the memory-side cache. The memory controller is configured to enable access to a high-bandwidth memory (HBM) device, the memory-side cache is configured to cache data associated with a memory access performed via the memory controller, and the cache circuitry is logically positioned between the graphics processing resource and a chiplet interface.
-
-
-
-
-
-
-
-
-