-
Publication No.: US12039000B2
Publication Date: 2024-07-16
Application No.: US18163418
Application Date: 2023-02-02
Applicant: Intel Corporation
Inventor: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data; one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, and retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters; and a register file including a plurality of registers, wherein the one or more blocks of the matrix data are stored within a first set of the plurality of registers.
-
Publication No.: US20230289399A1
Publication Date: 2023-09-14
Application No.: US18163418
Application Date: 2023-02-02
Applicant: Intel Corporation
Inventor: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data; one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, and retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters; and a register file including a plurality of registers, wherein the one or more blocks of the matrix data are stored within a first set of the plurality of registers.
-
Publication No.: US11023803B2
Publication Date: 2021-06-01
Application No.: US15482925
Application Date: 2017-04-10
Applicant: Intel Corporation
Inventor: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
-
Publication No.: US11798120B2
Publication Date: 2023-10-24
Application No.: US17398295
Application Date: 2021-08-10
Applicant: Intel Corporation
Inventor: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
-
Publication No.: US11094029B2
Publication Date: 2021-08-17
Application No.: US15482953
Application Date: 2017-04-10
Applicant: Intel Corporation
Inventor: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
-
Publication No.: US11593454B2
Publication Date: 2023-02-28
Application No.: US16890122
Application Date: 2020-06-02
Applicant: Intel Corporation
Inventor: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data; one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, and retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters; and a register file including a plurality of registers, wherein the one or more blocks of the matrix data are stored within a first set of the plurality of registers.
-
Publication No.: US20210374209A1
Publication Date: 2021-12-02
Application No.: US16890122
Application Date: 2020-06-02
Applicant: Intel Corporation
Inventor: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data; one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, and retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters; and a register file including a plurality of registers, wherein the one or more blocks of the matrix data are stored within a first set of the plurality of registers.
-
Publication No.: US10775873B2
Publication Date: 2020-09-15
Application No.: US16288580
Application Date: 2019-02-28
Applicant: Intel Corporation
Inventor: Victor W. Lee, Edward T. Grochowski, Daehyun Kim, Yuxin Bai, Sheng Li, Naveen K. Mellempudi, Dhiraj D. Kalamkar
IPC: G06F1/00, G06F1/3287, G06F1/324, G06F1/3234, G06F1/3225, G06F1/329, G06F1/3296, G06F9/50
Abstract: In an embodiment, a processor includes: a plurality of first cores to independently execute instructions, each of the plurality of first cores including a plurality of counters to store performance information; at least one second core to perform memory operations; and a power controller to receive performance information from at least some of the plurality of counters, determine a workload type executed on the processor based at least in part on the performance information, and based on the workload type dynamically migrate one or more threads from one or more of the plurality of first cores to the at least one second core for execution during a next operation interval. Other embodiments are described and claimed.
-
Publication No.: US20240427842A1
Publication Date: 2024-12-26
Application No.: US18674212
Application Date: 2024-05-24
Applicant: Intel Corporation
Inventor: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data; one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, and retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters; and a register file including a plurality of registers, wherein the one or more blocks of the matrix data are stored within a first set of the plurality of registers.
-
Publication No.: US11494163B2
Publication Date: 2022-11-08
Application No.: US16562979
Application Date: 2019-09-06
Applicant: Intel Corporation
Inventor: Naveen Mellempudi, Dipankar Das, Chunhui Mei, Kristopher Wong, Dhiraj D. Kalamkar, Hong H. Jiang, Subramaniam Maiyuran, Varghese George
Abstract: An apparatus to facilitate a computer number format conversion is disclosed. The apparatus comprises a control unit to receive data format information indicating a first precision data format in which input data is to be received, and converter hardware to receive the input data and convert the first precision data format to a second precision data format based on the data format information.