-
Publication No.: US11704565B2
Publication Date: 2023-07-18
Application No.: US17685462
Application Date: 2022-03-03
Applicant: Intel Corporation
Inventor: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
IPC: G06N3/08, G06N3/088, G06F9/50, G06N3/084, G06N3/044, G06N3/045, G06N3/04, G06N3/063, G06N3/048, G06N7/01
CPC classification number: G06N3/08, G06F9/50, G06F9/5061, G06F9/5077, G06N3/04, G06N3/044, G06N3/045, G06N3/063, G06N3/084, G06N3/088, G06N3/048, G06N7/01
Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library. The instructions cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network. The processor can transparently adjust communication paths between worker nodes based on the communication pattern.
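Illustrative sketch (not taken from the patent): the abstract does not say how worker groups are derived from a communication pattern. One common arrangement, assumed here, is hybrid parallelism, where contiguous ranks form model-parallel groups and strided ranks form data-parallel groups that all-reduce gradients. The function name `create_groups` and its parameters are hypothetical.

```python
from typing import Dict, List


def create_groups(num_workers: int, model_group_size: int) -> Dict[str, List[List[int]]]:
    """Partition worker ranks into groups for two assumed communication patterns."""
    if model_group_size <= 0 or num_workers % model_group_size != 0:
        raise ValueError("num_workers must be a positive multiple of model_group_size")

    # Assumption: contiguous ranks share one model replica and exchange activations.
    model_groups = [
        list(range(start, start + model_group_size))
        for start in range(0, num_workers, model_group_size)
    ]

    # Assumption: ranks holding the same model shard all-reduce their gradients.
    data_groups = [
        list(range(offset, num_workers, model_group_size))
        for offset in range(model_group_size)
    ]

    return {"model_parallel": model_groups, "data_parallel": data_groups}


groups = create_groups(num_workers=8, model_group_size=2)
```

With eight workers and a model group size of two, each worker belongs to exactly one group per pattern, so a library could route a message by looking up the group that matches the message's pattern.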
-
Publication No.: US20230177328A1
Publication Date: 2023-06-08
Application No.: US17972832
Application Date: 2022-10-25
Applicant: Intel Corporation
Inventor: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
Abstract: One embodiment provides for a graphics processing unit including a fabric interface configured to transmit gradient data stored in a memory device of the graphics processing unit according to a pre-defined communication operation. The memory device is a physical memory device shared with a compute block of the graphics processing unit and the fabric interface. The fabric interface automatically transmits the gradient data stored in memory to a second distributed training node based on an address of the gradient data in the memory device.
-
Publication No.: US11488008B2
Publication Date: 2022-11-01
Application No.: US15869510
Application Date: 2018-01-12
Applicant: Intel Corporation
Inventor: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
Abstract: One embodiment provides for a system to compute and distribute data for distributed training of a neural network, the system including first memory to store a first set of instructions including a machine learning framework; a fabric interface to enable transmission and receipt of data associated with the set of trainable machine learning parameters; a first set of general-purpose processor cores to execute the first set of instructions, the first set of instructions to provide a training workflow for computation of gradients for the trainable machine learning parameters and to communicate with a second set of instructions, the second set of instructions to facilitate transmission and receipt of the gradients via the fabric interface; and a graphics processor to perform compute operations associated with the training workflow to generate the gradients for the trainable machine learning parameters.
-
Publication No.: US11373266B2
Publication Date: 2022-06-28
Application No.: US15869551
Application Date: 2018-01-12
Applicant: Intel Corporation
Inventor: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan
Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.
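Illustrative sketch (not taken from the patent): the halo idea in this abstract can be shown on a single machine. The code below splits a 2D feature map into a grid of tiles, gives each tile a one-cell halo copied from the neighbouring tiles (standing in for the inter-node exchange), and applies a 3x3 all-ones convolution with zero padding. All names (`box_sum_3x3`, `tile_with_halo`, `distributed_box_sum`) are hypothetical, and the map dimensions are assumed to divide evenly by the tile grid.

```python
from typing import List, Tuple

Grid = List[List[float]]


def box_sum_3x3(x: Grid) -> Grid:
    """3x3 'same' convolution with an all-ones kernel and zero padding."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            out[i][j] = sum(
                x[a][b]
                for a in range(max(i - 1, 0), min(i + 2, h))
                for b in range(max(j - 1, 0), min(j + 2, w))
            )
    return out


def tile_with_halo(x: Grid, r0: int, r1: int, c0: int, c1: int,
                   halo: int = 1) -> Tuple[Grid, int, int]:
    """Return the tile x[r0:r1][c0:c1] extended by halo cells owned by neighbours.

    Copying from the global map stands in for messages between nodes.
    Also returns the row/column offset of the tile inside the extended block.
    """
    h, w = len(x), len(x[0])
    hr0, hr1 = max(r0 - halo, 0), min(r1 + halo, h)
    hc0, hc1 = max(c0 - halo, 0), min(c1 + halo, w)
    extended = [row[hc0:hc1] for row in x[hr0:hr1]]
    return extended, r0 - hr0, c0 - hc0


def distributed_box_sum(x: Grid, rows: int, cols: int) -> Grid:
    """Compute box_sum_3x3 tile by tile on a rows x cols grid of simulated nodes."""
    h, w = len(x), len(x[0])
    if h % rows != 0 or w % cols != 0:
        raise ValueError("feature map must divide evenly into the tile grid")
    th, tw = h // rows, w // cols
    out = [[0.0] * w for _ in range(h)]
    for pr in range(rows):
        for pc in range(cols):
            r0, c0 = pr * th, pc * tw
            extended, off_r, off_c = tile_with_halo(x, r0, r0 + th, c0, c0 + tw)
            local = box_sum_3x3(extended)
            # Keep only the cells this node owns; halo outputs are discarded.
            for i in range(th):
                for j in range(tw):
                    out[r0 + i][c0 + j] = local[off_r + i][off_c + j]
    return out


feature_map = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tiled = distributed_box_sum(feature_map, rows=2, cols=2)
reference = box_sum_3x3(feature_map)
```

Here `tiled` matches `reference` because every owned cell sees the same 3x3 neighbourhood in its extended tile as it does in the full map. A halo of width 1 suffices only for a 3x3 kernel; wider kernels would need a wider halo.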
-