- Patent Title: Highly performant pipeline parallel deep neural network training
- Application No.: US16024369
- Application Date: 2018-06-29
- Publication No.: US12056604B2
- Publication Date: 2024-08-06
- Inventors: Vivek Seshadri, Amar Phanishayee, Deepak Narayanan, Aaron Harlap, Nikhil Devanur Rangarajan
- Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
- Applicant Address: Redmond, WA, US
- Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee Address: Redmond, WA, US
- Agency: Newport IP, LLC
- Agent: Jacob P. Rohwer
- Main IPC: G06N3/08
- IPC: G06N3/08 ; G06N3/04

Abstract:
Layers of a deep neural network (DNN) are partitioned into stages using a profile of the DNN. Each of the stages includes one or more of the layers of the DNN. The partitioning of the layers of the DNN into stages is optimized in various ways including optimizing the partitioning to minimize training time, to minimize data communication between worker computing devices used to train the DNN, or to ensure that the worker computing devices perform an approximately equal amount of the processing for training the DNN. The stages are assigned to the worker computing devices. The worker computing devices process batches of training data using a scheduling policy that causes the workers to alternate between forward processing of the batches of the DNN training data and backward processing of the batches of the DNN training data. The stages can be configured for model parallel processing or data parallel processing.
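As a rough illustration of the ideas summarized in the abstract, the sketch below shows a profile-driven partitioning of layers into compute-balanced stages and a one-forward-one-backward style schedule for alternating between forward and backward work on batches. The per-layer times, the greedy balancing heuristic, and the schedule generator are illustrative assumptions for this sketch, not the specific method claimed in the patent.

```python
# Illustrative sketch only: profile-based partitioning of DNN layers into
# pipeline stages, plus a simple schedule that alternates forward and
# backward processing of batches. All heuristics here are assumptions.

from typing import List, Tuple


def partition_layers(layer_times_ms: List[float], num_stages: int) -> List[List[int]]:
    """Greedily group consecutive layers into stages so each stage's
    profiled compute time is close to total_time / num_stages."""
    total = sum(layer_times_ms)
    target = total / num_stages
    stages: List[List[int]] = [[]]
    acc = 0.0
    for idx, t in enumerate(layer_times_ms):
        remaining_layers = len(layer_times_ms) - idx
        remaining_stages = num_stages - len(stages) + 1
        # Close the current stage once it reaches the target, keeping enough
        # layers in reserve so every remaining stage receives at least one.
        if (stages[-1] and acc + t > target
                and remaining_layers >= remaining_stages
                and len(stages) < num_stages):
            stages.append([])
            acc = 0.0
        stages[-1].append(idx)
        acc += t
    return stages


def alternate_forward_backward(num_batches: int) -> List[Tuple[str, int]]:
    """Order of work items for a worker that alternates between the forward
    pass of the next batch and the backward pass of the oldest pending batch."""
    schedule: List[Tuple[str, int]] = []
    next_fwd, next_bwd = 0, 0
    while next_bwd < num_batches:
        if next_fwd < num_batches:
            schedule.append(("forward", next_fwd))
            next_fwd += 1
        schedule.append(("backward", next_bwd))
        next_bwd += 1
    return schedule


if __name__ == "__main__":
    times = [3.0, 5.0, 2.0, 4.0, 6.0, 1.0]   # hypothetical per-layer profile (ms)
    print(partition_layers(times, num_stages=3))  # e.g. [[0], [1, 2], [3, 4, 5]]
    print(alternate_forward_backward(num_batches=4))
```

In this sketch the partitioner only balances compute time; the abstract also mentions optimizing for minimal inter-worker communication, which would additionally weigh the size of activations passed between adjacent stages when choosing split points.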