Invention Grant
- Patent Title: Elastic management of machine learning computing
- Application No.: US15951088
- Application Date: 2018-04-11
- Publication No.: US10649806B2
- Publication Date: 2020-05-12
- Inventor: Aurick Qiao, Qirong Ho, Eric Xing
- Applicant: Petuum Inc.
- Applicant Address: US PA Pittsburgh
- Assignee: PETUUM, INC.
- Current Assignee: PETUUM, INC.
- Current Assignee Address: US PA Pittsburgh
- Main IPC: G06F9/48
- IPC: G06F9/48; G06F9/50; G06F9/28; G06N20/00

Abstract:
A computer system implements a method for elastic resource management when executing a machine learning (ML) program. The system is configured to: create a set of logical executors and assign them across a set of networked physical computation units of a distributed computing system; partition and distribute input data and Work Tasks across the set of logical executors, where the Work Tasks are partitioned into short units of computation (micro-tasks), each of which calculates a partial update to the ML program's model parameters and lasts for less than one second; create a set of logical servers (LSes); partition and distribute globally shared model parameters of the ML program across the set of logical servers; and execute the partitioned Work Tasks according to a bounded asynchronous parallel standard, where a current Work Task is allowed to execute with stale model parameters, without having all the current calculation updates from the Work Tasks it depends on, provided the staleness of the model parameters is within a predefined limit.
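The bounded asynchronous parallel condition in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `assign_round_robin` and `StalenessGate`, the round-robin placement policy, and the use of a per-executor micro-task counter as the measure of staleness are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


def assign_round_robin(logical_ids, physical_units):
    """Hypothetical placement: spread logical executors over physical units."""
    return {
        lid: physical_units[i % len(physical_units)]
        for i, lid in enumerate(logical_ids)
    }


@dataclass
class StalenessGate:
    """Hypothetical helper: per-executor clocks plus a staleness bound.

    Assumption: staleness is measured as how many micro-tasks an executor
    has completed beyond the slowest executor.
    """
    staleness_limit: int
    clocks: dict = field(default_factory=dict)

    def register(self, executor_id):
        # Every logical executor starts at clock 0.
        self.clocks[executor_id] = 0

    def can_run(self, executor_id):
        # The executor may proceed with possibly stale parameters only if
        # it is no more than staleness_limit micro-tasks ahead of the slowest.
        slowest = min(self.clocks.values())
        return self.clocks[executor_id] - slowest <= self.staleness_limit

    def finish_micro_task(self, executor_id):
        # Completing one micro-task advances that executor's clock by one.
        self.clocks[executor_id] += 1


if __name__ == "__main__":
    placement = assign_round_robin(["le0", "le1", "le2"], ["node-a", "node-b"])
    print(placement)

    gate = StalenessGate(staleness_limit=2)
    for lid in placement:
        gate.register(lid)
    for _ in range(3):
        gate.finish_micro_task("le0")
    # le0 is now 3 micro-tasks ahead of the slowest executor, so it must wait.
    print(gate.can_run("le0"), gate.can_run("le1"))
```

In this sketch a fast executor blocks only when it runs more than `staleness_limit` micro-tasks ahead, which is one way to read the "predefined limit" on staleness; the patent itself may define and enforce the bound differently.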
Public/Granted literature
- US20180300171A1: Elastic Management of Machine Learning Computing. Public/Granted day: 2018-10-18