Invention Grant
- Patent Title: Distributed training and prediction using elastic resources
- Application No.: US15785074
- Application Date: 2017-10-16
- Publication No.: US11003992B2
- Publication Date: 2021-05-11
- Inventor: Lukasz Wesolowski, Mohamed Fawzi Mokhtar Abd El Aziz, Aditya Rajkumar Kalro, Hongzhong Jia, Jay Parikh
- Applicant: Facebook, Inc.
- Applicant Address: US CA Menlo Park
- Assignee: Facebook, Inc.
- Current Assignee: Facebook, Inc.
- Current Assignee Address: US CA Menlo Park
- Agency: Baker Botts L.L.P.
- Main IPC: G06N3/08
- IPC: G06N3/08; G06N3/10

Abstract:
In one embodiment, a method includes establishing access to first and second different computing systems. A machine learning model is assigned for training to the first computing system, and the first computing system creates a check-point during training in response to a first predefined triggering event. The check-point may be a record of an execution state in the training of the machine learning model by the first computing system. In response to a second predefined triggering event, the training of the machine learning model on the first computing system is halted, and in response to a third predefined triggering event, the training of the machine learning model is transferred to the second computing system, which continues training the machine learning model starting from the execution state recorded by the check-point.
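The checkpoint-and-transfer flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation; every name here (`save_checkpoint`, `load_checkpoint`, `train`, `run_demo`, the `halt_at` parameter, the JSON file format) is hypothetical. The sketch assumes that the first triggering event is a fixed step interval, that the second is a halt signal at a given step (for example, the resources being reclaimed), and that the third is simply the second system becoming available, simulated here by a second call to `train`. The "model" is a single weight fitted by gradient descent on a toy objective.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Record the execution state; write to a temp file, then swap it in."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # readers never see a half-written checkpoint


def load_checkpoint(path):
    """Read back the most recently recorded execution state."""
    with open(path) as f:
        return json.load(f)


def train(state, total_steps, checkpoint_every, path, halt_at=None):
    """Toy training loop minimizing (w - 3)^2 with a fixed learning rate.

    checkpoint_every stands in for the first triggering event and
    halt_at for the second; both are assumptions made for illustration.
    """
    state = dict(state)  # do not mutate the caller's state
    while state["step"] < total_steps:
        if halt_at is not None and state["step"] >= halt_at:
            break  # halt training on this system
        grad = 2.0 * (state["weight"] - 3.0)
        state["weight"] -= 0.1 * grad
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state


def run_demo():
    initial = {"step": 0, "weight": 0.0}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ckpt.json")
        # "First system": halted at step 25; last checkpoint is at step 20.
        train(initial, 50, 10, path, halt_at=25)
        # "Second system": resume from the recorded execution state.
        restored = load_checkpoint(path)
        resumed = train(restored, 50, 10, path)
        # Uninterrupted run for comparison.
        reference = train(initial, 50, 10, os.path.join(d, "ref.json"))
    return restored, resumed, reference
```

In this sketch the work done between the last checkpoint (step 20) and the halt (step 25) is lost and recomputed after the transfer, which is the usual trade-off when choosing a checkpoint interval. Because the toy loop is deterministic, the resumed run ends in the same state as the uninterrupted one.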
Public/Granted literature
- US20190114537A1 DISTRIBUTED TRAINING AND PREDICTION USING ELASTIC RESOURCES Public/Granted day: 2019-04-18