-
Publication number: US20220129364A1
Publication date: 2022-04-28
Application number: US17571373
Filing date: 2022-01-07
Applicant: Google LLC
Inventor: Thomas Norrie, Naveen Kumar
Abstract: A computer-implemented method includes monitoring execution of program code by first and second processor components. A computing system detects that a trigger condition is satisfied by: i) identifying an operand in a portion of the program code; or ii) determining that a current time of a clock of the computing system indicates a predefined time value. The operand and the predefined time value are used to initiate trace events. When the trigger condition is satisfied, the system initiates trace events that generate trace data identifying respective hardware events occurring across the computing system. The system uses the trace data to generate a correlated set of trace data. The correlated trace data indicates a time-ordered sequence of the respective hardware events. The system uses the correlated set of trace data to analyze performance of the executing program code.
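The two trigger conditions and the correlation step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `TraceEvent`, `trigger_satisfied`, and `correlate` are hypothetical, and treating "indicates a predefined time value" as "clock has reached that value" is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class TraceEvent:
    """Hypothetical record for one traced hardware event."""
    timestamp: int    # clock value when the event occurred
    component: str    # processor component that produced the event
    description: str  # free-form description of the event


def trigger_satisfied(operands: Sequence[str],
                      current_time: int,
                      trigger_operand: Optional[str] = None,
                      trigger_time: Optional[int] = None) -> bool:
    """Return True if either trigger condition holds.

    Condition i): the trigger operand appears in the inspected code portion.
    Condition ii): the clock has reached the predefined time (assumed to
    mean current_time >= trigger_time).
    """
    operand_hit = trigger_operand is not None and trigger_operand in operands
    time_hit = trigger_time is not None and current_time >= trigger_time
    return operand_hit or time_hit


def correlate(*per_component_events: List[TraceEvent]) -> List[TraceEvent]:
    """Merge per-component trace data into one time-ordered sequence."""
    merged = [event for events in per_component_events for event in events]
    return sorted(merged, key=lambda event: event.timestamp)
```

In this sketch, a caller would check `trigger_satisfied` before collecting events from each component and then pass the collected lists to `correlate` to obtain a single ordered view.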
-
Publication number: US11232012B2
Publication date: 2022-01-25
Application number: US16520558
Filing date: 2019-07-24
Applicant: Google LLC
Inventor: Thomas Norrie, Naveen Kumar
Abstract: A computer-implemented method includes monitoring execution of program code by first and second processor components. A computing system detects that a trigger condition is satisfied by: i) identifying an operand in a portion of the program code; or ii) determining that a current time of a clock of the computing system indicates a predefined time value. The operand and the predefined time value are used to initiate trace events. When the trigger condition is satisfied, the system initiates trace events that generate trace data identifying respective hardware events occurring across the computing system. The system uses the trace data to generate a correlated set of trace data. The correlated trace data indicates a time-ordered sequence of the respective hardware events. The system uses the correlated set of trace data to analyze performance of the executing program code.
-
Publication number: US20200279163A1
Publication date: 2020-09-03
Application number: US16878720
Filing date: 2020-05-20
Applicant: Google LLC
Inventor: Samuel Bengio, Mohammad Norouzi, Benoit Steiner, Jeffrey Adgate Dean, Hieu Hy Pham, Azalia Mirhoseini, Quoc V. Le, Naveen Kumar, Yuefeng Zhou, Rasmus Munk Larsen
Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective operations necessary to perform the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network to generate a network output that defines a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the multiple hardware devices by placing the operations on the multiple devices according to the placement defined by the network output.
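The data flow in this abstract (operation embeddings in, one device assignment per operation out) can be illustrated with a toy recurrent network in plain Python. This is only a sketch under strong assumptions: the abstract does not specify the network architecture, so a simple tanh recurrent cell is swapped in, the weights are random and untrained, and `place_operations` and its parameters are hypothetical names.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Stand-in for the network parameter values; random, not trained."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def place_operations(op_embeddings: List[Vector], num_devices: int,
                     hidden_size: int = 8, seed: int = 0) -> List[int]:
    """Map each operation embedding to a device index with a toy RNN."""
    if not op_embeddings:
        return []
    rng = random.Random(seed)
    embed_size = len(op_embeddings[0])
    w_in = random_matrix(hidden_size, embed_size, rng)
    w_rec = random_matrix(hidden_size, hidden_size, rng)
    w_out = random_matrix(num_devices, hidden_size, rng)

    hidden = [0.0] * hidden_size
    placement = []
    for embedding in op_embeddings:
        # The recurrent step mixes the current embedding with the running state.
        pre_activation = [a + b for a, b in
                          zip(matvec(w_in, embedding), matvec(w_rec, hidden))]
        hidden = [math.tanh(x) for x in pre_activation]
        # One score per device; the highest-scoring device gets the operation.
        logits = matvec(w_out, hidden)
        placement.append(max(range(num_devices), key=lambda d: logits[d]))
    return placement
```

The returned list has one device index per operation, which a scheduler could then use to assign operations to devices. With untrained weights the assignments carry no performance meaning; the sketch only shows the shape of the input and output.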
-
Publication number: US20190303761A1
Publication date: 2019-10-03
Application number: US16445330
Filing date: 2019-06-19
Applicant: Google LLC
Inventor: Samy Bengio, Mohammad Edward Norouzi, Benoit Steiner, Jeffrey Adgate Dean, Hieu Hy Pham, Azalia Mirhoseini, Quoc V. Le, Naveen Kumar, Yuefeng Zhou, Rasmus Munk Larsen
Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective operations necessary to perform the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network to generate a network output that defines a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the multiple hardware devices by placing the operations on the multiple devices according to the placement defined by the network output.
-
Publication number: US20180285226A1
Publication date: 2018-10-04
Application number: US15875160
Filing date: 2018-01-19
Applicant: Google LLC
Inventor: Thomas Norrie, Naveen Kumar
CPC classification number: G06F11/302, G06F9/542, G06F11/3072, G06F11/3075, G06F11/3476, G06F11/3495, G06F11/3636, G06F17/30044, G06F2201/86, G06F2201/865
Abstract: A computer-implemented method, executed by one or more processors, includes monitoring execution of program code executed by a first processor component and monitoring execution of program code executed by a second processor component. A computing system stores data identifying hardware events in a memory buffer. The stored events occur across processor units that include at least the first and second processor components. The hardware events each include an event time stamp and metadata characterizing the event. The system generates a data structure identifying the hardware events. The data structure arranges the events in a time-ordered sequence and associates events with at least the first or second processor components. The system stores the data structure in a memory bank of a host device and uses the data structure to analyze performance of the program code executed by the first or second processor components.
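One way to picture the data structure this abstract describes is a timeline that keeps the buffered events in time order and also indexes them by processor component. The Python sketch below is illustrative only; `HardwareEvent`, `Timeline`, and `build_timeline` are hypothetical names, and the field layout is an assumption rather than anything stated in the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class HardwareEvent:
    """Hypothetical buffered event: time stamp plus characterizing metadata."""
    timestamp: int
    component: str  # e.g. the first or second processor component
    metadata: str


@dataclass
class Timeline:
    """Events in time order, plus a per-component view of the same events."""
    ordered: List[HardwareEvent] = field(default_factory=list)
    by_component: Dict[str, List[HardwareEvent]] = field(default_factory=dict)


def build_timeline(buffer: List[HardwareEvent]) -> Timeline:
    """Arrange buffered events by time stamp and group them by component."""
    ordered = sorted(buffer, key=lambda event: event.timestamp)
    by_component = defaultdict(list)
    for event in ordered:
        # Each per-component list inherits the global time order.
        by_component[event.component].append(event)
    return Timeline(ordered=ordered, by_component=dict(by_component))
```

Keeping both views lets an analysis step walk the global sequence to see interleaving across components, or inspect a single component's events in isolation.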
-
Publication number: US11650895B2
Publication date: 2023-05-16
Application number: US17240838
Filing date: 2021-04-26
Applicant: Google LLC
Inventor: Thomas Norrie, Naveen Kumar
CPC classification number: G06F11/302, G06F9/542, G06F11/3072, G06F11/3075, G06F11/3476, G06F11/3495, G06F11/3636, G06F16/489, G06F2201/86, G06F2201/865
Abstract: A computer-implemented method, executed by one or more processors, includes monitoring execution of program code executed by a first processor component and monitoring execution of program code executed by a second processor component. A computing system stores data identifying hardware events in a memory buffer. The stored events occur across processor units that include at least the first and second processor components. The hardware events each include an event time stamp and metadata characterizing the event. The system generates a data structure identifying the hardware events. The data structure arranges the events in a time-ordered sequence and associates events with at least the first or second processor components. The system stores the data structure in a memory bank of a host device and uses the data structure to analyze performance of the program code executed by the first or second processor components.
-
Publication number: US20210248052A1
Publication date: 2021-08-12
Application number: US17240838
Filing date: 2021-04-26
Applicant: Google LLC
Inventor: Thomas Norrie, Naveen Kumar
Abstract: A computer-implemented method, executed by one or more processors, includes monitoring execution of program code executed by a first processor component and monitoring execution of program code executed by a second processor component. A computing system stores data identifying hardware events in a memory buffer. The stored events occur across processor units that include at least the first and second processor components. The hardware events each include an event time stamp and metadata characterizing the event. The system generates a data structure identifying the hardware events. The data structure arranges the events in a time-ordered sequence and associates events with at least the first or second processor components. The system stores the data structure in a memory bank of a host device and uses the data structure to analyze performance of the program code executed by the first or second processor components.
-
Publication number: US10990494B2
Publication date: 2021-04-27
Application number: US16665355
Filing date: 2019-10-28
Applicant: Google LLC
Inventor: Thomas Norrie, Naveen Kumar
Abstract: A computer-implemented method, executed by one or more processors, includes monitoring execution of program code executed by a first processor component and monitoring execution of program code executed by a second processor component. A computing system stores data identifying hardware events in a memory buffer. The stored events occur across processor units that include at least the first and second processor components. The hardware events each include an event time stamp and metadata characterizing the event. The system generates a data structure identifying the hardware events. The data structure arranges the events in a time-ordered sequence and associates events with at least the first or second processor components. The system stores the data structure in a memory bank of a host device and uses the data structure to analyze performance of the program code executed by the first or second processor components.
-
Publication number: US10692003B2
Publication date: 2020-06-23
Application number: US16445330
Filing date: 2019-06-19
Applicant: Google LLC
Inventor: Samuel Bengio, Mohammad Norouzi, Benoit Steiner, Jeffrey Adgate Dean, Hieu Hy Pham, Azalia Mirhoseini, Quoc V. Le, Naveen Kumar, Yuefeng Zhou, Rasmus Munk Larsen
Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective operations necessary to perform the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network to generate a network output that defines a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the multiple hardware devices by placing the operations on the multiple devices according to the placement defined by the network output.
-
Publication number: US20190332509A1
Publication date: 2019-10-31
Application number: US16411569
Filing date: 2019-05-14
Applicant: Google LLC
Inventor: Thomas Norrie, Naveen Kumar
Abstract: A computer-implemented method, executed by one or more processors, includes monitoring execution of program code executed by a first processor component and monitoring execution of program code executed by a second processor component. A computing system stores data identifying hardware events in a memory buffer. The stored events occur across processor units that include at least the first and second processor components. The hardware events each include an event time stamp and metadata characterizing the event. The system generates a data structure identifying the hardware events. The data structure arranges the events in a time-ordered sequence and associates events with at least the first or second processor components. The system stores the data structure in a memory bank of a host device and uses the data structure to analyze performance of the program code executed by the first or second processor components.