-
Publication No.: US11449338B2
Publication Date: 2022-09-20
Application No.: US16395386
Filing Date: 2019-04-26
Applicant: Graphcore Limited
Inventor: Alan Graham Alexander , Matthew David Fyles
Abstract: A multi-tile processing system has a plurality of tiles each having an execution unit, and an interconnect operable to conduct communications between a group of the tiles according to a bulk synchronous parallel scheme. The execution unit is operable to execute instructions of an instruction set which has a synchronisation instruction for execution by each tile upon completion of its compute phase. The execution of the synchronisation instruction depends on the state of an exception enable flag. In one state, the synchronisation instruction causes the execution unit to send the synchronisation request to hardware logic in the interconnect. In another state of the exception enable flag the synchronisation instruction does not send the synchronisation request, but sets an exception events status to permit interrogation access to the tile. A corresponding method of controlling the debug states of the processing system is provided.
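The flag-dependent behaviour described in this abstract can be illustrated with a minimal sketch. The class `Tile`, its field names, and the choice that a set flag selects the debug path are all assumptions made for illustration; the abstract only says that the two flag states select the two behaviours.

```python
from dataclasses import dataclass, field


@dataclass
class Tile:
    """Toy model of one tile (hypothetical names, not the patent's design)."""
    exception_enable: bool = False   # assumed: True selects the debug path
    exception_event: bool = False    # exception event status
    sent_requests: list = field(default_factory=list)

    def execute_sync(self) -> None:
        """Model the synchronisation instruction at the end of a compute phase."""
        if self.exception_enable:
            # Debug path: no sync request is sent; the exception event
            # status is set so that the tile can be interrogated.
            self.exception_event = True
        else:
            # Normal path: send the sync request towards the interconnect logic.
            self.sent_requests.append("sync_request")


normal = Tile()
normal.execute_sync()

debug = Tile(exception_enable=True)
debug.execute_sync()
```

In this sketch the normal tile ends up with one queued request and no exception event, while the debug tile sends nothing and raises its exception event status.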
-
Publication No.: US10963315B2
Publication Date: 2021-03-30
Application No.: US16276834
Filing Date: 2019-02-15
Applicant: Graphcore Limited
Inventor: David Lacey , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Matthew David Fyles
IPC: G06F9/52 , G06F9/30 , G06F9/38 , G06F9/54 , G06F9/48 , G06F16/901 , G06F15/167 , G06F15/173 , H04L12/801 , H04L29/06 , H04L29/08
Abstract: A system comprising: a first subsystem comprising one or more first processors, and a second subsystem comprising one or more second processors. The second subsystem is configured to process code over a series of steps delineated by barrier synchronizations, and in a current step, to send a descriptor to the first subsystem specifying a value of each of one or more parameters of each of one or more interactions that the second subsystem is programmed to perform with the first subsystem via an inter-processor interconnect in a subsequent step. The first subsystem is configured to execute a portion of code to perform one or more preparatory operations, based on the specified values of at least one of the one or more parameters of each interaction as specified by the descriptor, to prepare for said one or more interactions prior to the barrier synchronization leading into the subsequent phase.
-
Publication No.: US20190121680A1
Publication Date: 2019-04-25
Application No.: US15886065
Filing Date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Stephen Felix , Matthew David Fyles , Richard Luke Southwell Osborne
Abstract: A processing system comprising: a subsystem for acting as a work accelerator to a host processor, the subsystem comprising an arrangement of tiles; and an interconnect for communicating between the tiles and connecting the subsystem to the host. The interconnect comprises synchronization logic to coordinate barrier synchronizations between a group of the tiles. The synchronization logic comprises a host sync proxy module, comprising a counter written with a number of credits by the host processor, and being configured to automatically decrement the number of credits each time one of the barrier synchronizations requiring host involvement is performed. When the number of credits in the counter is exhausted, the barrier is not released until a further write from the host to the host sync proxy module, but when the number of credits in the counter is not exhausted the barrier is released without a separate write from the host.
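The credit mechanism in this abstract can be sketched as a small counter model. The class and method names are hypothetical, and the sketch assumes that a host write overwrites the counter rather than adding to it, which the abstract does not specify.

```python
class HostSyncProxy:
    """Toy model of a credit counter gating barriers that involve the host."""

    def __init__(self) -> None:
        self.credits = 0

    def host_write(self, credits: int) -> None:
        # Assumption: the host write sets the counter to the given value.
        self.credits = credits

    def try_release_barrier(self, needs_host: bool) -> bool:
        """Return True if the barrier may be released now."""
        if not needs_host:
            # Barriers without host involvement are not gated by credits.
            return True
        if self.credits == 0:
            # Exhausted: the barrier stays closed until the host writes again.
            return False
        # Credits remain: decrement and release without a further host write.
        self.credits -= 1
        return True


proxy = HostSyncProxy()
proxy.host_write(2)
results = [proxy.try_release_barrier(needs_host=True) for _ in range(3)]
proxy.host_write(1)
after_refill = proxy.try_release_barrier(needs_host=True)
```

With two credits, the first two host-involving barriers release and the third is held until the host writes a further credit.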
-
Publication No.: US11023290B2
Publication Date: 2021-06-01
Application No.: US15885972
Filing Date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Simon Christian Knowles , Matthew David Fyles , Alan Graham Alexander , Stephen Felix
Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgement being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
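A rough sketch of mode-selected barrier groups, as described in this abstract, follows. The mode names `"internal"` and `"external"`, the tile numbering, and the `SyncLogic` class are illustrative assumptions only; the abstract states merely that the operand selects one of several modes, each with a different group membership.

```python
class SyncLogic:
    """Toy synchronization logic: one pending-request set per mode."""

    def __init__(self, groups: dict) -> None:
        # Each mode names a different membership of the sync group.
        self.groups = {mode: frozenset(tiles) for mode, tiles in groups.items()}
        self.pending = {mode: set() for mode in groups}

    def sync_request(self, tile: int, mode: str) -> list:
        """Record a request; return the tiles acknowledged (empty if waiting)."""
        group = self.groups[mode]
        if tile not in group:
            raise ValueError(f"tile {tile} is not a member of mode {mode!r}")
        self.pending[mode].add(tile)
        if self.pending[mode] == group:
            # Every tile in the selected group has requested: acknowledge all.
            self.pending[mode].clear()
            return sorted(group)
        return []


logic = SyncLogic({"internal": {0, 1}, "external": {0, 1, 2, 3}})
first = logic.sync_request(0, "internal")
second = logic.sync_request(1, "internal")
```

Here the first request leaves tile 0 waiting, and the second completes the `"internal"` group so both tiles are acknowledged.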
-
Publication No.: US10949266B2
Publication Date: 2021-03-16
Application No.: US16538980
Filing Date: 2019-08-13
Applicant: Graphcore Limited
Inventor: David Lacey , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Matthew David Fyles
IPC: G06F16/901 , G06F9/30 , G06F15/167 , G06F15/173 , G06F9/48 , G06F9/52 , G06F9/38 , G06F9/54 , H04L12/801 , H04L29/06 , H04L29/08
Abstract: A system comprising: a first subsystem comprising one or more first processors, and a second subsystem comprising one or more second processors. The second subsystem is configured to process code over a series of steps delineated by barrier synchronizations, and in a current step, to send a descriptor to the first subsystem specifying a value of each of one or more parameters of each of one or more interactions that the second subsystem is programmed to perform with the first subsystem via an inter-processor interconnect in a subsequent step. The first subsystem is configured to execute a portion of code to perform one or more preparatory operations, based on the specified values of at least one of the one or more parameters of each interaction as specified by the descriptor, to prepare for said one or more interactions prior to the barrier synchronization leading into the subsequent phase.
-
Publication No.: US20200014631A1
Publication Date: 2020-01-09
Application No.: US16235265
Filing Date: 2018-12-28
Applicant: Graphcore Limited
Inventor: Ola Tørudbakken , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Stephen Felix , Matthew David Fyles , Brian Manula , Harald Høeg
IPC: H04L12/801 , H04L29/06 , G06F15/173 , G06F15/167 , H04L29/08
Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host, the gateway enabling the transfer of batches of data to and from the subsystem at pre-compiled data exchange synchronisation points attained by the subsystem. The gateway is configured to: receive from a storage system data determined by the host to be processed by the subsystem; store a number of credits indicating the availability of data for transfer to the subsystem at each pre-compiled data exchange synchronisation point; receive a synchronisation request from the subsystem when it attains a data exchange synchronisation point; and in response to determining that the number of credits comprises a non-zero number of credits: transmit a synchronisation acknowledgment to the subsystem; and cause the received data to be transferred to the subsystem.
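The credit check performed by the gateway at each synchronisation point might look roughly like the sketch below. The `Gateway` class, the one-credit-per-batch accounting, and the decrement on transfer are assumptions for illustration; the abstract states only that a non-zero credit count leads to an acknowledgment and a data transfer.

```python
from collections import deque


class Gateway:
    """Toy gateway: buffers batches from storage and gates sync on credits."""

    def __init__(self) -> None:
        self.batches = deque()
        self.credits = 0

    def receive_from_storage(self, batch) -> None:
        # Assumption: each buffered batch makes one credit available.
        self.batches.append(batch)
        self.credits += 1

    def on_sync_request(self):
        """Return (ack, batch); ack is False while no credits are available."""
        if self.credits == 0:
            return (False, None)
        # Assumption: serving one synchronisation point consumes one credit.
        self.credits -= 1
        return (True, self.batches.popleft())


gw = Gateway()
no_data = gw.on_sync_request()
gw.receive_from_storage([1, 2, 3])
with_data = gw.on_sync_request()
```

In this model a synchronisation request that arrives before any data is available is not acknowledged, while one that arrives afterwards is acknowledged and receives the buffered batch.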
-
Publication No.: US20190121785A1
Publication Date: 2019-04-25
Application No.: US15886185
Filing Date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Matthew David Fyles , Alan Graham Alexander , Stephen Felix
IPC: G06F15/80 , G06F9/52 , G06F15/173
Abstract: A processing system comprising an arrangement of tiles and synchronization logic in the form of hardware logic for coordinating between a group of some or all of said tiles. The instruction set comprises a synchronization instruction which causes an instance of a synchronization request to be transmitted from the respective tile to the synchronization logic, and suspends instruction issue on the respective tile pending a synchronization acknowledgement. In response to receiving an instance of the synchronization request from all of the tiles of the group, the synchronization logic returns the synchronization acknowledgment back to each of the tiles in the group to allow the instruction issue to resume. The instruction set further comprises an abstain instruction, which sends an instance of the synchronization request but does not suspend instruction issue on the respective tile pending the synchronization acknowledgement, instead allowing the instruction issue on the respective tile to continue.
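The difference between the synchronization instruction and the abstain instruction can be illustrated with a small model. The `Barrier` class and its method names are hypothetical; the sketch tracks only which tiles have sent a request and which have suspended instruction issue.

```python
class Barrier:
    """Toy model contrasting a suspending sync with a non-suspending abstain."""

    def __init__(self, tile_ids) -> None:
        self.group = set(tile_ids)
        self.requested = set()   # tiles whose sync request has arrived
        self.suspended = set()   # tiles with instruction issue suspended

    def _request(self, tile: int) -> bool:
        self.requested.add(tile)
        if self.requested == self.group:
            # All requests received: the acknowledgment resumes every tile.
            self.requested.clear()
            self.suspended.clear()
            return True
        return False

    def sync(self, tile: int) -> bool:
        # Sync: send a request and suspend issue until the acknowledgment.
        self.suspended.add(tile)
        return self._request(tile)

    def abstain(self, tile: int) -> bool:
        # Abstain: send a request but keep issuing instructions.
        return self._request(tile)


barrier = Barrier([0, 1, 2])
barrier.abstain(2)
barrier.sync(0)
waiting = set(barrier.suspended)
released = barrier.sync(1)
```

Tile 2 counts towards the barrier without ever being suspended, so only tile 0 is waiting when tile 1 arrives and completes the group.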
-
Publication No.: US10970131B2
Publication Date: 2021-04-06
Application No.: US16235265
Filing Date: 2018-12-28
Applicant: Graphcore Limited
Inventor: Ola Tørudbakken , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Stephen Felix , Matthew David Fyles , Brian Manula , Harald Høeg
IPC: G06F9/52 , G06F16/901 , G06F9/30 , G06F9/38 , G06F9/54 , G06F15/167 , G06F15/173 , H04L12/801 , H04L29/06 , H04L29/08 , G06F9/48
Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host, the gateway enabling the transfer of batches of data to and from the subsystem at pre-compiled data exchange synchronisation points attained by the subsystem. The gateway is configured to: receive from a storage system data determined by the host to be processed by the subsystem; store a number of credits indicating the availability of data for transfer to the subsystem at each pre-compiled data exchange synchronisation point; receive a synchronisation request from the subsystem when it attains a data exchange synchronisation point; and in response to determining that the number of credits comprises a non-zero number of credits: transmit a synchronisation acknowledgment to the subsystem; and cause the received data to be transferred to the subsystem.
-
Publication No.: US20200012536A1
Publication Date: 2020-01-09
Application No.: US16276834
Filing Date: 2019-02-15
Applicant: Graphcore Limited
Inventor: David Lacey , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Matthew David Fyles
IPC: G06F9/52 , G06F9/38 , G06F9/30 , G06F9/54 , G06F16/901
Abstract: A system comprising: a first subsystem comprising one or more first processors, and a second subsystem comprising one or more second processors. The second subsystem is configured to process code over a series of steps delineated by barrier synchronizations, and in a current step, to send a descriptor to the first subsystem specifying a value of each of one or more parameters of each of one or more interactions that the second subsystem is programmed to perform with the first subsystem via an inter-processor interconnect in a subsequent step. The first subsystem is configured to execute a portion of code to perform one or more preparatory operations, based on the specified values of at least one of the one or more parameters of each interaction as specified by the descriptor, to prepare for said one or more interactions prior to the barrier synchronization leading into the subsequent phase.
-
Publication No.: US11586483B2
Publication Date: 2023-02-21
Application No.: US17320904
Filing Date: 2021-05-14
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Simon Christian Knowles , Matthew David Fyles , Alan Graham Alexander , Stephen Felix
Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgement being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.