-
公开(公告)号:US11709794B2
公开(公告)日:2023-07-25
申请号:US17451372
申请日:2021-10-19
Applicant: Graphcore Limited
Inventor: Stephen Felix , Richard Luke Southwell Osborne , Alan Graham Alexander
Abstract: Two or more die are stacked together in a stacked integrated circuit device. Each of the processors on these die is able to communicate with other processors on its die by sending data over the switching fabric of its respective die. The mechanism for sending data between processors on the same die (i.e. intradie communication) is reused for sending data between processors on different die (i.e. interdie communication). The reuse of the mechanism is enabled by assigning each processor a vertical neighbour on its opposing die. Each processor has an interdie connection that connects it to the output exchange bus of its neighbour. A processor is able to borrow the output exchange bus of its neighbour by sending data along the output exchange bus of its neighbour.
-
公开(公告)号:US11561926B2
公开(公告)日:2023-01-24
申请号:US17648517
申请日:2022-01-20
Applicant: Graphcore Limited
Inventor: Stephen Felix , Simon Christian Knowles
Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires: a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
-
公开(公告)号:US11462293B2
公开(公告)日:2022-10-04
申请号:US17443061
申请日:2021-07-20
Applicant: Graphcore Limited
Inventor: Graham Bernard Cunningham , Stephen Felix
Abstract: A memory controller is provided for reading and writing to and from a memory module. The memory controller implements an error correction algorithm, which calculates error correction code for message data to be written to the memory module and checks the error correction code against the message data when the data is read out of the memory module. The memory controller spreads each codeword over at least four different beats sent over the interface with the memory module, with each beat comprising a symbol of error correction code. Bits of a particular symbol of message data occupy the same positions in different beats. Since the bits of the symbols occupy the same positions in different beat, the number of bits affected by a hardware error is minimised. With four symbols of error correction code available for use in the codeword.
-
公开(公告)号:US11119559B2
公开(公告)日:2021-09-14
申请号:US16428797
申请日:2019-05-31
Applicant: Graphcore Limited
Inventor: Stephen Felix , Mrudula Gore
IPC: G06F1/324 , G06F1/06 , G06F1/08 , G06F1/3206
Abstract: There is disclosed a method of controlling the frequency of a clock signal in a processor. The method selects a first clock generator to provide a processor clock signal for executing an application. If a threshold event is detected, a second clock generator is selected. The method reduces the frequency of a clock signal generated by the first clock generator while a processor clock signal is being provided for execution of an application from the second clock generator. The second clock generator generates a clock at a lower speed than the first clock generator. After a predetermined time the first clock generator is reselected to provide the processor clock signal. The threshold detection is repeated until an optimum clock frequency is discovered.
-
公开(公告)号:US11023290B2
公开(公告)日:2021-06-01
申请号:US15885972
申请日:2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Simon Christian Knowles , Matthew David Fyles , Alan Graham Alexander , Stephen Felix
Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction cause a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgement being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
-
公开(公告)号:US10963003B2
公开(公告)日:2021-03-30
申请号:US16165978
申请日:2018-10-19
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Alan Graham Alexander , Stephen Felix , Jonathan Mangnall , David Lacey
Abstract: The invention relates to a computer comprising: a plurality of processing units each having instruction storage holding a local program, an execution unit executing the local program, data storage for holding data; an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by each processing unit; a synchronisation module operable to generate a synchronisation signal to control the computer to switch between a compute phase and an exchange phase, wherein the processing units are configured to execute their local programs according to a common clock, the local programs being such that in the exchange phase at least one processing unit executes a send instruction from its local program to transmit at a transmit time a data packet onto its output set of connection wires, the data packet being destined for at least one recipient processing unit but having no destination identifier, and at a predetermined switch time the recipient processing unit executes a switch control instruction from its local program to control its switching circuitry to connect its input set of wires to the switching fabric to receive the data packet at a receive time, the transmit time and, switch time and receive time being governed by the common clock with respect to the synchronisation signal.
-
公开(公告)号:US10802536B2
公开(公告)日:2020-10-13
申请号:US15886053
申请日:2018-02-01
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Alan Graham Alexander , Stephen Felix , Jonathan Mangnall , David Lacey
Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal a send instruction to transmit at least one data packet at a predetermined transmit time, relative to the synchronisation signal, destined for a recipient processing unit but having no destination identifier, and a local program allocated to the recipient processing unit is scheduled to execute at a predetermined switch time a switch control instruction to control the switching circuitry to connect its processing unit wire to the switching fabric to receive the data packet at a receive time.
-
公开(公告)号:US20200014631A1
公开(公告)日:2020-01-09
申请号:US16235265
申请日:2018-12-28
Applicant: Graphcore Limited
Inventor: Ola Tørudbakken , Daniel John Pelham Wikinson , Richard Luke Sothwell Osborne , Stephen Felix , Matthew David Fyles , Brian Manula , Harald Høeg
IPC: H04L12/801 , H04L29/06 , G06F15/173 , G06F15/167 , H04L29/08
Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host, the gateway enabling the transfer of batches of data to and from the subsystem at pre-compiled data exchange synchronisation points attained by the subsystem. The gateway is configured to: receive from a storage system data determined by the host to be processed by the subsystem; store a number of credits indicating the availability of data for transfer to the subsystem at each pre-compiled data exchange synchronisation point; receive a synchronisation request from the subsystem when it attains a data exchange synchronisation point; and in response to determining that the number of credits comprises a non-zero number of credits: transmit a synchronisation acknowledgment to the subsystem; and cause the received data to be transferred to the subsystem.
-
公开(公告)号:US20190310963A1
公开(公告)日:2019-10-10
申请号:US16451128
申请日:2019-06-25
Applicant: Graphcore Limited
Inventor: Stephen Felix , Jonathan Mangnall
IPC: G06F15/173 , G06F1/32 , G06F15/80
Abstract: An indication of a direction of transmission over the switching fabric is inserted into a data packet that is transmitted from a tile. The indication of direction may indicate directions from the transmitting tile in which intended recipient tiles are present. The switching fabric prevents (e.g. by blocking the data packet at one of a series of latches) the transmission in a direction not indicated in the data packet. Hence, power saving may be achieved, by preventing the unnecessary transmission of data packets over parts of the switching fabric.
-
公开(公告)号:US20190121785A1
公开(公告)日:2019-04-25
申请号:US15886185
申请日:2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Matthew David Fyles , Alan Graham Alexander , Stephen Felix
IPC: G06F15/80 , G06F9/52 , G06F15/173
Abstract: A processing system comprising an arrangement of tiles and synchronization logic in the form of hardware logic for coordinating between a group of some or all of said tiles. The instruction set comprises a synchronization instruction which causes an instance of a synchronization request to be transmitted from the respective tile to the synchronization logic, and suspends instruction issue on the respective tile pending a synchronization acknowledgement. In response to receiving an instance of the synchronization request from all of the tiles of the group, the synchronization logic returns the synchronization acknowledgment back to each of the tiles in the group to allow the instruction issue to resume. The instruction set further comprises an abstain instruction, which sends an instance of the synchronization request but does not suspend instruction issue on the respective tile pending the synchronization acknowledgement, instead allowing the instruction issue on the respective tile to continue.
-
-
-
-
-
-
-
-
-