-
Publication No.: US11449309B2
Publication date: 2022-09-20
Application No.: US16646507
Application date: 2019-06-21
Applicant: GRAPHCORE LIMITED
Inventor: Stephen Felix, Mrudula Gore
Abstract: A hardware module comprising circuitry configured to: store a sequence of n bits in a register of the hardware module; generate a signed integer comprising a magnitude component and a sign bit by: if the most significant bit of the sequence of n bits is equal to one: set each of the n−1 of the most significant bits of the magnitude component to be equal to the corresponding bit of the n−1 least significant bits of the sequence of n bits; and set the sign bit to be zero; if the most significant bit of the sequence of n bits is equal to zero: set each of the n−1 of the most significant bits of the magnitude component to be equal to the inverse of the corresponding bit of the n−1 least significant bits of the sequence of n bits; and set the sign bit to be one.
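The bit mapping in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented circuit: the function name `to_sign_magnitude` is hypothetical, and the magnitude is assumed to be exactly n−1 bits wide (the abstract only says how the n−1 most significant magnitude bits are set, so a wider magnitude is possible).

```python
from typing import Tuple


def to_sign_magnitude(bits: int, n: int) -> Tuple[int, int]:
    """Map an n-bit sequence to (sign_bit, magnitude) as the abstract describes.

    Assumption: the magnitude is exactly n-1 bits wide.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 0 <= bits < (1 << n):
        raise ValueError("bits must fit in n bits")

    mask = (1 << (n - 1)) - 1      # selects the n-1 least significant bits
    low = bits & mask
    msb = bits >> (n - 1)

    if msb == 1:
        # MSB is one: copy the low bits, sign bit is zero.
        return 0, low
    # MSB is zero: invert the low bits, sign bit is one.
    return 1, (~low) & mask
```

For example, with n = 4 the sequence 0b1011 maps to sign 0 and magnitude 3, while 0b0011 maps to sign 1 and magnitude 4.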
-
Publication No.: US20220197857A1
Publication date: 2022-06-23
Application No.: US17648517
Application date: 2022-01-20
Applicant: Graphcore Limited
Inventor: Stephen Felix, Simon Christian Knowles
Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
-
Publication No.: US11294635B2
Publication date: 2022-04-05
Application No.: US16395434
Application date: 2019-04-26
Applicant: Graphcore Limited
Inventor: Stephen Felix, James William Hanlon
Abstract: A pseudo random number generator implemented in hardware. The pseudo random number generator comprises a state post processing circuit for processing two state values to produce a random number. The circuit has first combinatorial logic comprising a XOR or XNOR gate configured to process a first pair of bits from the state values, second combinatorial logic comprising an OR or AND gate configured to process a second pair of bits from the state values, and third combinatorial logic comprising an OR or AND gate configured to process a third pair of bits from the state values. The circuit has fourth combinatorial logic configured to process the outputs of the first three sets of combinatorial logic so as to provide a result bit of the random number. The fourth combinatorial logic comprises an AND or OR gate and a XOR or XNOR gate.
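One possible reading of this gate structure is sketched below for a single result bit. The abstract does not say which gate variants are used, how the fourth stage is wired, or which state bits form each pair, so everything specific here is an assumption: the function name `result_bit`, the choice of XOR, OR, OR for the first three stages, and the wiring "AND the two OR outputs, then XOR with the first stage".

```python
def result_bit(a1: int, b1: int, a2: int, b2: int, a3: int, b3: int) -> int:
    """Combine three bit pairs taken from two state values into one result bit.

    Assumed wiring (not confirmed by the abstract):
      stage 1: XOR of the first pair
      stage 2: OR of the second pair
      stage 3: OR of the third pair
      stage 4: stage1 XOR (stage2 AND stage3)
    """
    s1 = (a1 ^ b1) & 1
    s2 = (a2 | b2) & 1
    s3 = (a3 | b3) & 1
    return s1 ^ (s2 & s3)
```

Swapping XOR for XNOR or OR for AND gives the other variants the abstract allows; the structure of the sketch stays the same.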
-
Publication No.: US11269806B2
Publication date: 2022-03-08
Application No.: US16419535
Application date: 2019-05-22
Applicant: Graphcore Limited
Inventor: Stephen Felix, Simon Christian Knowles
Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column, wherein to implement exchange of data between the processing units at least one processing unit is configured to transmit at a transmit time a data packet intended for a recipient processing unit onto its output set of connection wires, the data packet having no destination identifier of the recipient processing unit but destined for receipt at the recipient processing unit with a predetermined delay relative to the transmit time, wherein the predetermined delay is dependent on an exchange pathway between the transmitting and recipient processing units, wherein the exchange pathway between any pair of transmitting and recipient processing units at respective positions in one column has the same delay as the exchange pathway between each pair of transmitting and recipient processing units at corresponding respective positions in the other columns.
-
Publication No.: US11262787B2
Publication date: 2022-03-01
Application No.: US16744249
Application date: 2020-01-16
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal a send instruction to transmit at least one data packet at a predetermined transmit time, relative to the synchronisation signal, destined for a recipient processing unit but having no destination identifier, and a local program allocated to the recipient processing unit is scheduled to execute at a predetermined switch time a switch control instruction to control the switching circuitry to connect its processing unit wire to the switching fabric to receive the data packet at a receive time.
-
Publication No.: US10970131B2
Publication date: 2021-04-06
Application No.: US16235265
Application date: 2018-12-28
Applicant: Graphcore Limited
Inventor: Ola Tørudbakken, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Matthew David Fyles, Brian Manula, Harald Høeg
IPC: G06F9/52, G06F16/901, G06F9/30, G06F9/38, G06F9/54, G06F15/167, G06F15/173, H04L12/801, H04L29/06, H04L29/08, G06F9/48
Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host, the gateway enabling the transfer of batches of data to and from the subsystem at pre-compiled data exchange synchronisation points attained by the subsystem. The gateway is configured to: receive from a storage system data determined by the host to be processed by the subsystem; store a number of credits indicating the availability of data for transfer to the subsystem at each pre-compiled data exchange synchronisation point; receive a synchronisation request from the subsystem when it attains a data exchange synchronisation point; and in response to determining that the number of credits comprises a non-zero number of credits: transmit a synchronisation acknowledgment to the subsystem; and cause the received data to be transferred to the subsystem.
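The credit check described in this abstract can be sketched as a small Python class. This is a minimal sketch under stated assumptions, not the gateway's actual logic: the class and method names are hypothetical, one credit is assumed to be added per batch received from storage, and one credit is assumed to be consumed per acknowledged synchronisation request (the abstract states only that an acknowledgment is sent when the credit count is non-zero).

```python
from collections import deque
from typing import Any, Deque, Optional


class GatewaySketch:
    """Toy model of credit-gated data transfer at synchronisation points."""

    def __init__(self) -> None:
        self.credits = 0                      # batches available for transfer
        self.pending: Deque[Any] = deque()    # batches received from storage

    def receive_from_storage(self, batch: Any) -> None:
        # Assumption: each stored batch makes one credit available.
        self.pending.append(batch)
        self.credits += 1

    def on_sync_request(self) -> Optional[Any]:
        """Handle a sync request from the subsystem.

        Returns the batch to transfer (standing in for "acknowledge and
        transfer"), or None when there are no credits and no ack is sent.
        """
        if self.credits == 0:
            return None
        self.credits -= 1                     # assumption: one credit per sync
        return self.pending.popleft()
```

In this sketch a synchronisation request that arrives before any data is available simply gets no acknowledgment, so the subsystem would wait at the synchronisation point.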
-
Publication No.: US10936008B2
Publication date: 2021-03-02
Application No.: US15886009
Application date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
Abstract: The invention relates to a computer comprising: a plurality of processing units each having instruction storage holding a local program, an execution unit executing the local program, data storage for holding data; an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by each processing unit; a synchronisation module operable to generate a synchronisation signal to control the computer to switch between a compute phase and an exchange phase, wherein the processing units are configured to execute their local programs according to a common clock, the local programs being such that in the exchange phase at least one processing unit executes a send instruction from its local program to transmit at a transmit time a data packet onto its output set of connection wires, the data packet being destined for at least one recipient processing unit but having no destination identifier, and at a predetermined switch time the recipient processing unit executes a switch control instruction from its local program to control its switching circuitry to connect its input set of wires to the switching fabric to receive the data packet at a receive time, the transmit time, switch time and receive time being governed by the common clock with respect to the synchronisation signal.
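The timing relationship in this abstract (packets carry no destination identifier, so the recipient must be listening to the right sender at the right clock cycle) can be illustrated with a toy model. This is a sketch under stated assumptions rather than a description of the hardware: the function name `received_by`, the per-pair `delay` table, and the rule that a switch setting persists until the next switch instruction are all hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

Send = Tuple[int, Hashable, str]      # (transmit_time, sender, payload)
Switch = Tuple[int, Hashable]         # (switch_time, sender to listen to)


def received_by(
    recipient: Hashable,
    sends: List[Send],
    switches: List[Switch],
    delay: Dict[Tuple[Hashable, Hashable], int],
) -> List[Tuple[int, str]]:
    """Return (receive_time, payload) for packets the recipient captures.

    A packet is captured only if, at its receive time, the recipient's
    switching circuitry is connected to the sender's output wires.
    """
    ordered = sorted(switches, key=lambda s: s[0])
    captured = []
    for t_send, sender, payload in sends:
        t_recv = t_send + delay[(sender, recipient)]   # fixed, known in advance
        selected = None
        for t_switch, source in ordered:
            if t_switch <= t_recv:
                selected = source                      # latest switch wins
        if selected == sender:
            captured.append((t_recv, payload))
    return sorted(captured, key=lambda c: c[0])
```

Because every time in the model is a cycle count relative to a common reference, the schedule of sends and switches can be fixed ahead of execution, which is the property the abstract relies on.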
-
Publication No.: US10579585B2
Publication date: 2020-03-03
Application No.: US15886138
Application date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson, Stephen Felix, Richard Luke Southwell Osborne, Simon Christian Knowles, Alan Graham Alexander, Ian James Quinn
IPC: G06F15/80, G06F9/52, G06F15/173
Abstract: A method of operating a system comprising multiple processor tiles divided into a plurality of domains wherein within each domain the tiles are connected to one another via a respective instance of a time-deterministic interconnect and between domains the tiles are connected to one another via a non-time-deterministic interconnect. The method comprises: performing a compute stage, then performing a respective internal barrier synchronization within each domain, then performing an internal exchange phase within each domain, then performing an external barrier synchronization to synchronize between different domains, then performing an external exchange phase between the domains.
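The ordering of stages in this method can be written out as a short sequential sketch. It only records the order of events for one superstep and does not model real concurrency: the function name `superstep_events`, the event labels, and the domain and tile names are hypothetical.

```python
from typing import Dict, List, Tuple


def superstep_events(domains: Dict[str, List[str]]) -> List[Tuple[str, ...]]:
    """List the events of one superstep in the order given by the abstract."""
    events: List[Tuple[str, ...]] = []
    for domain, tiles in domains.items():
        # Compute stage on every tile of the domain.
        for tile in tiles:
            events.append(("compute", domain, tile))
        # Internal barrier, then exchange over the time-deterministic
        # interconnect inside the domain.
        events.append(("internal_barrier", domain))
        events.append(("internal_exchange", domain))
    # Only after every domain has finished its internal exchange do the
    # domains synchronise with each other and exchange externally.
    events.append(("external_barrier",))
    events.append(("external_exchange",))
    return events
```

Calling `superstep_events({"d0": ["t0", "t1"], "d1": ["t2"]})` yields nine events, ending with the external barrier followed by the external exchange.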
-
Publication No.: US10558595B2
Publication date: 2020-02-11
Application No.: US16165607
Application date: 2018-10-19
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Graham Bernard Cunningham, Alan Graham Alexander
IPC: G06F13/20, G06F13/42, G06F15/16, G06F15/163
Abstract: A processor comprising multiple tiles on the same chip, and an external interconnect for communicating data off-chip in the form of packets. The external interconnect comprises an external exchange block configured to provide flow control and queuing of the packets. One of the tiles is nominated by the compiler to send an external exchange request message to the exchange block on behalf of others with data to send externally. The exchange block sends an exchange-on message to a first of these tiles, to cause the first tile to start sending packets via the external interconnect. Then, once this tile has sent its last data packet, the exchange block sends an exchange-off control packet to this tile to cause it to stop sending packets, and sends another exchange-on message to the next tile with data to send, and so forth.
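The turn-taking described in this abstract can be sketched as a function that emits the sequence of control and data events. This is an illustrative sketch only: the function name `arbitrate`, the event labels `XON`, `DATA` and `XOFF`, and the use of a dictionary to stand in for the exchange request are hypothetical.

```python
from typing import Dict, List, Tuple


def arbitrate(request: Dict[str, List[str]]) -> List[Tuple[str, ...]]:
    """Serialise external sends, one tile at a time.

    `request` stands in for the exchange request sent by the nominated
    tile: it maps each tile with data to send to its list of packets,
    in the order the tiles should be served.
    """
    events: List[Tuple[str, ...]] = []
    for tile, packets in request.items():
        events.append(("XON", tile))            # exchange-on: tile may send
        for packet in packets:
            events.append(("DATA", tile, packet))
        events.append(("XOFF", tile))           # exchange-off after last packet
    return events
```

Only one tile holds the "on" state at any point in the resulting sequence, which is what lets the exchange block apply flow control and queuing to a single stream of packets.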
-
Publication No.: US20190121778A1
Publication date: 2019-04-25
Application No.: US15886315
Application date: 2018-02-01
Applicant: Graphcore Limited
Inventor: Stephen Felix, Jonathan Mangnall
IPC: G06F15/173, G06F15/80, G06F1/32
CPC classification number: G06F15/17312, G06F1/32, G06F15/80
Abstract: An indication of a direction of transmission over the switching fabric is inserted into a data packet that is transmitted from a tile. The indication of direction may indicate directions from the transmitting tile in which intended recipient tiles are present. The switching fabric prevents (e.g. by blocking the data packet at one of a series of latches) the transmission in a direction not indicated in the data packet. Hence, power saving may be achieved, by preventing the unnecessary transmission of data packets over parts of the switching fabric.
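The direction indication can be illustrated with a one-dimensional toy model of the switching fabric. This is a sketch under stated assumptions, not the patented design: tiles are assumed to attach to the fabric at integer positions along a line, the two directions are called "west" (lower positions) and "east" (higher positions), and both function names are hypothetical.

```python
from typing import Iterable, List, Tuple


def direction_bits(sender_pos: int, recipient_positions: Iterable[int]) -> Tuple[bool, bool]:
    """Return (west, east): the directions in which recipients are present."""
    positions = list(recipient_positions)
    west = any(r < sender_pos for r in positions)
    east = any(r > sender_pos for r in positions)
    return west, east


def segments_driven(sender_pos: int, n_positions: int, west: bool, east: bool) -> List[int]:
    """Fabric positions the packet reaches when blocked in unflagged directions."""
    lowest = 0 if west else sender_pos
    highest = n_positions - 1 if east else sender_pos
    return list(range(lowest, highest + 1))
```

With a sender at position 3 of 8 and recipients at positions 5 and 6, only the east flag is set and positions 0 to 2 are never driven, which is where the power saving described in the abstract would come from.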