-
公开(公告)号:US11847428B2
公开(公告)日:2023-12-19
申请号:US17660688
申请日:2022-04-26
Applicant: Graphcore Limited
Inventor: Jonathan Mangnall , Stephen Felix
CPC classification number: G06F7/483 , G06F1/03 , G06F7/535 , G06F7/5525 , G06F2101/12 , G06F2207/5355 , G06F2207/5356
Abstract: An execution unit for a processor, the execution unit comprising: a look up table having a plurality of entries, each of the plurality of entries comprising an initial estimate for a result of an operation; a preparatory circuit configured to search the look up table using an index value dependent upon the operand to locate an entry comprising a first initial estimate for a result of the operation; a plurality of processing circuits comprising at least one multiplier circuit; and control circuitry configured to provide the first initial estimate to the at least one multiplier circuit of the plurality of processing circuits so as perform processing, by the plurality of processing units, of the first initial estimate to generate the function result, said processing comprising applying one or more Newton Raphson iterations to the first initial estimate.
-
公开(公告)号:US11416440B2
公开(公告)日:2022-08-16
申请号:US16790215
申请日:2020-02-13
Applicant: Graphcore Limited
Inventor: Richard Luke Southwell Osborne , Alan Graham Alexander , Stephen Felix
IPC: G06F9/30 , G06F15/173 , G06F9/52 , G06F15/80
Abstract: A computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising: a switch control instruction which when executed causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined received time, the switch control instruction comprising a delay control field which holds a value defining a delay between issuance of the instruction in the sequence of instructions and its execution by the execution unit.
-
公开(公告)号:US11169778B2
公开(公告)日:2021-11-09
申请号:US16523156
申请日:2019-07-26
Applicant: Graphcore Limited
Inventor: Stephen Felix , Mrudula Gore , Alan Graham Alexander
Abstract: A hardware module comprising at least one of: one or more field programmable gate arrays and one or more application specific integrated circuits configured to: receive a number in floating-point representation at a first precision level, the number comprising an exponent and a first mantissa; apply a first random number to the first mantissa to generate a first carry; truncate the first mantissa to a level specified by a second precision level; add the first carry to the least significant bit of the mantissa truncated to the level specified by the second precision level to form a mantissa for the number in floating-point representation at the second precision level.
-
公开(公告)号:US11106510B2
公开(公告)日:2021-08-31
申请号:US16514311
申请日:2019-07-17
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Stephen Felix , Matthew David Fyles , Richard Luke Southwell Osborne
Abstract: A processing system comprising: a subsystem for acting as a work accelerator to a host processor, the subsystem comprising an arrangement of tiles; and an interconnect for communicating between the tiles and connecting the subsystem to the host. The interconnect comprises synchronization logic to coordinate barrier synchronizations between a group of the tiles. The synchronization logic comprises a host sync proxy module, comprising a counter written with a number of credits by the host processor, and being configured to automatically decrement the number of credits each time one of the barrier synchronizations requiring host involvement is performed. When the number of credits in the counter is exhausted, the barrier is not released until a further write from the host to the host sync proxy module, but when the number is credits in the counter is not exhausted the barrier is released without a separate write from the host.
-
公开(公告)号:US11106432B2
公开(公告)日:2021-08-31
申请号:US16427794
申请日:2019-05-31
Applicant: Graphcore Limited
Inventor: Jonathan Mangnall , Stephen Felix
Abstract: An execution unit is described which is particularly configured to generate an exponential of an operand floating point format. The operand is multiplied by a fixed multiplicand, logged to the base 2 (e) to generate a multiplication result. An integer part and a fractional part are extracted from the multiplication result. An exponent register stores the integer part to form the exponent of the exponential result. A lookup table has a plurality of entries each providing a value of 2f for a fractional part f used to access a lookup table. The fractional part is derived from a mantissa of the operand. That is, first and second bit sequences are extracted from the mantissa. One of the bit sequences is used to generate an estimated fractional component, and the other is used to access a value from the lookup table.
-
公开(公告)号:US11048563B2
公开(公告)日:2021-06-29
申请号:US15886065
申请日:2018-02-01
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Stephen Felix , Matthew David Fyles , Richard Luke Southwell Osborne
Abstract: A processing system comprising: a subsystem for acting as a work accelerator to a host processor, the subsystem comprising an arrangement of tiles; and an interconnect for communicating between the tiles and connecting the subsystem to the host. The interconnect comprises synchronization logic to coordinate barrier synchronizations between a group of the tiles. The synchronization logic comprises a host sync proxy module, comprising a counter written with a number of credits by the host processor, and being configured to automatically decrement the number of credits each time one of the barrier synchronizations requiring host involvement is performed. When the number of credits in the counter is exhausted, the barrier is not released until a further write from the host to the host sync proxy module, but when the number is credits in the counter is not exhausted the barrier is released without a separate write from the host.
-
公开(公告)号:US10817459B2
公开(公告)日:2020-10-27
申请号:US16451128
申请日:2019-06-25
Applicant: Graphcore Limited
Inventor: Stephen Felix , Jonathan Mangnall
IPC: G06F15/173 , G06F1/32 , G06F15/80
Abstract: An indication of a direction of transmission over the switching fabric is inserted into a data packet that is transmitted from a tile. The indication of direction may indicate directions from the transmitting tile in which intended recipient tiles are present. The switching fabric prevents (e.g. by blocking the data packet at one of a series of latches) the transmission in a direction not indicated in the data packet. Hence, power saving may be achieved, by preventing the unnecessary transmission of data packets over parts of the switching fabric.
-
公开(公告)号:US20200150713A1
公开(公告)日:2020-05-14
申请号:US16744249
申请日:2020-01-16
Applicant: Graphcore Limited
Inventor: Simon Christian Knowles , Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Alan Graham Alexander , Stephen Felix , Jonathan Mangnall , David Lacey
Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal a send instruction to transmit at least one data packet at a predetermined transmit time, relative to the synchronisation signal, destined for a recipient processing unit but having no destination identifier, and a local program allocated to the recipient processing unit is scheduled to execute at a predetermined switch time a switch control instruction to control the switching circuitry to connect its processing unit wire to the switching fabric to receive the data packet at a receive time.
-
公开(公告)号:US20190354494A1
公开(公告)日:2019-11-21
申请号:US16525833
申请日:2019-07-30
Applicant: Graphcore Limited
Inventor: Daniel John Pelham Wilkinson , Richard Luke Southwell Osborne , Stephen Felix , Graham Bernard Cunningham , Alan Graham Alexander
Abstract: A system comprising an arrangement of multiple processor modules, and an external interconnect for communicating data in the form of packets to outside the arrangement. The interconnect comprises an exchange block configured to provide flow control. One of the processor modules is arranged to send an exchange request message to the exchange block on behalf of others with data to send outside the arrangement. The exchange block sends an exchange-on message to a first of these processor modules, to cause the first module to start sending packets via the interconnect. Then, once this processor module has sent its last data packet, the exchange block sends an exchange-off message to this processor module to cause it to stop sending packets, and sends another exchange-on message to the next processor module with data to send, and so forth.
-
公开(公告)号:US10360175B2
公开(公告)日:2019-07-23
申请号:US15886315
申请日:2018-02-01
Applicant: Graphcore Limited
Inventor: Stephen Felix , Jonathan Mangnall
IPC: G06F15/173 , G06F1/32 , G06F15/80
Abstract: An indication of a direction of transmission over the switching fabric is inserted into a data packet that is transmitted from a tile. The indication of direction may indicate directions from the transmitting tile in which intended recipient tiles are present. The switching fabric prevents (e.g. by blocking the data packet at one of a series of latches) the transmission in a direction not indicated in the data packet. Hence, power saving may be achieved, by preventing the unnecessary transmission of data packets over parts of the switching fabric.
-
-
-
-
-
-
-
-
-