-
Publication No.: GB2549906B
Publication Date: 2021-07-28
Application No.: GB201712265
Application Date: 2015-12-16
Applicant: IBM
Inventor: JEFFREY BROWNSCHEIDLE , DUNG QUOC NGUYEN , MAUREEN ANNE DELANEY , SUNDEEP CHADHA , HUNG QUI LE , BRIAN WILLIAM THOMPTO
IPC: G06F9/38
Abstract: An execution slice circuit for a processor core has multiple parallel instruction execution slices and provides flexible and efficient use of internal resources. The execution slice circuit includes a master execution slice for receiving instructions of a first instruction stream and a slave execution slice for receiving instructions of a second instruction stream and instructions of the first instruction stream that require an execution width greater than a width of the slices. The execution slice circuit also includes a control logic that detects when a first instruction of the first instruction stream has the greater width and controls the slave execution slice to reserve a first issue cycle for issuing the first instruction in parallel across the master execution slice and the slave execution slice.
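The issue behaviour described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the 64-bit slice width, the `Instr` type, and the `issue_schedule` function are hypothetical names introduced here, and the model issues at most one instruction per slice per cycle.

```python
from collections import deque
from dataclasses import dataclass

SLICE_WIDTH = 64  # assumed slice width in bits; the abstract gives no figure


@dataclass
class Instr:
    name: str
    width: int  # execution width the instruction requires, in bits


def issue_schedule(stream1, stream2):
    """Return a per-cycle list of (master_op, slave_op) instruction names.

    stream1 feeds the master slice, stream2 feeds the slave slice.
    """
    q1, q2 = deque(stream1), deque(stream2)
    schedule = []
    while q1 or q2:
        if q1 and q1[0].width > SLICE_WIDTH:
            # Control logic: the next first-stream instruction is wider than
            # one slice, so the slave's issue cycle is reserved and the
            # instruction issues across both slices in the same cycle.
            wide = q1.popleft()
            schedule.append((wide.name, wide.name))
        else:
            # Normal case: each slice issues from its own stream.
            master_op = q1.popleft().name if q1 else None
            slave_op = q2.popleft().name if q2 else None
            schedule.append((master_op, slave_op))
    return schedule


first = [Instr("add", 64), Instr("vmul", 128), Instr("sub", 64)]
second = [Instr("ld", 64), Instr("st", 64)]
schedule = issue_schedule(first, second)
```

In this model the second-stream instruction `st` is delayed by one cycle because the slave slice is reserved for the wide `vmul`.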
-
Publication No.: GB2549907A
Publication Date: 2017-11-01
Application No.: GB201712270
Application Date: 2015-12-29
Applicant: IBM
Inventor: SUNDEEP CHADHA , DAVID ALLEN HRUSECKY , DUNG QUOC NGUYEN , HUNG QUI LE , BRIAN WILLIAM THOMPTO , ROBERT ALLEN CORDES , SALMA AYUB
Abstract: An execution unit circuit for use in a processor core provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. A queue control logic controls the recirculation queue and issue queue so that after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
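The queue handling described in this abstract can be sketched as a small software model. All names here (`IssueEntry`, `RecircEntry`, `LoadStoreUnit`, the `cache_accepts` callback) are hypothetical. Computing the effective address as base plus offset, and dropping an entry from the recirculation queue once the cache accepts it, are assumptions that the abstract does not state.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class IssueEntry:
    op: str                      # "load" or "store"
    base: int                    # address operand
    offset: int                  # address operand
    store_value: Optional[int] = None


@dataclass
class RecircEntry:
    op: str
    effective_address: int       # replaces the address operands
    store_value: Optional[int] = None


class LoadStoreUnit:
    def __init__(self, cache_accepts):
        self.issue_queue = deque()
        self.recirc_queue = deque()
        self.cache_accepts = cache_accepts  # hypothetical: RecircEntry -> bool
        self.completed = []

    def dispatch(self, entry):
        self.issue_queue.append(entry)

    def issue(self):
        """Compute the effective address, move the operation to the
        recirculation queue, and free its issue queue entry."""
        entry = self.issue_queue.popleft()
        recirc = RecircEntry(entry.op, entry.base + entry.offset, entry.store_value)
        self.recirc_queue.append(recirc)
        self._access(recirc)

    def reissue(self):
        """Retry the oldest rejected operation from the recirculation queue."""
        self._access(self.recirc_queue[0])

    def _access(self, recirc):
        if self.cache_accepts(recirc):
            self.recirc_queue.remove(recirc)
            self.completed.append(recirc)


attempts = []


def reject_first_attempt(recirc):
    attempts.append(recirc.effective_address)
    return len(attempts) > 1


lsu = LoadStoreUnit(reject_first_attempt)
lsu.dispatch(IssueEntry("load", base=0x1000, offset=8))
lsu.issue()      # rejected by the cache; the issue queue entry is already freed
lsu.reissue()    # accepted on the retry, served from the recirculation queue
```

The point of the model is that the retry needs only the stored effective address (and, for stores, the store value), so the wider issue queue entry can be released as soon as the address is known.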
-
Publication No.: GB2606908A
Publication Date: 2022-11-23
Application No.: GB202209610
Application Date: 2020-11-30
Applicant: IBM
Inventor: JENTJE LEENSTRA , ANDREAS WAGNER , JOSE EDUARDO MOREIRA , BRIAN WILLIAM THOMPTO
IPC: G06F9/302
Abstract: A processor unit for multiply and accumulate ("MAC") operations is provided, the processor unit comprising: a plurality of MAC units for performing a set of MAC operations, wherein each MAC unit of the plurality of MAC units includes an execution unit and a one-write one-read ("1W/1R") register file, and wherein the 1W/1R register file has at least one accumulator; and another register file, wherein the execution unit of each MAC unit is configured to perform a subset of the MAC operations by computing a product of a set of values received from the other register file and adding the computed product to a content of the at least one accumulator, and wherein each MAC unit is configured to perform the subset of MAC operations in a single clock cycle.
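The multiply-accumulate arrangement in this abstract can be sketched in software as follows. This is an illustrative model only: `MacUnit` and `mac_step` are hypothetical names, each unit is given a single scalar accumulator, and one call to `mac_step` stands in for one clock cycle.

```python
class MacUnit:
    """One MAC unit with a local accumulator (standing in for the 1W/1R file)."""

    def __init__(self):
        self.acc = 0

    def mac(self, a, b):
        # One read and one write of the accumulator per operation.
        self.acc = self.acc + a * b
        return self.acc


def mac_step(units, xs, ys):
    """Model one cycle: every unit multiplies its operand pair, taken from
    the shared (other) register file, and adds the product to its accumulator."""
    for unit, x, y in zip(units, xs, ys):
        unit.mac(x, y)


units = [MacUnit() for _ in range(4)]
mac_step(units, [1, 2, 3, 4], [5, 6, 7, 8])
mac_step(units, [1, 1, 1, 1], [1, 1, 1, 1])
accumulators = [u.acc for u in units]
```

Keeping the accumulator local to each unit means the shared register file only has to supply the multiplicands, which is the property the abstract emphasises.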
-
Publication No.: GB2604085A
Publication Date: 2022-08-24
Application No.: GB202209153
Application Date: 2020-11-19
Applicant: IBM
Inventor: HUNG QUI LE , BRIAN DAVID BARRICK , SUSAN EISEN , DUNG QUOC NGUYEN , ANDREAS WAGNER , BRIAN WILLIAM THOMPTO , KENNETH WARD , STEVEN BATTLE
Abstract: A computer system, processor (110), and method for processing information is disclosed that includes at least one processor (110) having a main register file (380), the main register file (380) having a plurality of entries (381) for storing data; one or more execution units including a dense math execution unit (460); and at least one accumulator register file (470), the at least one accumulator register file (470) associated with the dense math execution unit (460). The processor (110) in an embodiment is configured to process data in the dense math execution unit (460) where the results of the dense math execution unit (460) are written to a first group of one or more accumulator register file entries (471), and after a checkpoint boundary is crossed based upon, for example, the number "N" of instructions dispatched after the start of the checkpoint, the results of the dense math execution unit (460) are written to a second group of one or more accumulator register file entries (471).
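The checkpoint behaviour in this abstract can be illustrated with a short model. It is a sketch under assumptions: `AccumulatorFile` and its fields are hypothetical names, the boundary is taken to be exactly N results after the start of a checkpoint, and cycling back to an earlier group (clearing it on reuse) is an assumption the abstract does not describe.

```python
class AccumulatorFile:
    """Accumulator register file whose entries are divided into groups."""

    def __init__(self, num_groups, n):
        self.groups = [[] for _ in range(num_groups)]
        self.n = n              # instructions per checkpoint (the abstract's "N")
        self.dispatched = 0     # instructions since the current checkpoint began
        self.current = 0        # index of the group currently being written

    def write_result(self, value):
        if self.dispatched == self.n:
            # Checkpoint boundary crossed: later results go to the next group.
            self.current = (self.current + 1) % len(self.groups)
            self.dispatched = 0
            self.groups[self.current] = []  # assumed: a reused group is cleared
        self.groups[self.current].append(value)
        self.dispatched += 1


acc_file = AccumulatorFile(num_groups=2, n=2)
for result in (10, 20, 30):
    acc_file.write_result(result)
```

Here the first two results land in the first group and the third result, which arrives after the boundary, lands in the second group.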
-
Publication No.: GB2604085B
Publication Date: 2022-12-07
Application No.: GB202209153
Application Date: 2020-11-19
Applicant: IBM
Inventor: HUNG QUI LE , BRIAN DAVID BARRICK , SUSAN EISEN , DUNG QUOC NGUYEN , ANDREAS WAGNER , BRIAN WILLIAM THOMPTO , KENNETH WARD , STEVEN BATTLE
Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one processor having a main register file, the main register file having a plurality of entries for storing data; one or more execution units including a dense math execution unit; and at least one accumulator register file, the at least one accumulator register file associated with the dense math execution unit. The processor in an embodiment is configured to process data in the dense math execution unit where the results of the dense math execution unit are written to a first group of one or more accumulator register file entries, and after a checkpoint boundary is crossed based upon, for example, the number “N” of instructions dispatched after the start of the checkpoint, the results of the dense math execution unit are written to a second group of one or more accumulator register file entries.
-
Publication No.: GB2549907B
Publication Date: 2021-08-11
Application No.: GB201712270
Application Date: 2015-12-29
Applicant: IBM
Inventor: SUNDEEP CHADHA , DAVID ALLEN HRUSECKY , DUNG QUOC NGUYEN , HUNG QUI LE , BRIAN WILLIAM THOMPTO , ROBERT ALLEN CORDES , SALMA AYUB
Abstract: An execution unit circuit for use in a processor core provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. A queue control logic controls the recirculation queue and issue queue so that after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
-
Publication No.: GB2549906A
Publication Date: 2017-11-01
Application No.: GB201712265
Application Date: 2015-12-16
Applicant: IBM
Inventor: JEFFREY BROWNSCHEIDLE , DUNG QUOC NGUYEN , MAUREEN ANNE DELANEY , SUNDEEP CHADHA , HUNG QUI LE , BRIAN WILLIAM THOMPTO
IPC: G06F9/38
Abstract: An execution slice circuit for a processor core has multiple parallel instruction execution slices and provides flexible and efficient use of internal resources. The execution slice circuit includes a master execution slice for receiving instructions of a first instruction stream and a slave execution slice for receiving instructions of a second instruction stream and instructions of the first instruction stream that require an execution width greater than a width of the slices. The execution slice circuit also includes a control logic that detects when a first instruction of the first instruction stream has the greater width and controls the slave execution slice to reserve a first issue cycle for issuing the first instruction in parallel across the master execution slice and the slave execution slice.
-
Publication No.: GB2486155B
Publication Date: 2017-04-19
Application No.: GB201206367
Application Date: 2010-12-13
Applicant: IBM
Inventor: CHRISTIAN JACOBI , BRIAN WILLIAM THOMPTO , GREGORY WILLIAM ALEXANDER , KHARY JASON ALEXANDER , BRIAN WILLIAM CURRAN , JAMES RUSSELL MITCHELL , JONATHAN TING HSIEH , BRIAN ROBERT PRASKY
IPC: G06F9/38