-
Publication number: JP2006114036A
Publication date: 2006-04-27
Application number: JP2005294193
Application date: 2005-10-06
Applicant: International Business Machines Corporation (Internatl Business Mach Corp; インターナショナル・ビジネス・マシーンズ・コーポレーション)
Inventor: CURRAN BRIAN WILLIAM , KONIGSBURG BRIAN R , HUNG QUI LE , ARNOLD LUICK DAVID , NGUYEN DUNG QUOC
CPC classification number: G06F9/3853 , G06F9/30145 , G06F9/382 , G06F9/3851 , G06F9/3885
Abstract: PROBLEM TO BE SOLVED: To execute a plurality of instructions simultaneously and thereby use hardware resources efficiently, increasing overall processor throughput.
SOLUTION: A resource vector representing the resources an instruction requires is encoded into a resource field, and the resource field is decoded at a later stage to recover the resource vector. The resource field is stored in an instruction cache in association with the respective program instruction. The processor operates in a simultaneous multithreading mode. When resource availability equals or exceeds the resource requirements of an instruction group, the instructions of that group are dispatched simultaneously to the hardware resources. A start bit is inserted into one of the program instructions to define the instruction group. The hardware resources are, in particular, execution units such as a fixed-point unit 56, a load/store unit 58, a floating-point unit 60, or a branch processing unit 61.
COPYRIGHT: (C)2006, JPO&NCIPI
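The dispatch condition in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the unit ordering, the four-entry codebook, and the function names (`encode`, `decode`, `group_requirement`, `can_dispatch`) are all hypothetical and chosen only to show a compact resource field being decoded back into a resource vector and compared against available execution units.

```python
# Hypothetical unit ordering: fixed-point, load/store, floating-point, branch.
UNITS = ("FXU", "LSU", "FPU", "BRU")

# Hypothetical codebook: a small resource field selects one resource vector.
FIELD_TO_VECTOR = {
    0: (1, 0, 0, 0),  # needs one fixed-point unit
    1: (0, 1, 0, 0),  # needs one load/store unit
    2: (0, 0, 1, 0),  # needs one floating-point unit
    3: (0, 0, 0, 1),  # needs one branch unit
}
VECTOR_TO_FIELD = {vec: field for field, vec in FIELD_TO_VECTOR.items()}


def encode(vector):
    """Encode a resource vector into its compact resource field."""
    return VECTOR_TO_FIELD[tuple(vector)]


def decode(field):
    """Decode a resource field back into the resource vector."""
    return FIELD_TO_VECTOR[field]


def group_requirement(fields):
    """Sum the decoded resource vectors of every instruction in a group."""
    total = [0] * len(UNITS)
    for field in fields:
        for i, count in enumerate(decode(field)):
            total[i] += count
    return tuple(total)


def can_dispatch(available, fields):
    """True when availability equals or exceeds the group's requirement."""
    need = group_requirement(fields)
    return all(have >= want for have, want in zip(available, need))
```

For example, a group of two fixed-point instructions and one load needs `(2, 1, 0, 0)`, so it can be dispatched when two fixed-point units are free but not when only one is.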
-
Publication number: GB2549907A
Publication date: 2017-11-01
Application number: GB201712270
Application date: 2015-12-29
Applicant: IBM
Inventor: SUNDEEP CHADHA , DAVID ALLEN HRUSECKY , DUNG QUOC NGUYEN , HUNG QUI LE , BRIAN WILLIAM THOMPTO , ROBERT ALLEN CORDES , SALMA AYUB
Abstract: An execution unit circuit for use in a processor core provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. A queue control logic controls the recirculation queue and issue queue so that after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
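The queue hand-off described above can be sketched as a small software model. This is an illustration under stated assumptions, not the circuit itself: the class `LoadStoreQueues`, its method names, the base-plus-offset address computation, and the `cache_accepts` callback are hypothetical.

```python
from collections import deque


class LoadStoreQueues:
    """Toy model of an issue queue feeding a recirculation queue."""

    def __init__(self):
        # Issue queue entries hold the address operands (hypothetical layout).
        self.issue_queue = {}  # op_id -> (kind, base, offset, store_value)
        # Recirculation entries hold only the effective address and store data.
        self.recirc_queue = deque()  # (op_id, kind, effective_address, store_value)

    def enqueue(self, op_id, kind, base, offset, store_value=None):
        self.issue_queue[op_id] = (kind, base, offset, store_value)

    def issue(self, op_id):
        """Compute the effective address, then free the issue queue entry."""
        kind, base, offset, store_value = self.issue_queue.pop(op_id)
        effective_address = base + offset
        self.recirc_queue.append((op_id, kind, effective_address, store_value))
        return effective_address

    def access_cache(self, cache_accepts):
        """Present the oldest operation to the cache; requeue it if rejected."""
        entry = self.recirc_queue.popleft()
        if cache_accepts(entry[2]):
            return entry[0], True
        self.recirc_queue.append(entry)  # reissued later from this queue
        return entry[0], False
```

The point of the model is that after `issue` the operands no longer occupy the issue queue, and a rejected operation is retried from the recirculation queue alone.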
-
Publication number: GB2604085A
Publication date: 2022-08-24
Application number: GB202209153
Application date: 2020-11-19
Applicant: IBM
Inventor: HUNG QUI LE , BRIAN DAVID BARRICK , SUSAN EISEN , DUNG QUOC NGUYEN , ANDREAS WAGNER , BRIAN WILLIAM THOMPTO , KENNETH WARD , STEVEN BATTLE
Abstract: A computer system, processor (110), and method for processing information is disclosed that includes at least one processor (110) having a main register file (380), the main register file (380) having a plurality of entries (381) for storing data; one or more execution units including a dense math execution unit (460); and at least one accumulator register file (470), the at least one accumulator register file (470) associated with the dense math execution unit (460). The processor (110) in an embodiment is configured to process data in the dense math execution unit (460) where the results of the dense math execution unit (460) are written to a first group of one or more accumulator register file entries (471), and after a checkpoint boundary is crossed based upon, for example, the number "N" of instructions dispatched after the start of the checkpoint, the results of the dense math execution unit (460) are written to a second group of one or more accumulator register file entries (471).
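A minimal sketch of the checkpoint behaviour in this abstract follows. The class `AccumulatorFile`, its fields, and the choice to alternate between exactly two groups are assumptions made for illustration; the abstract only states that results go to a first group and, after a boundary based on the number "N" of dispatched instructions, to a second group.

```python
class AccumulatorFile:
    """Toy accumulator register file with two groups of entries."""

    def __init__(self, entries_per_group, n):
        self.groups = [[None] * entries_per_group for _ in range(2)]
        self.n = n                 # the "N" of the abstract
        self.since_checkpoint = 0  # results written since the checkpoint began
        self.active = 0            # index of the group currently written

    def write_result(self, entry, value):
        """Write a dense math result, switching groups at a checkpoint boundary."""
        if self.since_checkpoint == self.n:
            # Boundary crossed: later results target the other group
            # (assumption: the two groups simply alternate).
            self.active = 1 - self.active
            self.since_checkpoint = 0
        self.groups[self.active][entry] = value
        self.since_checkpoint += 1
        return self.active
```

With `n = 2`, the first two results land in group 0 and the third lands in group 1, leaving group 0 untouched.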
-
Publication number: GB2549906B
Publication date: 2021-07-28
Application number: GB201712265
Application date: 2015-12-16
Applicant: IBM
Inventor: JEFFREY BROWNSCHEIDLE , DUNG QUOC NGUYEN , MAUREEN ANNE DELANEY , SUNDEEP CHADHA , HUNG QUI LE , BRIAN WILLIAM THOMPTO
IPC: G06F9/38
Abstract: An execution slice circuit for a processor core has multiple parallel instruction execution slices and provides flexible and efficient use of internal resources. The execution slice circuit includes a master execution slice for receiving instructions of a first instruction stream and a slave execution slice for receiving instructions of a second instruction stream and instructions of the first instruction stream that require an execution width greater than a width of the slices. The execution slice circuit also includes a control logic that detects when a first instruction of the first instruction stream has the greater width and controls the slave execution slice to reserve a first issue cycle for issuing the first instruction in parallel across the master execution slice and the slave execution slice.
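The issue-cycle reservation described above can be sketched as a simple per-cycle scheduler. This is a hypothetical model, not the claimed control logic: the 64-bit `SLICE_WIDTH`, the `(name, width)` instruction tuples, and the `schedule` function are assumptions.

```python
from collections import deque

SLICE_WIDTH = 64  # hypothetical width of one execution slice, in bits


def schedule(stream1, stream2, slice_width=SLICE_WIDTH):
    """Return one (master_op, slave_op) pair per issue cycle.

    Each stream is a list of (name, width) tuples. A stream-1 instruction
    wider than one slice takes both slices in the same cycle, so the slave
    slice issues nothing from stream 2 in that cycle.
    """
    s1, s2 = deque(stream1), deque(stream2)
    cycles = []
    while s1 or s2:
        if s1 and s1[0][1] > slice_width:
            name, _ = s1.popleft()
            cycles.append((name, name))  # slave issue cycle reserved
        else:
            master_op = s1.popleft()[0] if s1 else None
            slave_op = s2.popleft()[0] if s2 else None
            cycles.append((master_op, slave_op))
    return cycles
```

In this model a 128-bit instruction in the first stream delays the second stream by one cycle, which is the cost of issuing it across both slices in parallel.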
-
Publication number: GB2494331B
Publication date: 2018-03-07
Application number: GB201221747
Application date: 2011-05-04
Applicant: IBM
Inventor: RONALD HALL , BALARAM SINHAROY , HUNG QUI LE , RAUL ESTEBAN SILVERA
-
Publication number: GB2604085B
Publication date: 2022-12-07
Application number: GB202209153
Application date: 2020-11-19
Applicant: IBM
Inventor: HUNG QUI LE , BRIAN DAVID BARRICK , SUSAN EISEN , DUNG QUOC NGUYEN , ANDREAS WAGNER , BRIAN WILLIAM THOMPTO , KENNETH WARD , STEVEN BATTLE
Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one processor having a main register file, the main register file having a plurality of entries for storing data; one or more execution units including a dense math execution unit; and at least one accumulator register file, the at least one accumulator register file associated with the dense math execution unit. The processor in an embodiment is configured to process data in the dense math execution unit where the results of the dense math execution unit are written to a first group of one or more accumulator register file entries, and after a checkpoint boundary is crossed based upon, for example, the number “N” of instructions dispatched after the start of the checkpoint, the results of the dense math execution unit are written to a second group of one or more accumulator register file entries.
-
Publication number: GB2549907B
Publication date: 2021-08-11
Application number: GB201712270
Application date: 2015-12-29
Applicant: IBM
Inventor: SUNDEEP CHADHA , DAVID ALLEN HRUSECKY , DUNG QUOC NGUYEN , HUNG QUI LE , BRIAN WILLIAM THOMPTO , ROBERT ALLEN CORDES , SALMA AYUB
Abstract: An execution unit circuit for use in a processor core provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. A queue control logic controls the recirculation queue and issue queue so that after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
-
Publication number: GB2549906A
Publication date: 2017-11-01
Application number: GB201712265
Application date: 2015-12-16
Applicant: IBM
Inventor: JEFFREY BROWNSCHEIDLE , DUNG QUOC NGUYEN , MAUREEN ANNE DELANEY , SUNDEEP CHADHA , HUNG QUI LE , BRIAN WILLIAM THOMPTO
IPC: G06F9/38
Abstract: An execution slice circuit for a processor core has multiple parallel instruction execution slices and provides flexible and efficient use of internal resources. The execution slice circuit includes a master execution slice for receiving instructions of a first instruction stream and a slave execution slice for receiving instructions of a second instruction stream and instructions of the first instruction stream that require an execution width greater than a width of the slices. The execution slice circuit also includes a control logic that detects when a first instruction of the first instruction stream has the greater width and controls the slave execution slice to reserve a first issue cycle for issuing the first instruction in parallel across the master execution slice and the slave execution slice.