Patent search ap:("INTEL CORPORATION") AND inv:"Bret Toll" Page 6

51.

发明授权
Systems, methods, and apparatuses for dot product operations 有权

公开(公告)号：US11669326B2

公开(公告)日：2023-06-06

申请号：US15859271

申请日：2017-12-29

Applicant: Intel Corporation

Inventor： Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman

IPC: G06F9/30 , G06F17/16

CPC classification number: G06F9/30014 , G06F9/30109 , G06F9/30145 , G06F17/16

Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a quadword data elements of a matrix pair. Additionally, in some instances, non-accumulating quadword data elements of the matrix pair are set to zero.

52.

发明授权
Systems and methods to load a tile register pair 有权

公开(公告)号：US11609762B2

公开(公告)日：2023-03-21

申请号：US17398927

申请日：2021-08-10

Applicant: Intel Corporation

Inventor： Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman

IPC: G06F9/30 , G06F9/38

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.

53.

发明授权
Systems and methods for performing horizontal tile operations 有权

公开(公告)号：US11579883B2

公开(公告)日：2023-02-14

申请号：US16131382

申请日：2018-09-14

Applicant: Intel Corporation

Inventor： Christopher J. Hughes , Bret Toll , Dan Baum , Elmoustapha Ould-Ahmed-Vall , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of a M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.

54.

发明授权
Systems, apparatuses, and methods for controllable sine and/or cosine operations 有权

公开(公告)号：US11579871B2

公开(公告)日：2023-02-14

申请号：US17346891

申请日：2021-06-14

Applicant: Intel Corporation

Inventor： Venkateswara R. Madduri , Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Mark J. Charney , Carl Murray , Milind Girkar , Bret Toll

IPC: G06F9/30 , G06F7/548

Abstract: Embodiments of systems, apparatuses, and methods for performing vector-packed controllable sine and/or cosine operations in a processor are described. For example, execution circuitry executes a decoded instruction to compute at least a real output value and an imaginary output value based on at least a cosine calculation and a sine calculation, the cosine and sine calculations each based on an index value from a packed data source operand, add the index value with an index increment value from the packed data source operand to create an updated index value, and store the real output value, the imaginary output value, and the updated index value to a packed data destination operand.

55.

发明申请
SYSTEMS AND METHODS TO LOAD A TILE REGISTER PAIR 有权

公开(公告)号：US20220091848A1

公开(公告)日：2022-03-24

申请号：US17398927

申请日：2021-08-10

Applicant: Intel Corporation

Inventor： Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.

56.

发明授权
Systems and methods for performing instructions specifying ternary tile logic operations 有权

公开(公告)号：US10970076B2

公开(公告)日：2021-04-06

申请号：US16131376

申请日：2018-09-14

Applicant: Intel Corporation

Inventor： Elmoustapha Ould-Ahmed-Vall , Christopher J. Hughes , Bret Toll , Dan Baum , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke

IPC: G06F9/38 , G06F9/30 , G06F17/16

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying ternary tile operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction specifying a ternary tile operation, and locations of destination and first, second, and third source matrices, each of the matrices having M rows by N columns; and execution circuitry to respond to the decoded instruction by, for each equal-sized group of K elements of the specified first, second, and third source matrices, generate K results by performing the ternary tile operation in parallel on K corresponding elements of the specified first, second, and third source matrices, and store each of the K results to a corresponding element of the specified destination matrix, wherein corresponding elements of the specified source and destination matrices occupy a same relative position within their associated matrix.

57.

发明授权
Systems and methods for implementing chained tile operations 有权

公开(公告)号：US10664287B2

公开(公告)日：2020-05-26

申请号：US15942201

申请日：2018-03-30

Applicant: Intel Corporation

Inventor： Christopher J. Hughes , Alexander F. Heinecke , Robert Valentine , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall

IPC: G06F9/30 , G06F15/80 , G06F17/16 , G06F9/302 , G06F9/312 , G06F9/38 , G06F15/78

Abstract: Disclosed embodiments relate to systems and methods for implementing chained tile operations. In one example, a processor includes fetch circuitry to fetch one or more instructions until a plurality of instructions has been fetched, each instruction to specify source and destination tile operands, decode circuitry to decode the fetched instructions, and execution circuitry, responsive to the decoded instructions, to: identify first and second decoded instructions belonging to a chain of instructions, dynamically select and configure a SIMD path comprising first and second processing engines (PE) to execute the first and second decoded instructions, and set aside the specified destination of the first decoded instruction, and instead route a result of the first decoded instruction from the first PE to be used by the second PE to perform the second decoded instruction.

58.

发明申请
SYSTEMS AND METHODS TO ZERO A TILE REGISTER PAIR 审中-公开

公开(公告)号：US20190042256A1

公开(公告)日：2019-02-07

申请号：US15858947

申请日：2017-12-29

Applicant: Intel Corporation

Inventor： Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman , Eyal Hadas

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.

59.

发明授权
Multiple register memory access instructions, processors, methods, and systems 有权

公开(公告)号：US10170165B2

公开(公告)日：2019-01-01

申请号：US15855626

申请日：2017-12-27

Applicant: Intel Corporation

Inventor： Glenn Hinton , Bret Toll , Ronak Singhal

IPC: G06F9/30 , G11C7/10

Abstract: A processor includes N-bit registers and a decode unit to receive a multiple register memory access instruction. The multiple register memory access instruction is to indicate a memory location and a register. The processor includes a memory access unit coupled with the decode unit and with the N-bit registers. The memory access unit is to perform a multiple register memory access operation in response to the multiple register memory access instruction. The operation is to involve N-bit data, in each of the N-bit registers comprising the indicated register. The operation is also to involve different corresponding N-bit portions of an M×N-bit line of memory corresponding to the indicated memory location. A total number of bits of the N-bit data in the N-bit registers to be involved in the multiple register memory access operation is to amount to at least half of the M×N-bits of the line of memory.

60.

发明授权
Multiple register memory access instructions, processors, methods, and systems 有权

公开(公告)号：US10102888B2

公开(公告)日：2018-10-16

申请号：US15728293

申请日：2017-10-09

Applicant: Intel Corporation

Inventor： Glenn Hinton , Bret Toll , Ronak Singhal

IPC: G06F7/00 , G11C7/10 , G06F9/30

Abstract: A processor includes N-bit registers and a decode unit to receive a multiple register memory access instruction. The multiple register memory access instruction is to indicate a memory location and a register. The processor includes a memory access unit coupled with the decode unit and with the N-bit registers. The memory access unit is to perform a multiple register memory access operation in response to the multiple register memory access instruction. The operation is to involve N-bit data, in each of the N-bit registers comprising the indicated register. The operation is also to involve different corresponding N-bit portions of an M×N-bit line of memory corresponding to the indicated memory location. A total number of bits of the N-bit data in the N-bit registers to be involved in the multiple register memory access operation is to amount to at least half of the M×N-bits of the line of memory.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification