Patent search ap:("INTEL CORPORATION") AND inv:"Christopher Hughes" Page 2

11.

发明公开
SYSTEMS AND METHODS FOR PERFORMING 8-BIT FLOATING-POINT VECTOR DOT PRODUCT INSTRUCTIONS 审中-公开

公开(公告)号：US20240045689A1

公开(公告)日：2024-02-08

申请号：US17958377

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber

IPC: G06F9/30 , G06F7/487 , G06F17/16 , G06F9/38

CPC classification number: G06F9/3016 , G06F7/4876 , G06F17/16 , G06F9/3802 , G06F9/3013 , G06F9/3001

Abstract: Disclosed embodiments relate to systems and methods for performing 8-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply pairs of 8-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

12.

发明公开
INSTRUCTIONS TO CONVERT FROM FP16 TO FP8 审中-公开

公开(公告)号：US20240045684A1

公开(公告)日：2024-02-08

申请号：US17958380

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Mark Charney , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber , Robert Valentine

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30036 , G06F9/30018

Abstract: Techniques for converting FP16 to BF8 using bias are described. An example embodiment utilizes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a first source operand, one or more fields to identify a second source operand, one or more fields to identify a source/destination operand, and one or more fields for an opcode, wherein the opcode is to indicate that execution circuitry is to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand.

13.

发明申请
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR MOVING DATA BETWEEN TILES OF A MATRIX OPERATIONS ACCELERATOR AND VECTOR REGISTERS 有权

公开(公告)号：US20210406018A1

公开(公告)日：2021-12-30

申请号：US16914347

申请日：2020-06-27

Applicant: INTEL CORPORATION

Inventor： Menachem Adelman , Robert Valentine , Barukh Ziv , Yaroslav Pollak , Gideon Stupp , Amit Gradstein , Simon Rubanovich , Zeev Sperber , Mark Charney , Christopher Hughes , Alexander Heinecke

IPC: G06F9/30 , G06F17/16

Abstract: Systems, methods, and apparatuses relating to one or more instructions that utilize direct paths for loading data into a tile from a vector register and/or storing data from a tile into a vector register are described. In one embodiment, a system includes a matrix operations accelerator circuit comprising a two-dimensional grid of processing elements, a plurality of registers that represents a two-dimensional matrix coupled to the two-dimensional grid of processing elements, and a coupling to a cache; and a hardware processor core comprising: a vector register, a decoder to decode a single instruction into a decoded single instruction, the single instruction including a first field that identifies the two-dimensional matrix, a second field that identifies a set of elements of the two-dimensional matrix, and a third field that identifies the vector register, and an execution circuit to execute the decoded single instruction to cause a store of the set of elements from the plurality of registers that represents the two-dimensional matrix into the vector register by a coupling of the hardware processor core to the matrix operations accelerator circuit that is separate from the coupling to the cache.

14.

发明公开
8-BIT FLOATING POINT FUSED MULTIPLY INSTRUCTIONS 审中-公开

公开(公告)号：US20240045688A1

公开(公告)日：2024-02-08

申请号：US17958369

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber

IPC: G06F9/30 , G06F7/487

CPC classification number: G06F9/3016 , G06F7/4876 , G06F9/3001

Abstract: Techniques for performing FP8 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a FP8 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand, wherein the FP8 value has an 8-bit floating point format that comprises one bit for a sign, at least 4 bits for an exponent, and at least two bits for a fraction.

15.

发明公开
INSTRUCTIONS TO CONVERT FROM FP8 审中-公开

公开(公告)号：US20240045686A1

公开(公告)日：2024-02-08

申请号：US17958382

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30025

Abstract: Techniques for converting FP8 data elements to FP16 or FP32 data elements using a single instruction are described. An example apparatus includes decoder circuitry to decode a single instruction, the single instruction to indicate that execution circuitry is to convert packed FP8 data from the identified source to packed half-precision floating-point data or single-precision floating point data and store the packed half-precision floating-point data or single-precision floating point data into corresponding data element positions of the identified destination operand.

16.

发明公开
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR STRUCTURED-SPARSE TILE MATRIX FMA 审中-公开

公开(公告)号：US20240045685A1

公开(公告)日：2024-02-08

申请号：US17958381

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Menachem Adelman , Amit Gradstein , Alexander Heinecke , Christopher Hughes , Naveen Mellempudi , Shahar Mizrahi , Dana Rip , Simon Rubanovich , Uri Sherman , Guy Boudoukh , Evangelos Georganas , Nilesh Jain , Barukh Ziv

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30025 , G06F9/3001

Abstract: Systems, methods, and apparatuses relating sparsity based FMA. In some examples, an instance of a single FMA instruction has one or more fields for an opcode, one or more fields to identify a source/destination matrix operand, one or more fields to identify a first plurality of source matrix operands, one or more fields to identify a second plurality of matrix operands, wherein the opcode is to indicate that execution circuitry is to select a proper subset of FP8 data elements from the first plurality of source matrix operands based on sparsity controls from a first matrix operand of the second plurality of matrix operands and perform a FMA.

17.

发明公开
8-BIT FLOATING POINT COMPARISON INSTRUCTIONS 审中-公开

公开(公告)号：US20240045681A1

公开(公告)日：2024-02-08

申请号：US17958367

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30036 , G06F9/30094

Abstract: Techniques for comparing FP8 data elements are described. An exemplary FP8 comparison instruction includes fields for an opcode, an identification of a location of a first packed data source operand, and an identification of a location of a second packed data source operand, wherein the opcode is to indicate that execution circuitry is to perform, for a particular data element position of the packed data source operands, a comparison of a data element at that position, and update a flags register based on the comparison.

18.

发明申请
INSTRUCTIONS TO CONVERT FROM FP16 TO BF8 有权

公开(公告)号：US20220206805A1

公开(公告)日：2022-06-30

申请号：US17134353

申请日：2020-12-26

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Naveen Mellempudi , Robert Valentine , Mark Charney , Christopher Hughes , Evangelos Georganas , Zeev Sperber , Amit Gradstein , Simon Rubanovich

IPC: G06F9/30 , G06F7/499 , H03M7/24

Abstract: Techniques for converting FP16 data elements to BF8 data elements using a single instruction are described. An exemplary apparatus includes decoder circuitry to decode a single instruction, the single instruction to include a one or more fields to identify a source operand, one or more fields to identify a destination operand, and one or more fields for an opcode, the opcode to indicate that execution circuitry is to convert packed half-precision floating-point data from the identified source to packed bfloat8 data and store the packed bfloat8 data into corresponding data element positions of the identified destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision floating-point data from the identified source to packed bfloat8 data and store the packed bfloat8 data into corresponding data element positions.

19.

发明授权
Method and apparatus for approximation using polynomials 有权

公开(公告)号：US11327754B2

公开(公告)日：2022-05-10

申请号：US16366941

申请日：2019-03-27

Applicant: Intel Corporation

Inventor： Jorge Parra , Dan Baum , Robert S. Chappell , Michael Espig , Varghese George , Alexander Heinecke , Christopher Hughes , Subramaniam Maiyuran , Prasoonkumar Surti , Ronen Zohar , Elmoustapha Ould-Ahmed-Vall

IPC: G06F9/30 , G06F17/11 , G06F7/544 , G06F9/38 , G06F7/552

Abstract: Methods and apparatus for approximation using polynomial functions are disclosed. In one embodiment, a processor comprises decoding and execution circuitry. The decoding circuitry is to decode an instruction, where the instruction comprises a first operand specifying an output location and a second operand specifying a plurality of data element values to be computed. The execution circuitry is to execute the decoded instruction. The execution includes to compute a result for each of the plurality of data element values using a polynomial function to approximate a complex function, where the computation uses coefficients stored in a lookup location for the complex function, and where data element values within different data element value ranges use different sets of coefficients. The execution further includes to store results of the computation in the output location.

20.

发明公开
8-BIT FLOATING POINT SQUARE ROOT AND/OR RECIPROCAL SQUARE ROOT INSTRUCTIONS 审中-公开

公开(公告)号：US20240045683A1

公开(公告)日：2024-02-08

申请号：US17958371

申请日：2022-10-01

Applicant: Intel Corporation

Inventor： Alexander Heinecke , Menachem Adelman , Evangelos Georganas , Amit Gradstein , Christopher Hughes , Naveen Mellempudi , Simon Rubanovich , Uri Sherman , Zeev Sperber

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30036 , G06F9/3001

Abstract: Techniques for performing square root or reciprocal square root calculations on FP8 data elements in response to an instruction are described. An example of an instruction is one that includes fields for an opcode, an identification of a location of a packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a calculation of a square root value of a FP8 data element in that position and store a result of each square root into a corresponding data element position of the packed data destination operand.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification