Patent search ap:("INTEL CORPORATION") AND inv:"Mark Charney" Page 6

51.

Invention application
APPARATUS AND METHOD FOR CONVERTING A FLOATING-POINT VALUE FROM HALF PRECISION TO SINGLE PRECISION (Under examination, published)

Publication (announcement) number: US20190163474A1

Publication (announcement) date: 2019-05-30

Application number: US15824339

Application date: 2017-11-28

Applicant: Intel Corporation

Inventor: Robert Valentine, Mark Charney, Raanan Sade, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal

IPC: G06F9/30

Abstract: An embodiment of the invention is a processor including execution circuitry to, in response to a decoded instruction, convert a half-precision floating-point value to a single-precision floating-point value and store the single-precision floating-point value in each of the plurality of element locations of a destination register. The processor also includes a decoder and the destination register. The decoder is to decode an instruction to generate the decoded instruction.
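The conversion-and-broadcast behavior described in this abstract can be illustrated in software. The following is only a minimal sketch, not the patented circuitry: the function names and the default element count are hypothetical, and the half-precision input is assumed to arrive as a raw 16-bit IEEE 754 binary16 pattern.

```python
import struct


def half_bits_to_single(bits: int) -> float:
    """Decode a 16-bit IEEE 754 half-precision bit pattern into a float.

    Every half-precision value is exactly representable in single
    precision, so this widening step loses no information.
    """
    (value,) = struct.unpack("<e", struct.pack("<H", bits))
    return value


def convert_and_broadcast(bits: int, num_elements: int = 8) -> list:
    """Convert one half-precision value and copy the result into every
    element location of a (hypothetical) destination register."""
    single = half_bits_to_single(bits)
    return [single] * num_elements


# 0x3C00 is the half-precision encoding of 1.0.
dest = convert_and_broadcast(0x3C00, num_elements=4)
```

Here the destination register is modeled as a plain Python list; the real element count depends on the register width, which the abstract does not state.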

52.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR MULTIPLICATION AND ACCUMULATION OF VECTOR PACKED SIGNED VALUES (Under examination, published)

Publication (announcement) number: US20190102198A1

Publication (announcement) date: 2019-04-04

Application number: US15721616

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Venkateswara R. Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark Charney

IPC: G06F9/30

Abstract: Embodiments of systems, apparatuses, and methods for multiplication and accumulation of signed data values in a processor are described. For example, execution circuitry executes a decoded instruction to multiply selected signed data values from a plurality of packed data element positions in first and second packed data source operands to generate a plurality of first signed result values, sum the plurality of first signed result values to generate one or more second signed result values, accumulate the one or more signed result values with one or more data values from a destination operand to generate one or more third signed result values, and store the one or more third signed result values in one or more packed data element positions in the destination operand.
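The multiply, sum, and accumulate steps in this abstract can be modeled roughly as follows. This is a sketch under assumptions the abstract does not confirm: products are grouped in adjacent pairs, the accumulator is 32 bits wide and wraps on overflow, and all names are hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap an integer into the two's-complement range of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def multiply_accumulate(src1, src2, dest, group=2, acc_bits=32):
    """Multiply packed signed elements, sum each group of products, and
    accumulate the sum into the matching destination element."""
    out = []
    for i, acc in enumerate(dest):
        lanes = range(i * group, (i + 1) * group)
        products = [src1[j] * src2[j] for j in lanes]  # first result values
        total = sum(products)                          # second result values
        out.append(to_signed(acc + total, acc_bits))   # third result values
    return out


result = multiply_accumulate([1, -2, 3, 4], [5, 6, -7, 8], [10, 20])
```

With these inputs the first group contributes 1*5 + (-2)*6 = -7 and the second 3*(-7) + 4*8 = 11, so `result` is `[3, 31]`.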

53.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR MULTIPLICATION, NEGATION, AND ACCUMULATION OF VECTOR PACKED SIGNED VALUES (Under examination, published)

Publication (announcement) number: US20190102185A1

Publication (announcement) date: 2019-04-04

Application number: US15721599

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Venkateswara R. Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark Charney

IPC: G06F9/30, G06F7/48, G06F7/544, G06F17/16

Abstract: Embodiments of systems, apparatuses, and methods for multiplication, negation, and accumulation of data values in a processor are described. For example, execution circuitry executes a decoded instruction to multiply selected data values from a plurality of packed data element positions in first and second packed data source operands to generate a plurality of first result values, sum the plurality of first result values to generate one or more second result values, negate the one or more second result values to generate one or more third result values, accumulate the one or more third result values with one or more data values from the destination operand to generate one or more fourth result values, and store the one or more third result values in one or more packed data element positions in the destination operand.
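The multiply, sum, negate, and accumulate sequence in this abstract can be sketched as below. The grouping of products in adjacent pairs is an assumption, the function name is hypothetical, and the sketch assumes the accumulated values are what is written back to the destination. Python integers are unbounded, so no wrap-around or saturation is modeled.

```python
def multiply_negate_accumulate(src1, src2, dest, group=2):
    """Multiply packed elements, sum each group of products, negate the
    sum, and accumulate the negated sum into the destination element."""
    out = []
    for i, acc in enumerate(dest):
        lanes = range(i * group, (i + 1) * group)
        total = sum(src1[j] * src2[j] for j in lanes)  # summed products
        negated = -total                               # negated sum
        out.append(acc + negated)                      # accumulated value
    return out


result = multiply_negate_accumulate([1, -2, 3, 4], [5, 6, -7, 8], [10, 20])
```

With these inputs the group sums are -7 and 11, so `result` is `[17, 9]`.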

54.

Granted invention
Instruction and logic for sum of square differences (In force)

Publication (announcement) number: US12099838B2

Publication (announcement) date: 2024-09-24

Application number: US17132464

Application date: 2020-12-23

Applicant: Intel Corporation

Inventor: Deepti Aggarwal, Michael Espig, Chekib Nouira, Robert Valentine, Mark Charney

IPC: G06F17/18, G06F9/30, G06F9/38, G06F17/16

CPC classification number: G06F9/3001, G06F9/3802, G06F9/3818, G06F17/16, G06F17/18

Abstract: In an embodiment, a processor includes: a fetch circuit to fetch instructions, the instructions including a sum of squared differences (SSD) instruction; a decode circuit to decode the SSD instruction; and an execution circuit to, during an execution of the decoded SSD instruction, generate an SSD output vector based on a plurality of input vectors, the SSD output vector including a plurality of squared differences values. Other embodiments are described and claimed.
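As a rough illustration of the computation this abstract names, the sketch below produces a vector of element-wise squared differences from two input vectors. The use of exactly two inputs and the function name are assumptions; the abstract does not say how the instruction groups or reduces its inputs.

```python
def squared_differences(a, b):
    """Return the element-wise squared differences of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    return [(x - y) ** 2 for x, y in zip(a, b)]


ssd_vector = squared_differences([1, 5, 9], [4, 5, 7])
total = sum(ssd_vector)  # scalar sum of squared differences, if one is needed
```

Here `ssd_vector` is `[9, 0, 4]` and `total` is 13.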

55.

Granted invention
Apparatuses, methods, and systems for instructions for downconverting a tile row and interleaving with a register (In force)

Publication (announcement) number: US12086595B2

Publication (announcement) date: 2024-09-10

Application number: US17214853

Application date: 2021-03-27

Applicant: Intel Corporation

Inventor: Menachem Adelman, Robert Valentine, Amit Gradstein, Daniel Towner, Mark Charney

IPC: G06F9/30

CPC classification number: G06F9/3016, G06F9/30025, G06F9/30098

Abstract: Systems, methods, and apparatuses relating to interleaving data values. An embodiment includes decoding circuitry to decode a single instruction, the instruction having one or more fields to specify an opcode, one or more fields to specify a location of a first source operand, one or more fields to specify a location of a second source operand, one or more fields to specify a location of a destination operand, and one or more fields to specify an index value to be used to index a row in the first source operand, wherein the opcode is to indicate execution circuitry is to downconvert data elements of the indexed row of the first source operand, interleave the downconverted elements with data elements of the second source operand, and store the interleaved elements in the destination operand; and execution circuitry to execute the decoded instruction according to the opcode.
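The index, downconvert, and interleave steps in this abstract can be sketched as follows. Several details are assumptions not stated in the abstract: the downconversion is taken to be single precision to half precision, the converted element is placed before the second-source element in each pair, and all names are hypothetical.

```python
import struct


def to_fp16_bits(value: float) -> int:
    """Round a float to IEEE 754 half precision and return its 16-bit pattern.

    struct raises OverflowError for magnitudes too large for half precision.
    """
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits


def downconvert_row_and_interleave(tile, row_index, src2):
    """Downconvert the indexed tile row and interleave it with src2."""
    converted = [to_fp16_bits(v) for v in tile[row_index]]
    if len(converted) != len(src2):
        raise ValueError("row and second source must have the same length")
    out = []
    for a, b in zip(converted, src2):
        out.extend((a, b))  # assumed order: converted element first
    return out


tile = [[1.0, 2.0], [0.5, -1.0]]
dest = downconvert_row_and_interleave(tile, 1, [0x1111, 0x2222])
```

The tile is modeled as a list of rows and the second source as a list of 16-bit patterns; `dest` alternates converted row elements with second-source elements.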

56.

Granted invention
Apparatuses, methods, and systems for instructions to request a history reset of a processor core (In force)

Publication (announcement) number: US11966742B2

Publication (announcement) date: 2024-04-23

Application number: US18311810

Application date: 2023-05-03

Applicant: Intel Corporation

Inventor: Eliezer Weissmann, Mark Charney, Michael Mishaeli, Robert Valentine, Itai Ravid, Jason W. Brandt, Gilbert Neiger, Baruch Chaikin, Efraim Rotem

IPC: G06F9/38, G06F9/30

CPC classification number: G06F9/3851, G06F9/30043, G06F9/30076, G06F9/30101, G06F9/3836, G06F9/3842

Abstract: Systems, methods, and apparatuses relating to instructions to reset software thread runtime property histories in a hardware processor are described. In one embodiment, a hardware processor includes a hardware guide scheduler comprising a plurality of software thread runtime property histories; a decoder to decode a single instruction into a decoded single instruction, the single instruction having a field that identifies a model-specific register; and an execution circuit to execute the decoded single instruction to check that an enable bit of the model-specific register is set, and when the enable bit is set, to reset the plurality of software thread runtime property histories of the hardware guide scheduler.

57.

Invention publication
INSTRUCTIONS TO CONVERT FROM FP16 TO FP8 (Under examination, published)

Publication (announcement) number: US20240045677A1

Publication (announcement) date: 2024-02-08

Application number: US17958378

Application date: 2022-10-01

Applicant: Intel Corporation

Inventor: Alexander Heinecke, Menachem Adelman, Mark Charney, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber, Robert Valentine

IPC: G06F9/30

CPC classification number: G06F9/30025, G06F9/3016

Abstract: Techniques for converting FP16 or FP32 data elements to FP8 data elements using a single instruction are described. An exemplary apparatus includes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a source operand, one or more fields to identify a destination operand, and one or more fields for an opcode, the opcode to indicate that execution circuitry is to convert packed half-precision floating-point data or single-precision floating-point data from the identified source to packed FP8 data and store the packed bfloat8 data into corresponding data element positions of the identified destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision floating-point data or single-precision floating-point data from the identified source to packed bfloat8 data and store the packed bfloat8 data into corresponding data element positions.
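One way to picture the FP16-to-FP8 narrowing is the sketch below. It assumes that bfloat8 uses a 1-5-2 layout (sign, 5 exponent bits, 2 mantissa bits) sharing the FP16 exponent width, and that rounding is to nearest with ties to even; the abstract confirms neither, and the function names are hypothetical. Under those assumptions the conversion reduces to keeping the upper byte of the FP16 pattern and rounding on the lower byte.

```python
def fp16_bits_to_bf8_bits(bits: int) -> int:
    """Narrow a 16-bit half-precision pattern to an assumed 1-5-2 8-bit format."""
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    upper = (bits >> 8) & 0xFF
    lower = bits & 0xFF
    if exponent == 0x1F and mantissa != 0:
        return upper | 0x02  # NaN: force a mantissa bit so it stays NaN
    # Round to nearest, ties to even, based on the discarded low byte.
    round_up = lower > 0x80 or (lower == 0x80 and (upper & 1) == 1)
    # A carry out of the mantissa moves into the exponent, so the largest
    # finite inputs round up to infinity.
    return (upper + int(round_up)) & 0xFF


def convert_packed(elements):
    """Convert each packed FP16 pattern to the corresponding 8-bit pattern."""
    return [fp16_bits_to_bf8_bits(e) for e in elements]


packed = convert_packed([0x3C00, 0x3C81, 0x3D80])
```

The sketch covers only the FP16 source case; saturation behavior and rounding-mode control, if any, are not modeled.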

58.

Granted invention
Apparatus and method for performing dual signed and unsigned multiplication of packed data elements (In force)

Publication (announcement) number: US11809867B2

Publication (announcement) date: 2023-11-07

Application number: US17027230

Application date: 2020-09-21

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang

IPC: G06F9/30, G06F7/00

CPC classification number: G06F9/3001, G06F7/00, G06F9/30014, G06F9/3016, G06F9/30036

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed byte data elements; a second source register to store a second plurality of packed byte data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to concurrently multiply each of the packed byte data elements of the first plurality with a corresponding packed byte data element of the second plurality to generate a plurality of products; adder circuitry to add specified sets of the products to generate temporary results for each set of products; zero-extension or sign-extension circuitry to zero-extend or sign-extend the temporary result for each set to generate an extended temporary result for each set; accumulation circuitry to combine each of the extended temporary results with a selected packed data value stored in a third source register to generate a plurality of final results; and a destination register to store the plurality of final results as a plurality of packed data elements in specified data element positions.
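The pipeline in this abstract (multiply bytes, add sets of products, extend, accumulate) can be modeled in a few lines. This is a sketch under assumptions: each set is taken to be four adjacent products, the final results are 32 bits wide and wrap on overflow, and the names are hypothetical. Because Python integers are unbounded, the zero-extension or sign-extension step is implicit in whether the bytes are interpreted as unsigned or signed.

```python
def as_byte(value: int, signed: bool) -> int:
    """Interpret the low 8 bits of value as a signed or unsigned byte."""
    value &= 0xFF
    return value - 0x100 if signed and value & 0x80 else value


def byte_multiply_accumulate(src1, src2, src3, signed=True, group=4):
    """Multiply packed bytes, sum each set of products, and accumulate
    the sum with the matching element of the third source."""
    out = []
    for i, acc in enumerate(src3):
        lanes = range(i * group, (i + 1) * group)
        products = [as_byte(src1[j], signed) * as_byte(src2[j], signed)
                    for j in lanes]
        temp = sum(products)                 # temporary result for this set
        out.append((acc + temp) & 0xFFFFFFFF)  # final result as a 32-bit pattern
    return out


signed_result = byte_multiply_accumulate([0xFF, 1, 2, 3], [2, 2, 2, 2], [100])
unsigned_result = byte_multiply_accumulate([0xFF, 1, 2, 3], [2, 2, 2, 2], [100],
                                           signed=False)
```

The byte 0xFF is read as -1 in the signed case and 255 in the unsigned case, giving `[110]` and `[622]` respectively.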

59.

Granted invention
Apparatus and method for performing dual signed and unsigned multiplication of packed data elements (In force)

Publication (announcement) number: US11573799B2

Publication (announcement) date: 2023-02-07

Application number: US17226986

Application date: 2021-04-09

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang

IPC: G06F9/30

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed doubleword data elements; a second source register to store a second plurality of packed doubleword data elements; and execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply a first doubleword data element from the first source register with a second doubleword data element from the second source register to generate a first quadword product and to concurrently multiply a third doubleword data element from the first source register with a fourth doubleword data element from the second source register to generate a second quadword product; and a destination register to store the first quadword product and the second quadword product as first and second packed quadword data elements.
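The two doubleword multiplications in this abstract can be sketched as below. Which element positions feed each multiplier is not stated in the abstract; this sketch assumes the even-indexed doublewords (positions 0 and 2) of each source, and the function names are hypothetical. Each 64-bit product is returned as an unsigned quadword bit pattern.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def dual_multiply(src1, src2, signed=True):
    """Multiply two doubleword pairs and return two quadword products."""
    results = []
    for lane in (0, 2):  # assumed element positions
        a, b = src1[lane] & MASK32, src2[lane] & MASK32
        if signed:
            a, b = to_signed32(a), to_signed32(b)
        results.append((a * b) & MASK64)  # product as a 64-bit pattern
    return results


signed_products = dual_multiply([0xFFFFFFFF, 0, 2, 0], [2, 0, 3, 0])
unsigned_products = dual_multiply([0xFFFFFFFF, 0, 2, 0], [2, 0, 3, 0],
                                  signed=False)
```

In the signed case 0xFFFFFFFF is read as -1, so the first product is the 64-bit pattern for -2; in the unsigned case it is 0x1FFFFFFFE.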

60.

Granted invention
Apparatus and method for performing dual signed and unsigned multiplication of packed data elements (In force)

Publication (announcement) number: US10802826B2

Publication (announcement) date: 2020-10-13

Application number: US15721412

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang

IPC: G06F9/30

Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements. For example one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed byte data elements; a second source register to store a second plurality of packed byte data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to concurrently multiply each of the packed byte data elements of the first plurality with a corresponding packed byte data element of the second plurality to generate a plurality of products; adder circuitry to add specified sets of the products to generate temporary results for each set of products; zero-extension or sign-extension circuitry to zero-extend or sign-extend the temporary result for each set to generate an extended temporary result for each set; accumulation circuitry to combine each of the extended temporary results with a selected packed data value stored in a third source register to generate a plurality of final results; and a destination register to store the plurality of final results as a plurality of packed data elements in specified data element positions.
