-
Publication number: US10719316B2
Publication date: 2020-07-21
Application number: US15808800
Filing date: 2017-11-09
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
IPC: G06F9/30
Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector element routing circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
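The routing-then-masking behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `masked_permute`, the byte-string vector representation, the choice of 16-, 32- and 64-bit elements as the three widths, and the index wrap-around are all hypothetical.

```python
def masked_permute(src, indices, mask_bits, dest, elem_bytes):
    """Route src elements into output slots, then apply a per-element write mask.

    elem_bytes stands in for the element bit width (2, 4 or 8 bytes here, an
    assumption). Mask bit i guards output element i, so the masking
    granularity follows the element width.
    """
    if elem_bytes not in (2, 4, 8):
        raise ValueError("this sketch models 16-, 32- and 64-bit elements only")
    if len(src) != len(dest) or len(src) % elem_bytes:
        raise ValueError("src and dest must be equal-length whole vectors")
    n = len(src) // elem_bytes
    if len(indices) != n:
        raise ValueError("one source index is needed per output element")
    out = bytearray(dest)  # masked-off elements keep the old destination value
    for i in range(n):
        if (mask_bits >> i) & 1:
            j = indices[i] % n  # assumed: the index wraps to a valid source slot
            out[i * elem_bytes:(i + 1) * elem_bytes] = \
                src[j * elem_bytes:(j + 1) * elem_bytes]
    return bytes(out)
```

Calling the function with `elem_bytes=4` on a 16-byte vector treats it as four 32-bit elements with a 4-bit mask; the same call with `elem_bytes=8` treats it as two 64-bit elements with a 2-bit mask.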
-
Publication number: US10649733B2
Publication date: 2020-05-12
Application number: US16436901
Filing date: 2019-06-10
Applicant: Intel Corporation
Inventor: Cristina S. Anderson, Zeev Sperber, Simon Rubanovich, Benny Eitan, Amit Gradstein
Abstract: A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.
-
Publication number: US10459728B2
Publication date: 2019-10-29
Application number: US15809721
Filing date: 2017-11-10
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and to mask the second and fourth instructions at a second resultant vector granularity.
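A software model may help show how inserting a group into one section and then masking the result fit together. The sketch below is hypothetical and makes several assumptions: vectors are Python lists of elements, `masked_insert` is an invented name, and masked-off positions keep the prior destination value (merging behavior), which the abstract does not specify.

```python
def masked_insert(base, group, section, mask_bits, dest):
    """Insert `group` into one non-overlapping section of `base`, then mask.

    The vector is divided into sections exactly as wide as `group`.
    Mask bit i decides whether result element i is written or whether the
    old `dest` element is kept (merging is an assumption of this sketch).
    """
    g = len(group)
    if g == 0 or len(base) % g or len(dest) != len(base):
        raise ValueError("group must evenly divide the vector; dest must match base")
    if not 0 <= section < len(base) // g:
        raise ValueError("section index out of range")
    unmasked = list(base)
    unmasked[section * g:(section + 1) * g] = group  # the insert itself
    return [unmasked[i] if (mask_bits >> i) & 1 else dest[i]
            for i in range(len(unmasked))]
```

Changing what one list element represents (for example a 32-bit versus a 64-bit value) changes the masking granularity without changing the insert logic, which is one way to read the two granularities the abstract mentions.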
-
Publication number: US10157059B2
Publication date: 2018-12-18
Application number: US15280324
Filing date: 2016-09-29
Applicant: Intel Corporation
Inventor: Simon Rubanovich, Thierry Pons, Zeev Sperber, Amit Gradstein
Abstract: A processor for floating point underflow detection includes circuitry to decode a first instruction and a floating point unit. The decoded instruction, when executed by the processor, may be for performing a fused multiply-add (FMA) operation. The floating point unit includes circuitry to determine a non-normalized result of the first instruction based on a first input, a second input, and a third input. The floating point unit further includes circuitry to determine whether underflow exists in the non-normalized result based on a first exponent of the first input, a second exponent of the second input, and a third exponent of the third input.
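To make the idea of judging underflow from the input exponents alone more concrete, here is a minimal sketch for binary64 values. It is not the circuit claimed in the patent: it is a simple sufficient test built on an upper bound for the magnitude of a*b + c, and the function names and the use of `math.frexp` are assumptions of this example.

```python
import math

MIN_NORMAL_EXP = -1022  # the smallest normal binary64 value is 2**-1022


def _frexp_exponent(x):
    """Return e such that |x| < 2**e, or None when x is zero."""
    if x == 0.0:
        return None
    return math.frexp(x)[1]


def fma_below_min_normal(a, b, c):
    """Decide from exponents alone whether |a*b + c| must be below 2**-1022.

    With |a| < 2**ea, |b| < 2**eb and |c| < 2**ec, the exact result satisfies
    |a*b + c| < 2**(max(ea + eb, ec) + 1). True means the exact result is
    certainly tiny (possibly zero); False means this cheap test cannot tell.
    """
    ea, eb, ec = _frexp_exponent(a), _frexp_exponent(b), _frexp_exponent(c)
    product_exp = None if ea is None or eb is None else ea + eb
    exps = [e for e in (product_exp, ec) if e is not None]
    if not exps:
        return False  # both terms are exactly zero, so nothing underflows
    return max(exps) + 1 <= MIN_NORMAL_EXP
```

The check never inspects significands, so it can run before the multiply-add completes; the price is that it is conservative and misses cases where cancellation between the product and the addend produces a tiny result.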
-
Publication number: US10133577B2
Publication date: 2018-11-20
Application number: US13997791
Filing date: 2012-12-19
Applicant: Intel Corporation
Inventor: Jesus Corbal, Dennis R. Bradford, Jonathan C. Hall, Thomas D. Fletcher, Brian J. Hickmann, Dror Markovich, Amit Gradstein
Abstract: A processor includes an instruction schedule and dispatch (schedule/dispatch) unit to receive a single instruction multiple data (SIMD) instruction to perform an operation on multiple data elements stored in a storage location indicated by a first source operand. The instruction schedule/dispatch unit is to determine a first of the data elements that will not be operated to generate a result written to a destination operand based on a second source operand. The processor further includes multiple processing elements coupled to the instruction schedule/dispatch unit to process the data elements of the SIMD instruction in a vector manner, and a power management unit coupled to the instruction schedule/dispatch unit to reduce power consumption of a first of the processing elements configured to process the first data element.
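The scheme in this abstract can be pictured as a dispatcher that reads a mask operand, skips the lanes whose results will not be written, and reports those lanes as candidates for power reduction. The sketch below is a hypothetical software analogy only: treating the second source operand as a bit mask, the name `dispatch_masked`, and the returned list of gated lanes are all assumptions.

```python
def dispatch_masked(op, src, dest, mask_bits):
    """Apply `op` lane by lane under a write mask.

    Returns (result, gated_lanes). A lane whose mask bit is 0 is never
    evaluated, modelling a processing element that could be powered down
    because its data element does not contribute to the destination.
    """
    if len(src) != len(dest):
        raise ValueError("src and dest must have the same number of lanes")
    result = list(dest)
    gated_lanes = []
    for lane, value in enumerate(src):
        if (mask_bits >> lane) & 1:
            result[lane] = op(value)  # active lane does the work
        else:
            gated_lanes.append(lane)  # idle lane: candidate for power gating
    return result, gated_lanes
```

For example, doubling four lanes under mask `0b0110` computes only lanes 1 and 2 and reports lanes 0 and 3 as gated.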
-
Publication number: US10089076B2
Publication date: 2018-10-02
Application number: US15922074
Filing date: 2018-03-15
Applicant: Intel Corporation
Inventor: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan
Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.
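As a rough illustration of the element-wise scaling described above, the following sketch computes, for each pair of elements, the second-source value multiplied by the base raised to an integer derived from the first-source value. The base of 2, the use of floor to obtain the integer, and the name `scale_elements` are assumptions of this example; special values such as NaN and infinity are not handled.

```python
import math


def scale_elements(src1, src2):
    """Return [src2[i] * 2**floor(src1[i]) for each i] (finite inputs only).

    Base 2 and floor rounding are assumptions; the abstract only says
    "an integer representative of" the first-source element.
    """
    if len(src1) != len(src2):
        raise ValueError("sources must have the same number of elements")
    # math.ldexp(x, k) computes x * 2**k directly on the exponent field
    return [math.ldexp(y, math.floor(x)) for x, y in zip(src1, src2)]
```

Using `math.ldexp` rather than an explicit power and multiply mirrors the point of a scaling operation: for results in the normal range, only the exponent changes and the significand is untouched.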
-
Publication number: US20180088943A1
Publication date: 2018-03-29
Application number: US15721799
Filing date: 2017-09-30
Applicant: Intel Corporation
Inventor: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein
IPC: G06F9/30
CPC classification number: G06F9/3001, G06F9/30032, G06F9/30036, G06F9/30163, G06F9/30167, G06F9/3455
Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.
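The result described in this abstract is an arithmetic sequence, which a few lines of code can make concrete. This is a sketch with hypothetical names (`stride_sequence`, `count`); it assumes a non-negative stride so that the first element is the smallest, and it ignores element-width wrap-around.

```python
def stride_sequence(count, stride, offset):
    """Return [offset, offset + stride, offset + 2*stride, ...] with `count` items.

    The abstract requires at least four integers; a non-negative stride is an
    assumption here so that the first (smallest) element equals the offset.
    """
    if count < 4:
        raise ValueError("the described result holds at least four integers")
    if stride < 0:
        raise ValueError("this sketch assumes a non-negative stride")
    return [offset + i * stride for i in range(count)]
```

With `count=8`, `stride=3` and `offset=5` the function returns `[5, 8, 11, 14, 17, 20, 23, 26]`, the kind of index vector that loop vectorization often needs.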
-
Publication number: US20180088942A1
Publication date: 2018-03-29
Application number: US15721796
Filing date: 2017-09-30
Applicant: Intel Corporation
Inventor: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein
IPC: G06F9/30
CPC classification number: G06F9/3001, G06F9/30032, G06F9/30036, G06F9/30163, G06F9/30167, G06F9/3455
Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.
-
Publication number: US20170185379A1
Publication date: 2017-06-29
Application number: US14757942
Filing date: 2015-12-23
Applicant: Intel Corporation
Inventor: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
CPC classification number: G06F7/4876, G06F7/485, G06F7/49915
Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation and, in response: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product value to the third FP value to generate a first result value; rounds the first result value to generate a first FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value to the third FP value to generate a second result value; subtracts the first FMA value from the second result value to obtain a third result value, which can then be normalized and rounded to give the FMA low result; and sends the FMA low result to an application.
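The sequence of steps in this abstract amounts to returning the rounding error of a fused multiply-add. The sketch below reproduces that arithmetic in software using exact rational numbers in place of the unrounded hardware intermediate. It is an illustration under assumptions, not the functional unit itself: the name `fma_high_low` is hypothetical, inputs must be finite binary64 values, and the result is assumed not to overflow.

```python
from fractions import Fraction


def fma_high_low(a, b, c):
    """Return (high, low) for a*b + c on finite binary64 inputs.

    high is the correctly rounded FMA value. low is the rounded difference
    between the exact, unrounded a*b + c and high, corresponding to the
    "FMA low result" in the abstract.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # unrounded result
    high = float(exact)                  # one rounding, to nearest even
    low = float(exact - Fraction(high))  # the part lost by that rounding
    return high, low
```

For a = b = 1 + 2**-52 and c = 0, the exact product is 1 + 2**-51 + 2**-104, so the function returns `(1 + 2**-51, 2**-104)`: the high part is what an ordinary FMA delivers and the low part is the term that rounding discarded.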
-
Publication number: US20170123799A1
Publication date: 2017-05-04
Application number: US14930761
Filing date: 2015-11-03
Applicant: Intel Corporation
Inventor: Zeev Sperber, Tomer Weiner, Amit Gradstein, Simon Rubanovich, Alex Gerber
IPC: G06F9/30
CPC classification number: G06F9/30072, G06F9/30101, G06F9/3016, G06F9/30167, G06F9/3832, G06F9/3836
Abstract: In one embodiment, a processor includes a fetch logic to fetch instructions, a decode logic to decode the instructions, and an execution logic to execute at least some of the instructions. The decode logic may identify a first instruction having a first immediate value, accumulate the first immediate value with a folded immediate value associated with a first operand of the first instruction, and prevent the first instruction from provision to the execution logic, such that the first instruction is not to be executed within the execution logic. Other embodiments are described and claimed.
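A toy decoder can show what folding an immediate at decode time might look like. Everything here is hypothetical: the tuple-based instruction format, the `addi` and `load` mnemonics, and especially the way a later `load` consumes the folded value as an extra displacement, which the abstract does not describe.

```python
def decode_with_folding(instructions):
    """Fold add-immediate instructions at decode; return (issued, folded).

    folded maps a register name to the sum of immediates that were absorbed
    at decode and never sent to execution. A later consumer (only "load" in
    this toy model) carries the pending sum along as a displacement.
    """
    folded = {}
    issued = []
    for ins in instructions:
        if ins[0] == "addi":
            _, reg, imm = ins
            folded[reg] = folded.get(reg, 0) + imm  # accumulate at decode
            continue                                # not sent to execution
        if ins[0] == "load":
            _, dst, base = ins
            issued.append(("load", dst, base, folded.get(base, 0)))
            folded.pop(dst, None)  # dst is overwritten; its pending sum is dead
        else:
            raise ValueError("unknown toy instruction: %r" % (ins,))
    return issued, folded
```

Decoding two `addi` instructions on `r1` followed by a `load` through `r1` issues a single operation to the execution logic, with the two immediates already summed.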