Instruction and logic for early underflow detection and rounder bypass

    Publication number: US10157059B2

    Publication date: 2018-12-18

    Application number: US15280324

    Application date: 2016-09-29

    Abstract: A processor for floating point underflow detection includes circuitry to decode a first instruction and a floating point unit. The decoded instruction, when executed by the processor, may be for performing a fused multiply-add (FMA) operation. The floating point unit includes circuitry to determine a non-normalized result of the first instruction based on a first input, a second input, and a third input. The floating point unit further includes circuitry to determine whether underflow exists in the non-normalized result based on a first exponent of the first input, a second exponent of the second input, and a third exponent of the third input.
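The abstract says only that underflow is decided from the three input exponents, before normalization. A minimal sketch of that idea, not the patented circuit: it assumes binary64 operands and an FMA of the form a*b + c, and the names `exponent_bound` and `definitely_tiny` are hypothetical. It bounds the magnitude of the result from the exponents alone and reports underflow only when the bound is already below the smallest normal number.

```python
import math

MIN_NORMAL_EXP = -1022  # smallest normal binary64 value is 2**-1022


def exponent_bound(x):
    """Return e such that |x| < 2**e, or None when x is zero."""
    if x == 0.0:
        return None
    _, e = math.frexp(x)  # x = m * 2**e with 0.5 <= |m| < 1
    return e


def definitely_tiny(a, b, c):
    """True when the exponents alone prove |a*b + c| < 2**-1022.

    Conservative: a False result does not rule out underflow, because
    cancellation between a*b and c cannot be seen from exponents.
    """
    if not all(math.isfinite(v) for v in (a, b, c)):
        return False  # infinities and NaNs never underflow
    ea, eb, ec = exponent_bound(a), exponent_bound(b), exponent_bound(c)
    bounds = []
    if ea is not None and eb is not None:
        bounds.append(ea + eb)  # |a*b| < 2**(ea + eb)
    if ec is not None:
        bounds.append(ec)       # |c| < 2**ec
    if not bounds:
        return False            # an exact zero result is not an underflow
    # Adding two nonzero terms can carry into one extra bit.
    upper = max(bounds) + (1 if len(bounds) == 2 else 0)
    return upper <= MIN_NORMAL_EXP
```

For example, `definitely_tiny(2.0**-600, 2.0**-600, 0.0)` is true because the product exponent is far below the normal range, so in this sketch the decision needs no mantissa arithmetic at all.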

    Leading change anticipator logic
    Type: granted invention patent (in force)

    Publication number: US09274752B2

    Publication date: 2016-03-01

    Application number: US13729421

    Application date: 2012-12-28

    CPC classification number: G06F7/74 G06F5/012 G06F7/485

    Abstract: In one embodiment, a processor includes at least one floating point unit. The at least one floating point unit may include an adder, leading change anticipator (LCA) logic, and a shifter. The adder may be to add a first operand X and a second operand Y to obtain an output operand having a bit length n. The LCA logic may be to: for each bit position i from n−1 to 1, obtain a set of propagation values and a set of bit values based on the first operand X and the second operand Y; and generate a LCA mask based on the set of propagation values and the set of bit values. The shifter may be to normalize the output operand based on the LCA mask. Other embodiments are described and claimed.
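The abstract does not give the LCA equations, so the following is a deliberately simpler stand-in rather than the claimed logic. It covers only unsigned addition, where the leading one of X + Y sits at the top set bit of X OR Y or one position above it. The sketch predicts that position from the operands, shifts by the prediction, and applies a one-bit correction. The function names and the n-bit field convention are assumptions made for illustration.

```python
def anticipate_msb(x, y):
    """Predict the index of the leading one of x + y from x | y alone.

    For unsigned x and y the true index is this value or this value + 1.
    Returns -1 when both operands are zero.
    """
    return (x | y).bit_length() - 1


def normalize_sum(x, y, n):
    """Left-align x + y in an n-bit field and return (mantissa, shift).

    Assumes x and y are non-negative and x + y fits in n bits.
    """
    s = x + y
    if s == 0:
        return 0, 0
    guess = anticipate_msb(x, y)
    shift = n - 1 - guess       # coarse shift from the anticipated position
    if (s >> (guess + 1)) & 1:  # a carry moved the leading one up by one bit
        shift -= 1
    return s << shift, shift
```

The point of anticipation in hardware is that the prediction is available in parallel with the adder, so the shifter does not wait for a leading-zero count on the finished sum. A real anticipator must also handle effective subtraction, which this sketch does not attempt.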

    Efficient implementation of complex vector fused multiply add and complex vector multiply

    Publication number: US11455167B2

    Publication date: 2022-09-27

    Application number: US16701082

    Application date: 2019-12-02

    Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
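The two-step data flow in this abstract can be sketched in a few lines. This is an illustrative model, not the claimed circuitry: it assumes complex vectors are stored interleaved, with real parts in even elements and imaginary parts in odd elements, and the function name `complex_vector_fma` is hypothetical. Python rounds the multiply and the add separately, so the sketch shows the element shuffling rather than true fused rounding.

```python
def complex_vector_fma(a, b, c):
    """Compute a*b + c elementwise for interleaved complex vectors.

    a is the multiplier, b the multiplicand, c the summand; each is a flat
    list [re0, im0, re1, im1, ...] of equal, even length.
    """
    n = len(a)
    # Step 1: duplicate even (real) elements of b, then temp = a * b_even + c.
    double_even = [b[i & ~1] for i in range(n)]
    temp = [a[i] * double_even[i] + c[i] for i in range(n)]
    # Step 2: duplicate odd (imaginary) elements of b and swap each
    # even/odd pair of a.
    double_odd = [b[i | 1] for i in range(n)]
    swapped = [a[i ^ 1] for i in range(n)]
    # Second multiply-add with the even-position product negated.
    result = []
    for i in range(n):
        product = swapped[i] * double_odd[i]
        if i % 2 == 0:
            product = -product
        result.append(product + temp[i])
    return result
```

With a = 1 + 2i, b = 3 + 4i and c = 5 + 6i, the first step yields [8, 12] and the second yields [0, 16], which matches (1 + 2i)(3 + 4i) + (5 + 6i) = 16i.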

    Efficient implementation of complex vector fused multiply add and complex vector multiply

    Publication number: US10521226B2

    Publication date: 2019-12-31

    Application number: US15941531

    Application date: 2018-03-30

    Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.

    Instruction and Logic for Early Underflow Detection and Rounder Bypass

    Publication number: US20180088940A1

    Publication date: 2018-03-29

    Application number: US15280324

    Application date: 2016-09-29

    CPC classification number: G06F9/30014 G06F7/00 G06F7/483 G06F7/5443

    Abstract: A processor for floating point underflow detection includes circuitry to decode a first instruction and a floating point unit. The decoded instruction, when executed by the processor, may be for performing a fused multiply-add (FMA) operation. The floating point unit includes circuitry to determine a non-normalized result of the first instruction based on a first input, a second input, and a third input. The floating point unit further includes circuitry to determine whether underflow exists in the non-normalized result based on a first exponent of the first input, a second exponent of the second input, and a third exponent of the third input.
