In-lane vector shuffle instructions
    111.
    发明授权

    公开(公告)号:US10514917B2

    公开(公告)日:2019-12-24

    申请号:US15801652

    申请日:2017-11-02

    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.

    In-lane vector shuffle instructions
    112.
    发明授权

    公开(公告)号:US10514916B2

    公开(公告)日:2019-12-24

    申请号:US15613809

    申请日:2017-06-05

    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.

    Thread pause processors, methods, systems, and instructions

    公开(公告)号:US10467011B2

    公开(公告)日:2019-11-05

    申请号:US14336596

    申请日:2014-07-21

    Abstract: A processor of an aspect includes a decode unit to decode a thread pause instruction from a first thread. A back-end portion of the processor is coupled with the decode unit. The back-end portion of the processor, in response to the thread pause instruction, is to pause processing of subsequent instructions of the first thread for execution. The subsequent instructions occur after the thread pause instruction in program order. The back-end portion, in response to the thread pause instruction, is also to keep at least a majority of the back-end portion of the processor, empty of instructions of the first thread, except for the thread pause instruction, for a predetermined period of time. The majority may include a plurality of execution units and an instruction queue unit.

    IN-LANE VECTOR SHUFFLE INSTRUCTIONS
    118.
    发明申请

    公开(公告)号:US20180121198A1

    公开(公告)日:2018-05-03

    申请号:US15801652

    申请日:2017-11-02

    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.

    Instruction and Logic for Early Underflow Detection and Rounder Bypass

    公开(公告)号:US20180088940A1

    公开(公告)日:2018-03-29

    申请号:US15280324

    申请日:2016-09-29

    CPC classification number: G06F9/30014 G06F7/00 G06F7/483 G06F7/5443

    Abstract: A processor for floating point underflow detection includes circuitry to decode a first instruction and a floating point unit. The decoded instruction, when executed by the processor, may be for performing a fused multiply-add (FMA) operation. The floating point unit includes circuitry to determine a non-normalized result of the first instruction based on a first input, a second input, and a third input. The floating point unit further includes circuitry to determine whether underflow exists in the non-normalized result based on a first exponent of the first input, a second exponent of the second input, and a third exponent of the third input.

Patent Agency Ranking