METHODS, APPARATUS, INSTRUCTIONS AND LOGIC TO PROVIDE VECTOR PACKED TUPLE CROSS-COMPARISON FUNCTIONALITY

    Publication No.: SG11201704466QA

    Publication date: 2017-07-28

    Application No.: SG11201704466Q

    Filing date: 2015-12-14

    Applicant: INTEL CORP

    Abstract: Instructions and logic provide SIMD vector packed tuple cross-comparison functionality. Some processor embodiments include first and second registers with a variable plurality of data fields, each of the data fields to store an element of a first data type. The processor executes a SIMD instruction for vector packed tuple cross-comparison in some embodiments, which for each data field of a portion of data fields in a tuple of the first register, compares its corresponding element with every element of a corresponding portion of data fields in a tuple of the second register and sets a mask bit corresponding to each element of the second register portion, in a bit-mask corresponding to each unmasked element of the corresponding first register portion, according to the corresponding comparison. In some embodiments bit-masks are shifted by corresponding elements in data fields of a third register. The comparison type is indicated by an immediate operand.
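
The cross-comparison described in this abstract can be illustrated with a minimal scalar sketch. It is not the patented implementation: the function name, the immediate-to-comparison table, the list-based "registers", and the choice to leave masked-off results at zero are all assumptions made for illustration.

```python
import operator

# Hypothetical encoding: the abstract only says that an immediate operand
# selects the comparison type, not which value means what.
COMPARISONS = {0: operator.eq, 1: operator.lt, 2: operator.le, 4: operator.ne}


def tuple_cross_compare(src1, src2, tuple_size, imm=0, write_mask=None, shifts=None):
    """Return one bit-mask per element of src1.

    Bit j of the mask for src1[i] is set when the selected comparison holds
    between src1[i] and the j-th element of the matching tuple in src2.
    """
    assert len(src1) == len(src2) and len(src1) % tuple_size == 0
    compare = COMPARISONS[imm]
    result = [0] * len(src1)
    for i, a in enumerate(src1):
        if write_mask is not None and not write_mask[i]:
            continue  # masked-off element: bit-mask left at zero (assumption)
        base = (i // tuple_size) * tuple_size  # start of this element's tuple
        bits = 0
        for j in range(tuple_size):
            if compare(a, src2[base + j]):
                bits |= 1 << j
        if shifts is not None:
            bits <<= shifts[i]  # optional shift by the third register's element
        result[i] = bits
    return result
```

With two 4-element tuples, `tuple_cross_compare([1, 2, 3, 4, 5, 6, 7, 8], [4, 3, 2, 1, 5, 5, 8, 8], 4)` reports, for each element of the first operand, which positions of the matching tuple in the second operand are equal to it.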

    Instruction and logic to provide vector compress and rotate functionality

    Publication No.: GB2507655B

    Publication date: 2015-06-24

    Application No.: GB201318167

    Filing date: 2013-10-14

    Applicant: INTEL CORP

    Abstract: Instructions and logic provide vector compress and rotate functionality. A processor may include a mask register, a decoder, and an execution unit. The mask register may include a data field, wherein the data field corresponds to an element location in a vector. The decoder may be coupled to the mask register. The decoder may decode an instruction to obtain a decoded instruction. The decoded instruction may specify a vector source, the mask register, a vector destination, and a vector destination offset location. The execution unit is coupled to the decoder. The execution unit may read an unmasked value in the data field; copy a vector element from the vector source to a location adjacent to the element; change the unmasked value to a masked value; determine that the vector destination is full; store a vector destination operand associated with the vector destination in a memory; and re-execute the instruction using the masked value and the vector destination offset location.
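
The sequence of steps in this abstract can be sketched in scalar form as follows. This is only an illustration under stated assumptions: the function name is hypothetical, registers and memory are modelled as Python lists, a mask value of 1 is taken to mean "unmasked", and re-execution of the instruction is modelled simply as the loop continuing with the updated mask and an offset reset to zero.

```python
def compress_and_store(source, mask, dest, offset, memory):
    """Pack unmasked source elements into adjacent destination slots.

    mask[i] == 1 marks an element still to be copied (assumed encoding).
    Returns the updated mask and the next free destination offset.
    """
    mask = list(mask)  # work on a copy so the caller's mask is untouched
    for i, unmasked in enumerate(mask):
        if not unmasked:
            continue
        dest[offset] = source[i]  # copy next to the previously written element
        offset += 1
        mask[i] = 0               # change the unmasked value to a masked value
        if offset == len(dest):   # destination is full
            memory.extend(dest)   # store the destination operand in memory
            offset = 0            # continue as if re-executing from offset 0
    return mask, offset
```

Starting from a non-zero offset lets a later call append to a partially filled destination, which is the role the destination offset plays in the abstract.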

    Limited range vector memory access instructions, processors, methods, and systems

    Publication No.: GB2513970A

    Publication date: 2014-11-12

    Application No.: GB201403976

    Filing date: 2014-03-06

    Applicant: INTEL CORP

    Abstract: A processor comprises a plurality of packed data registers 207, and an execution unit 209 coupled to the registers that, in response to a limited range vector memory access instruction 203, accesses memory locations in only a restricted or limited range 220 of memory 210. The limited range vector memory access instruction indicates source packed memory indices 213 which include a plurality of 8-bit and/or 16-bit memory indices, and these indices specify the memory locations to be accessed. The registers might also include source packed data (to be stored in the limited memory in response to scatter operations), or destination storage locations for packed data (to load data in response to the load operations). The instruction might also indicate a source packed data operation mask 216 to prevent some of the bits from being stored or loaded.
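
A minimal scalar sketch of a limited range load and scatter is given below. It rests on assumptions not stated in the abstract: the function names are hypothetical, the limited range is modelled as a window starting at a `base` address, and the operation mask is applied per element.

```python
def limited_range_gather(memory, base, indices, op_mask, dest, index_bits=8):
    """Load memory[base + idx] for each enabled element; others keep dest."""
    limit = 1 << index_bits  # 8-bit indices can only reach 256 locations
    out = list(dest)
    for i, (idx, enabled) in enumerate(zip(indices, op_mask)):
        if not enabled:
            continue  # masked element: nothing is loaded
        if not 0 <= idx < limit:
            raise ValueError(f"index {idx} does not fit in {index_bits} bits")
        out[i] = memory[base + idx]
    return out


def limited_range_scatter(memory, base, indices, op_mask, source, index_bits=8):
    """Store source[i] to memory[base + idx] for each enabled element."""
    limit = 1 << index_bits
    for idx, enabled, value in zip(indices, op_mask, source):
        if not enabled:
            continue  # masked element: nothing is stored
        if not 0 <= idx < limit:
            raise ValueError(f"index {idx} does not fit in {index_bits} bits")
        memory[base + idx] = value
```

Because every index fits in 8 (or 16) bits, no access can fall outside a window of 256 (or 65,536) locations, which is what keeps the access range limited.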

    Instructions and logic to vectorize conditional loops

    Publication No.: GB2511198A

    Publication date: 2014-08-27

    Application No.: GB201323062

    Filing date: 2013-12-27

    Applicant: INTEL CORP

    Abstract: SIMD vectorisation of conditional loops is provided. A vector of counts is initialized (1610) to n count values, the vector having n data fields to store elements having a partition size of m bytes (e.g. 4-byte double words); a decision vector is obtained (1620) and used to generate a mask (1630); a vector expand instruction is received (1640), which has the count vector as a source, uses the generated mask and specifies a destination vector of n elements, each having a size of m bytes; the instruction causes the copying of consecutive source vector data into unmasked destination vector elements (1620). n varies according to the received instruction. Masked elements of the destination vector are set to zero. Counts of the condition decisions are also stored (1660, 1670). One application is processing loops in benchmark suites for online clustering based on finding medians to assign points to their nearest centre.
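
The vector expand step in this abstract can be sketched in scalar form. The function name and the list-based registers are assumptions; the zeroing of masked destination elements follows the abstract.

```python
def vector_expand(source, mask):
    """Copy consecutive source elements into the unmasked destination slots.

    mask[i] == 1 marks an unmasked destination element (assumed encoding);
    masked destination elements are set to zero.
    """
    dest = [0] * len(mask)
    next_src = 0  # position of the next unread source element
    for i, unmasked in enumerate(mask):
        if unmasked:
            dest[i] = source[next_src]
            next_src += 1
    return dest
```

For example, with a count vector `[0, 1, 2, 3, 4, 5, 6, 7]` and a mask derived from a decision vector, each element where the condition holds receives the next consecutive count, so a conditionally incremented counter can be computed for a whole vector at once.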

    Instruction and logic to provide vector scatter-op and gather-op functionality

    Publication No.: DE112011105664T5

    Publication date: 2014-08-21

    Application No.: DE112011105664

    Filing date: 2011-09-26

    Applicant: INTEL CORP

    Abstract: Instructions and logic provide vector scatter-op and/or gather-op functionality. In some embodiments, in response to an instruction that specifies a gather and a second operation, a destination register, an operand register, and a memory address, execution units read values in a mask register, wherein fields in the mask register correspond to offset indices in the indices register for data elements in memory. A first mask value indicates that the element has not been gathered from memory, and a second value indicates that the element does not need to be gathered or has already been gathered. For each field having the first value, the data element is gathered from memory into the corresponding destination register location, and the corresponding value in the mask register is changed to the second value. When all mask register fields have the second value, the second operation is performed using corresponding data in the destination and operand registers to generate results.
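
The gather-op flow in this abstract can be sketched in scalar form as follows. The function name, the list-based registers, the use of 1 and 0 for the first and second mask values, and addition as the default second operation are assumptions made for illustration.

```python
import operator


def gather_op(memory, base, indices, mask, dest, operand, op=operator.add):
    """Gather pending elements, then apply the second operation.

    mask[i] == 1: element not yet gathered (first value, assumed encoding).
    mask[i] == 0: element needs no gather or is already gathered (second value).
    Returns the results and the final mask.
    """
    mask = list(mask)
    dest = list(dest)
    for i, pending in enumerate(mask):
        if pending:
            dest[i] = memory[base + indices[i]]  # gather into the destination
            mask[i] = 0                          # mark the element as gathered
    # Every mask field now holds the second value, so the second operation
    # runs on corresponding destination and operand elements.
    results = [op(d, s) for d, s in zip(dest, operand)]
    return results, mask
```

Updating the mask element by element is what would allow an interrupted gather to resume without reloading elements that have already arrived; this sketch runs to completion and does not model interruption.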
