Abstract:
PROBLEM TO BE SOLVED: To provide a method, a device, a system, and a machine-readable medium for performing a packed multiply high with round and shift operation. SOLUTION: The method comprises: receiving a first operand having a first set of L data elements; receiving a second operand having a second set of L data elements; generating a set of L products by multiplying L pairs of data elements, each pair consisting of a first data element from the first set and a second data element from the corresponding data element position of the second set; generating L rounded values by rounding each of the L products; generating L scaled values by scaling each of the L rounded values; and rounding off each of the L scaled values and storing the result in the data element position corresponding to its pair of data elements. COPYRIGHT: (C)2005,JPO&NCIPI
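The steps in the SOLUTION can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the claimed implementation: the abstract does not give element widths, the rounding constant, or the scale factor, so the sketch assumes signed 16-bit elements, rounding by adding 2^14 to the 32-bit product, scaling by an arithmetic right shift of 15, and a final reduction that keeps the low 16 bits. The function name `packed_mul_high_round_shift` and the helper `to_int16` are hypothetical.

```python
from typing import List

ELEMENT_BITS = 16          # assumed element width (not stated in the abstract)
SHIFT = ELEMENT_BITS - 1   # assumed scale: arithmetic right shift by 15
ROUND = 1 << (SHIFT - 1)   # assumed rounding constant: half of the scale step


def to_int16(value: int) -> int:
    """Keep the low 16 bits and reinterpret them as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value >= 0x8000 else value


def packed_mul_high_round_shift(a: List[int], b: List[int]) -> List[int]:
    """Multiply, round, scale, and reduce each pair of corresponding elements."""
    if len(a) != len(b):
        raise ValueError("both operands must hold the same number L of elements")
    result = []
    for x, y in zip(a, b):
        product = x * y              # one of the L products
        rounded = product + ROUND    # one of the L rounded values
        scaled = rounded >> SHIFT    # one of the L scaled values
        result.append(to_int16(scaled))  # stored at the pair's element position
    return result


if __name__ == "__main__":
    print(packed_mul_high_round_shift([16384, 3, -3], [16384, 16384, 16384]))
```

Under these assumptions each result is the product interpreted as a fixed-point value with 15 fractional bits, rounded half up. The only case that wraps is -32768 × -32768, whose scaled value does not fit in 16 signed bits.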
Abstract:
A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set that includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field. The format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field. Only one of the different values may be placed in each of these fields on each occurrence of an instruction in the vector friendly instruction format in instruction streams.
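The idea that each field holds exactly one value per instruction, and that different values select different operations, can be illustrated with a small record type. This is a hedged sketch only: the abstract gives no bit widths or field order, so the widths in `FIELD_WIDTHS`, the packing order, and the class name `VectorInstruction` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical widths in bits; the abstract does not specify an encoding.
FIELD_WIDTHS = {
    "base_operation": 8,
    "modifier": 2,
    "alpha": 1,
    "beta": 3,
    "data_element_width": 1,
}


@dataclass(frozen=True)
class VectorInstruction:
    """One occurrence of an instruction: exactly one value per field."""
    base_operation: int
    modifier: int
    alpha: int
    beta: int
    data_element_width: int

    def encode(self) -> int:
        """Pack the fields into one integer, first field most significant."""
        word = 0
        for name, width in FIELD_WIDTHS.items():
            value = getattr(self, name)
            if not 0 <= value < (1 << width):
                raise ValueError(f"{name}={value} does not fit in {width} bits")
            word = (word << width) | value
        return word

    @classmethod
    def decode(cls, word: int) -> "VectorInstruction":
        """Unpack the fields, reading the least significant field first."""
        values = {}
        for name, width in reversed(list(FIELD_WIDTHS.items())):
            values[name] = word & ((1 << width) - 1)
            word >>= width
        return cls(**values)


if __name__ == "__main__":
    inst = VectorInstruction(0x5A, 1, 0, 5, 1)
    print(hex(inst.encode()), VectorInstruction.decode(inst.encode()))
```

Changing the value in any single field, for example `beta`, yields a different encoded word, which is how one format can express many variants of the same base operation.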