Abstract:
PROBLEM TO BE SOLVED: To optimize Advanced Encryption Standard (AES) encryption and decryption in parallel modes of operation. SOLUTION: The throughput of an encryption/decryption operation is increased in a system having a pipelined execution unit. Independent encryptions (or decryptions) of different data blocks may be performed in parallel by dispatching an AES round instruction in each cycle. COPYRIGHT: (C)2009,JPO&INPIT
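The throughput argument can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions: the function name `cycles_to_finish`, the 4-cycle round latency, and the 10-round count are illustrative choices, not values taken from the abstract. The model issues at most one round instruction per cycle and lets a block issue its next round only after its previous round has completed.

```python
def cycles_to_finish(num_blocks, rounds, latency):
    """Toy model of a pipelined unit that accepts one AES round per cycle.

    Assumptions (hypothetical, for illustration only): every round has the
    same latency, and the rounds of one block depend on each other while
    different blocks are fully independent.
    """
    ready = [0] * num_blocks       # cycle at which each block may issue its next round
    left = [rounds] * num_blocks   # rounds still to be issued per block
    cycle = 0
    finish = 0
    while any(left):
        # Issue at most one round instruction in this cycle.
        for b in range(num_blocks):
            if left[b] and ready[b] <= cycle:
                left[b] -= 1
                ready[b] = cycle + latency
                finish = max(finish, ready[b])
                break
        cycle += 1
    return finish


one_block = cycles_to_finish(1, rounds=10, latency=4)
four_blocks = cycles_to_finish(4, rounds=10, latency=4)
print(one_block, four_blocks)
```

With these assumed numbers a single block needs 40 cycles because the pipeline sits idle between dependent rounds, while four interleaved blocks finish in 43 cycles in total, since a round instruction is dispatched in every cycle.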
Abstract:
PROBLEM TO BE SOLVED: To efficiently handle floating-point exceptions in a processor that executes SIMD instructions. SOLUTION: The method includes the steps of: identifying a numerical exception for a SIMD floating-point operation; initiating a first SIMD micro-operation to generate a first packed partial result of the SIMD floating-point operation; initiating a second SIMD micro-operation to generate a second packed partial result of the SIMD floating-point operation; initiating a SIMD denormalization micro-operation to combine the first and second packed partial results and to generate a third packed result having a denormal element by denormalizing a first element of the combined first and second packed partial results; storing the third packed result of the SIMD floating-point operation; and setting, in the first packed partial result, a flag identifying the denormal element of the third packed result.
Abstract:
PROBLEM TO BE SOLVED: To efficiently handle floating-point exceptions in a processor that executes single-instruction multiple-data (SIMD) instructions. SOLUTION: A method is provided for handling the floating-point exceptions in the processor that executes the SIMD instructions. The method includes: a step of initiating a first SIMD micro-operation to generate first packed partial results of the SIMD operation; a step of initiating a second SIMD micro-operation to generate second packed partial results of the SIMD operation; a step of initiating a SIMD denormalization micro-operation to combine the first and the second packed partial results and to generate third packed results having denormalized elements by denormalizing a first element of first and second packed partial results; a step of storing third packed results of the SIMD operation; and a step of setting a flag to identify denormalized elements of the third packed results in the first packed partial results. COPYRIGHT: (C)2010,JPO&INPIT
Abstract:
PROBLEM TO BE SOLVED: To efficiently handle floating-point exceptions in a processor that executes SIMD instructions. SOLUTION: The method comprises: identifying a numerical exception for a SIMD floating-point operation; initiating a first SIMD micro-operation to generate a first packed partial result for the SIMD floating-point operation; initiating a second SIMD micro-operation to generate a second packed partial result for the SIMD floating-point operation; initiating a SIMD denormalization micro-operation to combine the first and second packed partial results and to denormalize a first element of the combined first and second packed partial results to generate a third packed result having a denormal element; storing the third packed result for the SIMD floating-point operation; and setting a flag identifying the denormal element of the third packed result in the first packed partial result.
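To make the notion of a "denormal element" and a per-element flag concrete, the following sketch classifies the elements of a packed single-precision result. It does not model the claimed micro-operation sequence; the helper names `is_denormal_f32` and `denormal_flags` and the list-based packed vector are assumptions made for illustration.

```python
import struct


def is_denormal_f32(x):
    """Return True if x, stored as IEEE 754 binary32, is a denormal (subnormal)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent field
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    # Denormal: exponent field is zero and fraction is nonzero.
    return exponent == 0 and fraction != 0


def denormal_flags(packed):
    """One flag per element of a packed result (hypothetical representation)."""
    return [is_denormal_f32(x) for x in packed]


flags = denormal_flags([1.0, 1e-40, 0.0, -1e-39])
print(flags)
```

Here the second and fourth elements fall below the smallest normal binary32 magnitude (about 1.18e-38), so only their flags are set.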
Abstract:
Embodiments disclosed herein relate to systems and methods for zeroing a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zero instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zero instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.
Abstract:
In one embodiment, the present invention includes a method for receiving a reciprocal instruction and an operand in a processor, accessing an entry of a lookup table based on a portion of the operand and the instruction, generating an encoder output based on a type of the reciprocal instruction and whether the reciprocal instruction is a legacy instruction, and selecting portions of the lookup table entry and input operand to be provided to a reciprocal logic unit based on the encoder output. Other embodiments are described and claimed.
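A table-driven reciprocal approximation of the general kind described here can be sketched as follows. This is a minimal illustration, not the claimed design: the 7-bit table index, the midpoint-based table contents, and the name `approx_reciprocal` are assumptions, and the encoder that distinguishes legacy from non-legacy instruction types is not modeled.

```python
import math

TABLE_BITS = 7  # assumed index width; the abstract does not specify one

# Each entry approximates 1/m for m in one sub-interval of [1, 2),
# evaluated at the sub-interval midpoint.
TABLE = [1.0 / (1.0 + (i + 0.5) / (1 << TABLE_BITS)) for i in range(1 << TABLE_BITS)]


def approx_reciprocal(x):
    """Approximate 1/x using a lookup indexed by the leading mantissa bits."""
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        raise ValueError("special operands are not handled in this sketch")
    mant, exp = math.frexp(abs(x))   # abs(x) == mant * 2**exp, mant in [0.5, 1)
    m = mant * 2.0                   # rescale mantissa to [1, 2)
    index = int((m - 1.0) * (1 << TABLE_BITS))
    seed = TABLE[index]              # approximately 1/m
    # abs(x) == m * 2**(exp - 1), so 1/abs(x) == (1/m) * 2**-(exp - 1).
    result = math.ldexp(seed, -(exp - 1))
    return math.copysign(result, x)


for value in (3.0, 0.1, -7.5, 1.0):
    print(value, approx_reciprocal(value))
```

With a 7-bit index the relative error of the result stays below 2**-8; a real design would typically combine part of the table entry with low-order operand bits to refine this, which is the selection step the abstract refers to.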
Abstract:
A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.
Abstract:
The described embodiments relate to a processor, a system on a chip, and a system for executing a format conversion instruction. In one example, a processor has a plurality of cores, including a core that, in response to a format conversion instruction having a first source operand that includes a first 32-bit single-precision floating-point data element and a second source operand that includes a second 32-bit single-precision floating-point data element, is to: convert the first 32-bit single-precision floating-point data element to a first 16-bit floating-point data element, wherein, when the first 32-bit single-precision floating-point data element is a normal data element, the conversion is to be performed according to a rounding mode specified by the format conversion instruction, and the first 16-bit floating-point data element is to have a sign bit, an 8-bit exponent, seven explicit mantissa bits, and one implicit mantissa bit, and wherein, when the first 32-bit single-precision floating-point data element is a not-a-number (NaN) data element, the first 16-bit floating-point data element is to have a mantissa with a most significant bit set to one; convert the second 32-bit single-precision floating-point data element to a second 16-bit floating-point data element, wherein, when the second 32-bit single-precision floating-point data element is a normal data element, the conversion is to be performed according to the rounding mode, and the second 16-bit floating-point data element is to have a sign bit, an 8-bit exponent, seven explicit mantissa bits, and one implicit mantissa bit, and wherein, when the second 32-bit single-precision floating-point data element is a NaN data element, the second 16-bit floating-point data element is to have a mantissa with a most significant bit set to one; and store the first 16-bit floating-point data element in a lower-order half of a destination register and the second 16-bit floating-point data element in a higher-order half of the destination register. (Machine translation by Google Translate, not legally binding)
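The conversion described above can be sketched in software. The 16-bit layout (sign bit, 8-bit exponent, 7 explicit mantissa bits) matches bfloat16. The sketch assumes round-to-nearest-even as the rounding mode, models a 32-bit destination holding just one pair, and applies the same rounding to denormal inputs, none of which the abstract fixes; the function names are hypothetical.

```python
import struct


def f32_bits(x):
    """Bit pattern of x as an IEEE 754 binary32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def f32_to_bf16_bits(x):
    """Convert to a 16-bit pattern: 1 sign, 8 exponent, 7 explicit mantissa bits."""
    bits = f32_bits(x)
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF and fraction != 0:
        # NaN input: truncate and force the mantissa's most significant bit to one.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even (assumed rounding mode).
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def pack_pair(first, second):
    """First result in the low-order half, second in the high-order half."""
    return (f32_to_bf16_bits(second) << 16) | f32_to_bf16_bits(first)


packed = pack_pair(1.0, -2.0)
print(hex(packed))
```

Here 1.0 converts to 0x3F80 and -2.0 to 0xC000, so the packed 32-bit value is 0xC0003F80.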
Abstract:
Disclosed embodiments relate to a processor and a method for executing a format conversion instruction. In one example, a processor comprises a decode unit to decode the format conversion instruction and an execution unit to execute the decoded format conversion instruction. The format conversion instruction indicates a location of a first source operand, a location of a second source operand, a destination register, a writemask register, and a type of masking, the first source operand to include a first plurality of 32-bit single-precision floating point data elements, the second source operand to include a second plurality of 32-bit single-precision floating point data elements, the writemask register to store a plurality of mask bits each corresponding to a data element position in the destination register, the type of masking to be either zeroing masking or merging masking.
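The difference between zeroing masking and merging masking can be shown with a short sketch. The function `apply_writemask`, the integer mask encoding (bit i controls element position i), and the list-based register model are assumptions for illustration only.

```python
def apply_writemask(results, dest, mask, zeroing):
    """Combine freshly computed results with the old destination contents.

    For each element position i:
      - mask bit i set:            the new result is written;
      - mask bit i clear, zeroing: the destination element becomes zero;
      - mask bit i clear, merging: the old destination element is kept.
    """
    out = []
    for i, (new, old) in enumerate(zip(results, dest)):
        if (mask >> i) & 1:
            out.append(new)
        elif zeroing:
            out.append(0)
        else:
            out.append(old)
    return out


new_values = [1, 2, 3, 4]
old_dest = [9, 9, 9, 9]
zeroed = apply_writemask(new_values, old_dest, 0b0101, zeroing=True)
merged = apply_writemask(new_values, old_dest, 0b0101, zeroing=False)
print(zeroed, merged)
```

With mask 0b0101, positions 0 and 2 receive new results in both cases; zeroing masking yields [1, 0, 3, 0], while merging masking yields [1, 9, 3, 9].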