Abstract:
PROBLEM TO BE SOLVED: To multiply two complex numbers using a single multiply-accumulate instruction. SOLUTION: A multiply-add circuit includes first (810), second (811), third (812), and fourth (813) multipliers, each of which receives a corresponding set of data elements of first and second packed data. The multiply-add circuit further includes a first adder (850) coupled to the first and second multipliers (810 and 811), and a second adder (851) coupled to the third and fourth multipliers (812 and 813). A third storage area (871) is coupled to the adders (850 and 851). The third storage area (871) includes first and second fields for saving the outputs of the first and second adders (850 and 851), respectively, as first and second data elements of a third packed data. COPYRIGHT: (C)2006,JPO&NCIPI
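As an illustration of how one multiply-add over four element pairs can yield a complex product, here is a minimal Python sketch. The operand layout (which real and imaginary parts go in which slot, and the negated imaginary part in the second source) is an assumption made for this example; the abstract does not specify it. The function names `packed_multiply_add` and `complex_multiply` are hypothetical.

```python
def packed_multiply_add(src1, src2):
    """Multiply four element pairs, then add adjacent products into two results."""
    assert len(src1) == 4 and len(src2) == 4
    products = [a * b for a, b in zip(src1, src2)]    # four multipliers
    return [products[0] + products[1],                # first adder
            products[2] + products[3]]                # second adder


def complex_multiply(a_re, a_im, b_re, b_im):
    """Compute (a_re + j*a_im) * (b_re + j*b_im) with one packed multiply-add."""
    # Assumed layout, not taken from the abstract:
    #   src1 = [a_re,  a_im, a_re, a_im]
    #   src2 = [b_re, -b_im, b_im, b_re]
    src1 = [a_re, a_im, a_re, a_im]
    src2 = [b_re, -b_im, b_im, b_re]
    real, imag = packed_multiply_add(src1, src2)
    return real, imag


if __name__ == "__main__":
    print(complex_multiply(1, 2, 3, 4))
```

With this layout the first result field holds the real part (a_re*b_re - a_im*b_im) and the second holds the imaginary part (a_re*b_im + a_im*b_re), so no separate subtraction step is needed.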
Abstract:
A computer system includes a multimedia input device that generates an audio or video input signal and a processor coupled to the multimedia input device. The system further includes a storage device coupled to the processor and having stored therein a signal processing routine for multiplying and accumulating input values representative of the audio or video input signal. The signal processing routine, when executed by the processor, causes the processor to perform several steps. These steps include performing a packed multiply-add on a first set of values packed into a first source and a second set of values packed into a second source, each representing input signals, to generate a packed intermediate result. The packed intermediate result is added to an accumulator to generate a packed accumulated result in the accumulator. These steps may be iterated, using the first set of values and portions of the second set of values, to build up the packed accumulated result in the accumulator. Subsequently, the packed accumulated result in the accumulator is unpacked into a first result and a second result, and the first result and the second result are added together to generate an accumulated result.
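The multiply-and-accumulate routine described above can be sketched in Python as follows. This is only an illustrative model: the four-element source width, the two-field accumulator, and the names `packed_multiply_add` and `multiply_accumulate` are assumptions for the example, and each iteration here simply consumes the next four values of both inputs.

```python
def packed_multiply_add(src1, src2):
    """Multiply four element pairs and add adjacent products (two result fields)."""
    products = [a * b for a, b in zip(src1, src2)]
    return [products[0] + products[1], products[2] + products[3]]


def multiply_accumulate(samples, coefficients):
    """Sum of samples[i] * coefficients[i], computed four elements per step."""
    assert len(samples) == len(coefficients) and len(samples) % 4 == 0
    accumulator = [0, 0]  # packed accumulator with two fields (assumed width)
    for i in range(0, len(samples), 4):
        # Packed multiply-add produces a packed intermediate result.
        intermediate = packed_multiply_add(samples[i:i + 4],
                                           coefficients[i:i + 4])
        # Packed add of the intermediate result into the accumulator.
        accumulator = [accumulator[0] + intermediate[0],
                       accumulator[1] + intermediate[1]]
    # Unpack the accumulator and add the two partial sums together.
    first_result, second_result = accumulator
    return first_result + second_result


if __name__ == "__main__":
    print(multiply_accumulate([1, 2, 3, 4], [5, 6, 7, 8]))
```

Keeping two partial sums in the accumulator until the end means the final scalar addition happens once, after the loop, rather than on every iteration.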
Abstract:
A processor has first and second storage areas holding first and second packed data, respectively. Each packed data includes first, second, third, and fourth data elements. A multiply-add circuit is coupled to the first and second storage areas. The multiply-add circuit includes first (810), second (811), third (812), and fourth (813) multipliers, wherein each of the multipliers receives a corresponding set of the data elements. The multiply-add circuit further includes a first adder (850) coupled to the first and second multipliers (810, 811), and a second adder (851) coupled to the third and fourth multipliers (812, 813). A third storage area (871) is coupled to the adders (850, 851). The third storage area (871) includes first and second fields for saving the outputs of the first and second adders (850, 851), respectively, as first and second data elements of a third packed data.
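The datapath above can be modeled stage by stage in Python, with variable names that follow the reference numerals. The element width (16-bit signed inputs, 32-bit result fields with wraparound) and the pairing of element i of the first packed data with element i of the second are assumptions for this sketch; the abstract states neither.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def multiply_add_circuit(packed1, packed2, element_bits=16):
    """Model four multipliers feeding two adders, writing two result fields."""
    assert len(packed1) == 4 and len(packed2) == 4
    result_bits = 2 * element_bits  # assumed: result fields are double width
    a = [to_signed(x, element_bits) for x in packed1]
    b = [to_signed(x, element_bits) for x in packed2]

    # Multipliers 810-813, each receiving one pair of elements (assumed pairing).
    p810 = a[0] * b[0]
    p811 = a[1] * b[1]
    p812 = a[2] * b[2]
    p813 = a[3] * b[3]

    # Adders 850 and 851; sums wrap to the assumed result field width.
    s850 = to_signed(p810 + p811, result_bits)
    s851 = to_signed(p812 + p813, result_bits)

    # Third storage area 871: two fields holding the adder outputs.
    return [s850, s851]


if __name__ == "__main__":
    print(multiply_add_circuit([1, 2, 3, 4], [5, 6, 7, 8]))
```

Under these assumed widths the only input that overflows a result field is the one where both products in a pair equal 2^30, which the wraparound in `to_signed` makes explicit.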
Abstract:
A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. In response to receiving an instruction, the processor performs operations on data elements in the first packed data and the second packed data to generate a third packed data. At least two of the data elements in the third packed data store the results of performing multiply-add operations on data elements in the first and second packed data.