CPC G06F 9/30036 (2013.01) [G06F 9/3001 (2013.01); G06F 9/30105 (2013.01)]
24 Claims
1. A processor comprising:
a hardware decoder to decode a first instruction to generate a decoded instruction;
a first source register to store a first plurality of packed real and imaginary data elements comprising a first plurality of complex numbers;
a second source register to store a second plurality of packed real and imaginary data elements comprising a second plurality of complex numbers, wherein each of the second plurality of complex numbers comprises a complex conjugate of a corresponding complex number of the first plurality of complex numbers; and
execution circuitry to execute the decoded instruction, the execution circuitry comprising:
at least one hardware multiplier to multiply selected real and imaginary data elements in the first source register and the second source register, wherein the at least one hardware multiplier is to multiply each selected imaginary data element in the first source register with a selected real data element in the second source register and to multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products;
at least one hardware adder to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result; and
accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result and to combine the second temporary result with second data from the destination register to generate a second final result and to store the first final result and the second final result back in the destination register.
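The data flow recited in claim 1 can be illustrated with a minimal software sketch. The sketch is not the claimed hardware, and several details are assumptions made only for illustration: the function name `imag_products_accumulate`, a layout of four complex numbers per source register stored as `[re0, im0, re1, im1, re2, im2, re3, im3]`, the pairing of two complex numbers per temporary result, the choice of imaginary-times-real products as the added subsets and real-times-imaginary products as the subtracted subsets, and plain integer arithmetic with no rounding or saturation.

```python
def imag_products_accumulate(src1, src2, dest):
    """Software model of the multiply, add/subtract, and accumulate steps.

    Assumed (hypothetical) layout: each source holds four complex numbers
    packed as [re0, im0, re1, im1, re2, im2, re3, im3]; dest holds two
    accumulator values. None of these sizes come from the claim itself.
    """
    if len(src1) != 8 or len(src2) != 8 or len(dest) != 2:
        raise ValueError("expected 8 elements per source and 2 in dest")

    final = []
    for half in range(2):            # one temporary result per register half
        temp = 0
        for k in range(2):           # two complex numbers feed each result
            base = 4 * half + 2 * k
            re1, im1 = src1[base], src1[base + 1]
            re2, im2 = src2[base], src2[base + 1]
            temp += im1 * re2        # imaginary (src1) x real (src2): added
            temp -= re1 * im2        # real (src1) x imaginary (src2): subtracted
        final.append(dest[half] + temp)  # combine with destination data
    return final


# The second source holds the complex conjugates of the first source.
src1 = [1, 2, 3, 4, 5, 6, 7, 8]
src2 = [1, -2, 3, -4, 5, -6, 7, -8]
result = imag_products_accumulate(src1, src2, [10, 20])
```

In this model each temporary result is the sum of the imaginary parts of the products of a first-register value with the conjugate of the paired second-register value, which is one plausible reading of the add and subtract subsets; an actual implementation may select and group the products differently.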