CPC G06F 17/16 (2013.01) | 18 Claims |
1. A processor comprising:
a decoder to decode a first complex matrix multiplication instruction including a first source operand to identify a first complex source matrix comprising a first plurality of 32-bit floating point (FP32) complex values, a second source operand to identify a second complex source matrix comprising a second plurality of FP32 complex values, and a first destination operand to identify an FP32 result matrix;
execution circuitry to execute the first complex matrix multiplication instruction, the execution circuitry comprising:
parallel multiplication circuitry to:
multiply 16-bit floating point (FP16) real values from the first plurality of FP32 complex values with corresponding FP16 real values from the second plurality of FP32 complex values to generate a first plurality of real products, and
multiply FP16 imaginary values from the first plurality of FP32 complex values with corresponding FP16 imaginary values from the second plurality of FP32 complex values to generate a second plurality of real products; and
addition/subtraction circuitry to subtract each real product in the second plurality of real products from a corresponding real product in the first plurality of real products using round-to-nearest-even (RNE) rounding to produce a corresponding real value in the result matrix;
wherein denormal FP16 real or imaginary values are not set to zero by the execution circuitry; and
wherein the execution circuitry does not raise or denote exceptions due to the first complex matrix multiplication instruction.
|
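The arithmetic recited in claim 1 can be illustrated with a minimal software sketch. It is not the claimed circuitry: the helper names (`to_fp16`, `to_fp32`, `real_parts`) are hypothetical. The inputs are assumed to be (real, imaginary) pairs that are already FP16-representable, and the result is assumed to be rounded to FP32. Accumulation across the inner matrix dimension is not recited in this claim and is omitted.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (ties to even).

    Subnormal (denormal) binary16 values are preserved, not flushed to zero.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 binary32 value (ties to even)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def real_parts(first, second):
    """Return the real part of each product of corresponding complex values.

    first, second: equal-length sequences of (real, imag) pairs.
    Assumption: every component is FP16-representable; to_fp16 only enforces that.
    """
    a = [(to_fp16(re), to_fp16(im)) for re, im in first]
    b = [(to_fp16(re), to_fp16(im)) for re, im in second]

    # First plurality of real products: real x real.
    # A product of two binary16 values is exact in a Python float (binary64).
    first_products = [x[0] * y[0] for x, y in zip(a, b)]
    # Second plurality of real products: imaginary x imaginary.
    second_products = [x[1] * y[1] for x, y in zip(a, b)]

    # Subtract the second product from the first, then round the difference
    # to FP32 with round-to-nearest-even.
    return [to_fp32(p - q) for p, q in zip(first_products, second_products)]
```

For example, `real_parts([(3.0, 2.0)], [(4.0, 5.0)])` computes 3·4 − 2·5. Plain Python float multiplication and subtraction do not raise on NaN or infinity results, which loosely mirrors the no-exceptions limitation. That is a property of this sketch rather than a model of the hardware behavior.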