CPC G06F 9/30014 (2013.01) [G06F 7/5443 (2013.01); G06F 9/30018 (2013.01); G06F 9/30036 (2013.01); G06F 9/30105 (2013.01); G06F 9/3818 (2013.01)]
20 Claims
1. An apparatus comprising:
decode circuitry to decode a single instruction, the single instruction having fields to indicate an opcode, a packed destination operand, a first packed source operand, and a second packed source operand, wherein elements of the destination are 32 bits in size and elements of the first source and the second source are 16 bits in size;
a register file having a plurality of packed data registers including registers for the destination and source operands; and
execution circuitry, coupled to the decode circuitry, the execution circuitry to perform operations corresponding to the instruction, including to, for each element position of the destination:
multiply a first element from the first source and a first element from the second source to generate a first result,
multiply a second element from the first source and a second element from the second source to generate a second result,
add the first result and the second result to generate a third result,
add the third result to an element from the element position of the destination to generate a fourth result, and
store the fourth result in the element position of the destination.
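
The per-element operation recited in claim 1 can be illustrated with a short software model. This is only a sketch under stated assumptions, not the claimed hardware: the function names `wrap_s32` and `dot_accumulate` are hypothetical, the source elements are treated as signed 16-bit integers, the "first" and "second" elements for destination position i are assumed to be source elements 2i and 2i+1, and the fourth result is assumed to wrap to 32 bits. The claim itself does not specify signedness, element pairing, or overflow behavior.

```python
def wrap_s32(value):
    """Wrap an integer to signed 32-bit two's complement.

    Assumption: the claim does not state how overflow is handled;
    wraparound is chosen here only to keep destination elements 32 bits wide.
    """
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def dot_accumulate(dest, src1, src2):
    """Model the claimed multiply-add-accumulate on Python lists.

    dest holds 32-bit elements; src1 and src2 hold 16-bit elements,
    so each source has twice as many elements as dest.
    """
    if len(src1) != 2 * len(dest) or len(src2) != 2 * len(dest):
        raise ValueError("each source needs two 16-bit elements per destination element")

    result = []
    for i, accumulator in enumerate(dest):
        # Assumed pairing: source elements 2i and 2i+1 feed destination position i.
        first = src1[2 * i] * src2[2 * i]            # first result
        second = src1[2 * i + 1] * src2[2 * i + 1]   # second result
        third = first + second                       # third result
        fourth = wrap_s32(third + accumulator)       # fourth result
        result.append(fourth)                        # stored at position i
    return result


if __name__ == "__main__":
    # (1*5 + 2*6) + 10 = 27 and (3*7 + 4*8) + (-5) = 48
    print(dot_accumulate([10, -5], [1, 2, 3, 4], [5, 6, 7, 8]))
```

Returning a new list rather than writing into `dest` keeps the model free of side effects; a register-level model would overwrite the destination in place, as the final step of the claim describes.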