| CPC G06F 9/30014 (2013.01) [G06F 7/5443 (2013.01); G06F 9/30036 (2013.01); G06F 9/30038 (2023.08); G06F 9/30145 (2013.01)] | 30 Claims | 

| 
               1. An apparatus comprising: 
            decoder circuitry to decode a single instruction, the single instruction to include fields for an opcode, an identification of location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a BF16 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand; and 
                execution circuitry to execute the decoded single instruction according to the opcode. 
               |
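As an illustration of the per-element operation recited in claim 1, the following is a minimal software sketch and not the claimed circuitry. The claim does not specify rounding mode, subnormal handling, or NaN behavior, so the sketch assumes round-to-nearest-even, keeps subnormals, and defers NaN and infinity cases to ordinary floating-point rules. The names `round_to_bf16`, `bf16_fma`, and `vfma_bf16` are hypothetical, as are the `order` strings, which are modeled on the 132/213/231 naming of existing x86 FMA instructions as one possible reading of "operand ordering". BF16 values are held as Python floats that happen to be BF16-representable.

```python
from fractions import Fraction
import math

BF16_MAX = math.ldexp(255.0, 120)   # largest finite BF16 value
BF16_MIN_QUANTUM_EXP = -133         # exponent of the smallest BF16 subnormal


def round_to_bf16(x: Fraction) -> float:
    """Round an exact rational to the nearest BF16 value, ties to even (assumed)."""
    if x == 0:
        return 0.0  # sign of zero is not modeled in this sketch
    _, e = math.frexp(float(x))            # 2**(e-1) <= |x| < 2**e
    q = max(e - 8, BF16_MIN_QUANTUM_EXP)   # BF16 keeps 8 significant bits
    n = round(x / Fraction(2) ** q)        # Fraction rounds half to even
    y = math.ldexp(n, q)
    if abs(y) > BF16_MAX:
        return math.copysign(math.inf, y)
    return y


def bf16_fma(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round once (the 'fused' property)."""
    if not all(math.isfinite(v) for v in (a, b, c)):
        return a * b + c  # NaN/infinity: defer to ordinary float rules
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return round_to_bf16(exact)


def vfma_bf16(src_dst, src2, src3, order="231"):
    """Per data element position, apply the FMA and write back to src_dst.

    order names which operands are multiplied and which is added
    (hypothetical encoding): "132" -> src1*src3 + src2,
    "213" -> src2*src1 + src3, "231" -> src2*src3 + src1.
    """
    if order not in ("132", "213", "231"):
        raise ValueError("unsupported operand ordering")
    for i, (s1, s2, s3) in enumerate(zip(src_dst, src2, src3)):
        operands = {"1": s1, "2": s2, "3": s3}
        a, b, c = (operands[d] for d in order)
        src_dst[i] = bf16_fma(a, b, c)
    return src_dst


if __name__ == "__main__":
    a, b, c = 1.5, 1.359375, 2.0 ** -100
    fused = bf16_fma(a, b, c)
    unfused = round_to_bf16(Fraction(round_to_bf16(Fraction(a * b)) + c))
    print(fused, unfused)
```

The example values in the `__main__` block are chosen so that the exact product falls halfway between two BF16 values. Rounding the product before the addition discards the small addend, whereas rounding only once after the addition lets it break the tie, which is the practical difference a fused operation makes.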