US 12,379,927 B2
BFLOAT16 scale and/or reduce instructions
Menachem Adelman, Haifa (IL); Alexander Heinecke, San Jose, CA (US); Robert Valentine, Kiryat Tivon (IL); Zeev Sperber, Zikhron Yaakov (IL); Amit Gradstein, Binyamina (IL); Mark Charney, Lexington, MA (US); Evangelos Georganas, San Mateo, CA (US); Dhiraj Kalamkar, Bangalore (IN); Christopher Hughes, Santa Clara, CA (US); and Cristina Anderson, Hillsboro, OR (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Aug. 31, 2021, as Appl. No. 17/463,382.
Prior Publication US 2023/0068781 A1, Mar. 2, 2023
Int. Cl. G06F 9/30 (2018.01)
CPC G06F 9/30145 (2013.01) [G06F 9/30036 (2013.01); G06F 9/30038 (2023.08); G06F 9/30101 (2013.01); G06F 9/30014 (2013.01)]
35 Claims
[OG exemplary drawing]
1. An apparatus comprising:
decode circuitry to decode an instance of a single instruction, the single instruction to include fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operands, a floating point scale operation of a BF16 data element of the first packed data source by multiplying the data element by a power of 2 value, wherein a value of an exponent of the power of 2 value is a floor value of a BF16 data element of the second packed data source, and store a result of the floating point scale operation into a corresponding data element position of the packed data destination operand; and
the execution circuitry to execute the decoded instruction according to the opcode.
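
The per-element arithmetic recited in claim 1 can be illustrated with a minimal software sketch. It is not the claimed circuitry and is not taken from the patent: the function names (`bf16_to_float`, `float_to_bf16`, `scale_bf16`) are hypothetical, BF16 is modeled as the upper 16 bits of an IEEE 754 binary32 value, and round-to-nearest-even on conversion is an assumption. NaN, infinity, overflow, and denormal handling are not modeled.

```python
import math
import struct


def bf16_to_float(bits: int) -> float:
    """Widen a 16-bit BF16 pattern to a Python float (via binary32)."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16(value: float) -> int:
    """Narrow a finite float to a BF16 bit pattern.

    Assumption: round to nearest even on the 16 discarded bits. The claim
    does not state a rounding mode.
    """
    (u,) = struct.unpack("<I", struct.pack("<f", value))
    u += 0x7FFF + ((u >> 16) & 1)
    return (u >> 16) & 0xFFFF


def scale_bf16(src1: list[int], src2: list[int]) -> list[int]:
    """Hypothetical model of the scale operation, finite inputs only.

    For each data element position i:
        dst[i] = src1[i] * 2 ** floor(src2[i])
    """
    dst = []
    for a_bits, b_bits in zip(src1, src2):
        a = bf16_to_float(a_bits)
        exponent = math.floor(bf16_to_float(b_bits))  # floor of the second source
        dst.append(float_to_bf16(math.ldexp(a, exponent)))  # a * 2**exponent
    return dst


# Three-element example; 2.7 becomes 2.703125 in BF16, whose floor is 2.
first = [float_to_bf16(x) for x in (1.5, 8.0, -3.0)]
second = [float_to_bf16(x) for x in (2.7, -0.5, 0.0)]
print([bf16_to_float(b) for b in scale_bf16(first, second)])  # [6.0, 4.0, -3.0]
```

Note that the floor is taken toward negative infinity, so an exponent element of -0.5 yields a scale factor of 2^-1 rather than 2^0.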