US 11,748,098 B2
Adler assist instructions
Ali Sazegari, Los Altos, CA (US); and Chris Cheng-Chieh Lee, San Jose, CA (US)
Assigned to Apple Inc., Cupertino, CA (US)
Filed by Apple Inc., Cupertino, CA (US)
Filed on May 5, 2021, as Appl. No. 17/308,738.
Prior Publication US 2022/0357947 A1, Nov. 10, 2022
Int. Cl. G06F 9/30 (2018.01); G06F 11/10 (2006.01)

CPC G06F 9/30036 (2013.01) [G06F 11/1004 (2013.01)]

15 Claims

1. A processor comprising:

a register file comprising a plurality of vector registers; and

an execution core coupled to the register file, wherein the execution core is configured to execute a set of checksum instructions, wherein a first checksum instruction from the set of checksum instructions to specify a first vector operand, a second vector operand, and a result vector operand, wherein the first vector operand is in a first vector register of the plurality of vector registers, the second vector operand is in a second register of the plurality of vector registers, and the result vector operand is to be written to a third vector register of the plurality of vector registers, wherein to execute the first checksum instruction, the execution core is further configured to:

accumulate bytes from the first vector operand and the second vector operand into a first portion of the result vector operand and add the accumulated bytes from the first vector operand and the second vector operand to a previously generated value for a second portion of the result vector operand to generate the second portion written to the result vector operand, and wherein the execution core is further configured to:

execute the first checksum instruction consecutively with a block from a set of consecutive blocks from a data source, wherein each block from the set of consecutive blocks has a defined number of bytes, and wherein the execution of the first checksum instruction with the block generates the result vector operand; and

execute a second checksum instruction, wherein the execution core is further configured to:

specify a third vector operand, a fourth vector operand, and a second result vector operand, wherein the third vector operand is in a fourth vector register of the plurality of vector registers, the fourth vector operand is in a fifth register, and the second result vector operand is to be written to a sixth vector register of the plurality of vector registers, wherein the execution core is configured to:

multiply a first portion of vector elements of a third vector operand by at least one vector element of a fourth vector operand to generate a vector written to the second result vector operand;

shift the second result vector operand by a defined value, wherein the defined value is computed to prevent overflow; and

multiply the second result vector operand by at least one vector element of the fourth vector operand to generate a subtraction value and subtracting the second result vector operand by the subtraction value.
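
For illustration only, the two operations recited in claim 1 can be modeled in scalar Python, assuming the checksum is standard Adler-32 as the title suggests. This is a sketch under stated assumptions and not the claimed hardware behavior. The function names, the 32-byte block size (chosen to stand in for two 16-byte vector operands), the shift amount of 32, and the fixed-point reciprocal constant are all hypothetical choices that the claim does not specify. The first function accumulates the bytes of a block into a low sum and folds that accumulation into a running high sum. The second performs a modular reduction as multiply, shift, multiply, subtract.

```python
import zlib

MOD_ADLER = 65521                        # Adler-32 modulus (largest prime below 2**16)
SHIFT = 32                               # assumed shift amount for this sketch
RECIPROCAL = (1 << SHIFT) // MOD_ADLER   # assumed fixed-point reciprocal of the modulus


def accumulate_block(a, b, block):
    """Fold one block of bytes into the running sums (a, b) without reducing them."""
    n = len(block)
    byte_sum = sum(block)
    # The byte at index i is included in (n - i) of the intermediate 'a' values.
    weighted_sum = sum((n - i) * byte for i, byte in enumerate(block))
    return a + byte_sum, b + n * a + weighted_sum


def reduce_mod_adler(x):
    """Reduce x (assumed below 2**SHIFT) modulo 65521: multiply, shift, multiply, subtract."""
    quotient = (x * RECIPROCAL) >> SHIFT     # estimate of x // MOD_ADLER
    remainder = x - quotient * MOD_ADLER     # subtract the estimated multiple
    # The estimate can be one too small, so a single correction may be needed.
    if remainder >= MOD_ADLER:
        remainder -= MOD_ADLER
    return remainder


def adler32_blocks(data, block_size=32):
    """Compute Adler-32 by processing consecutive fixed-size blocks."""
    a, b = 1, 0
    for start in range(0, len(data), block_size):
        a, b = accumulate_block(a, b, data[start:start + block_size])
        a, b = reduce_mod_adler(a), reduce_mod_adler(b)
    return (b << 16) | a


if __name__ == "__main__":
    sample = bytes(range(256)) * 10
    # Cross-check the sketch against the reference implementation in zlib.
    assert adler32_blocks(sample) == zlib.adler32(sample)
```

Reducing after every block keeps both sums far below 2**32 in this model. An implementation could defer the reduction across several blocks, provided the accumulators cannot overflow their element width in between.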