US 11,941,397 B1
Machine instructions for decoding acceleration including fuse input instructions to fuse multiple JPEG data blocks together to take advantage of a full SIMD width of a processor
Xiaodan Tan, Mountain View, CA (US); and Paul Gilbert Meyer, Jericho, VT (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on May 31, 2022, as Appl. No. 17/804,796.
Int. Cl. G06F 9/30 (2018.01); G06F 9/38 (2018.01)
CPC G06F 9/30101 (2013.01) [G06F 9/3001 (2013.01); G06F 9/30032 (2013.01); G06F 9/30036 (2013.01); G06F 9/30043 (2013.01); G06F 9/3887 (2013.01)]
20 Claims
 
5. A computer-implemented method comprising:
retrieving, by an execution unit of a processor, a set of machine instructions from an instruction queue of the processor; and
executing each of the machine instructions on the processor, wherein the set of machine instructions includes a fuse input instruction (FI), the fuse input instruction having a first FI input vector, a second FI input vector, and a FI select input to generate a first FI output vector and a second FI output vector,
wherein the fuse input instruction selects a portion of the first FI input vector and a portion of the second FI input vector based on the FI select input, sign extends the selected portion of the first FI input vector and the selected portion of the second FI input vector, and shuffles data elements of the sign extended portion of the first FI input vector with data elements of the sign extended portion of the second FI input vector to generate the first and second FI output vectors.
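The select, sign-extend, and shuffle steps recited in claim 5 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware behavior: the claim does not fix the lane widths, which portion the select input picks, or the shuffle pattern. Here it is assumed that each input vector holds 8-bit lanes, that the FI select input chooses the low half (0) or high half (1) of each input, that lanes are sign extended to 16 bits, and that the shuffle is a pairwise interleave. The names `fuse_input` and `sign_extend_8_to_16` are hypothetical.

```python
def sign_extend_8_to_16(byte):
    """Treat an 8-bit pattern as two's complement and return its signed value.

    The result fits in a 16-bit signed lane (assumed output lane width).
    """
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def fuse_input(vec_a, vec_b, select):
    """Software model of a fuse input (FI) operation, under the assumptions above.

    vec_a, vec_b: equal-length lists of 8-bit lane values (assumed layout).
    select: 0 picks the low half of each input, 1 picks the high half (assumed).
    Returns two output vectors of sign-extended lanes.
    """
    if len(vec_a) != len(vec_b) or len(vec_a) % 2 != 0:
        raise ValueError("inputs must have the same even number of lanes")

    half = len(vec_a) // 2
    start = half if select else 0

    # Step 1: select a portion of each input vector based on the select input.
    part_a = vec_a[start:start + half]
    part_b = vec_b[start:start + half]

    # Step 2: sign extend each selected lane.
    ext_a = [sign_extend_8_to_16(x) for x in part_a]
    ext_b = [sign_extend_8_to_16(x) for x in part_b]

    # Step 3: shuffle (here: interleave) lanes from the two inputs.
    interleaved = [lane for pair in zip(ext_a, ext_b) for lane in pair]

    # The widened, fused data spans two output vectors.
    return interleaved[:half], interleaved[half:]


if __name__ == "__main__":
    a = [0x01, 0xFF, 0x7F, 0x80, 5, 6, 7, 8]
    b = [0x10, 0xF0, 0x00, 0x81, 9, 10, 11, 12]
    print(fuse_input(a, b, 0))
    print(fuse_input(a, b, 1))
```

In this model, widening the lanes doubles the bit width of each selected half, so interleaving the two inputs yields exactly two output vectors of the same total width as the inputs. That matches the two-output form of the claim, but the real instruction may use a different lane arrangement.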