US 12,299,412 B2
Multiple-input floating-point processing with mantissa bit extension
Thomas Ferrere, Hertfordshire (GB)
Assigned to Imagination Technologies Limited, Kings Langley (GB)
Filed by Imagination Technologies Limited, Kings Langley (GB)
Filed on Aug. 17, 2021, as Appl. No. 17/404,868.
Claims priority of application No. 2012832 (GB), filed on Aug. 17, 2020.
Prior Publication US 2022/0050665 A1, Feb. 17, 2022
Int. Cl. G06F 7/485 (2006.01); G06F 5/01 (2006.01); G06F 7/499 (2006.01)

CPC G06F 7/485 (2013.01) [G06F 5/012 (2013.01); G06F 7/49915 (2013.01); G06F 7/49936 (2013.01)]

19 Claims

1. A method of processing a set of ‘k’ floating-point numbers to perform addition and/or subtraction, k≥3, using a hardware logic implementation, each floating-point number comprising a mantissa (m_i) and an exponent (e_i), wherein the method comprises:

receiving, by a format conversion circuit in the hardware logic implementation, the set of ‘k’ floating-point numbers in a first format, each floating-point number in the first format comprising a mantissa (m_i) with a bit-length of ‘b’ bits;

creating, by the format conversion circuit in the hardware logic implementation, a set of ‘k’ numbers (y_i) based on the mantissas of the ‘k’ floating-point numbers, the numbers (y_i) having a bit-length of ‘n’ bits obtained by adding both extra most-significant bits and extra least-significant bits to the bit-length ‘b’ of the mantissa (m_i), wherein the ‘n’ bits comprises a number of magnitude bits, wherein ‘n’ is b+⌈log₂(k)⌉+⌈log₂(k−1)⌉+x bits, where x is an integer, and x≥1 and x≤3, wherein adding the extra most-significant bits comprises adding at least ⌈log₂(k)⌉ number of most-significant bits;

identifying, by a maximum exponent detection circuit in the hardware logic implementation, a maximum exponent (e_max) among the exponents e_i;

aligning, by an alignment circuit in the hardware logic implementation, the magnitude bits of the numbers (y_i) based on the maximum exponent (e_max); and

processing, by a processing circuit in the hardware logic implementation, the set of ‘k’ numbers (y_i) concurrently;

wherein processing of the set of ‘k’ numbers (y_i) by the processing circuit involves performing a computation that is required to perform a data processing function.
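
The steps recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the function names `extended_width` and `aligned_sum` are hypothetical, each input is modelled as a (sign, mantissa, exponent) tuple with an unsigned b-bit integer mantissa, one of the n bits is assumed to hold the sign, exactly ⌈log₂(k)⌉ bits are assumed to be added as most-significant headroom, and all remaining extra bits are assumed to go to the least-significant end. The claim itself fixes only the total width n and the minimum number of extra most-significant bits. Alignment is modelled as a truncating right shift, and the "concurrent" processing is modelled as a plain loop.

```python
def extended_width(b: int, k: int, x: int = 1) -> int:
    """Bit-length n = b + ceil(log2(k)) + ceil(log2(k-1)) + x from the claim."""
    if k < 3 or not 1 <= x <= 3:
        raise ValueError("claim requires k >= 3 and 1 <= x <= 3")
    # For integers j >= 1, ceil(log2(j)) equals (j - 1).bit_length().
    return b + (k - 1).bit_length() + (k - 2).bit_length() + x


def aligned_sum(terms, b: int, x: int = 1):
    """Sum k terms given as (sign, mantissa, exponent) tuples.

    Returns (total, scale) where the result is approximately
    total * 2**scale. The bit split below is an assumption.
    """
    k = len(terms)
    n = extended_width(b, k, x)
    msb_extra = (k - 1).bit_length()       # headroom so k addends cannot overflow
    lsb_extra = n - b - msb_extra - 1      # assumption: one bit reserved for sign

    # Maximum exponent detection.
    e_max = max(e for _, _, e in terms)

    total = 0
    for sign, m, e in terms:
        y = m << lsb_extra                 # extend mantissa with extra LSBs
        y >>= e_max - e                    # align to e_max (truncating shift)
        total += -y if sign else y

    # The magnitude fits in the n - 1 magnitude bits under these assumptions.
    assert abs(total) < 1 << (n - 1)
    return total, e_max - lsb_extra
```

For example, with b = 4 and k = 3 the model gives n = 8, and `aligned_sum([(0, 0b1100, 3), (0, 0b1000, 1), (1, 0b1010, 0)], b=4)` aligns all three mantissas to the exponent 3 before adding them. Because bits shifted past the extra least-significant bits are discarded, the result can differ slightly from the exact sum.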