US 12,487,903 B2
Automatic generation of computation kernels for approximating elementary functions
Daniel Khankin, Beer Sheva (IL)
Assigned to Next Silicon Ltd, Givatayim (IL)
Filed by Next Silicon Ltd, Givatayim (IL)
Filed on May 2, 2024, as Appl. No. 18/652,846.
Application 18/652,846 is a continuation of application No. 17/569,566, filed on Jan. 6, 2022, granted, now 12,001,311.
Prior Publication US 2024/0289248 A1, Aug. 29, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 11/34 (2006.01); G06F 11/30 (2006.01); G06F 17/17 (2006.01)
CPC G06F 11/3433 (2013.01) [G06F 11/3089 (2013.01); G06F 11/3452 (2013.01); G06F 17/17 (2013.01)]
22 Claims
 
1. An apparatus for computing functions using polynomial-based approximation, comprising:
at least one hardware processing circuitry configured for:
computing a polynomial-based approximant approximating a function by executing at least one iteration, comprising:
computing the polynomial-based approximant by using at least one scaled fixed-point unit of said at least one hardware processing circuitry, and without using floating-point units of said at least one hardware processing circuitry, according to a constructed set of coefficients,
minimizing an approximation error of the computed polynomial-based approximant compared to the function while complying with a constraint of a hardware utilization of the at least one hardware processing circuitry and complying with at least one additional constraint selected from a group comprising at least: an accuracy defined by said approximation error, a compute graph size, and a computation complexity,
adjusting at least one of the coefficients of the set of coefficients in case the polynomial-based approximant is incompliant with the constraint and with the at least one additional constraint and initiating another iteration;
outputting the polynomial-based approximant with the adjusted set of coefficients for which the computed polynomial-based approximant complies with the constraint and with each of the at least one additional constraint to at least one hardware processing circuitry; and
configuring said at least one hardware processing circuitry to approximate the function by computing the polynomial-based approximant using the adjusted set of coefficients;
wherein the at least one hardware processing circuitry comprises a plurality of logical hardware elements realizing the at least one scaled fixed-point unit to provide improved efficiency in computing the polynomial-based approximant with reduced computational complexity compared to floating-point implementations.
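The iteration recited in claim 1 can be illustrated with a minimal sketch. It is not the patented method. It assumes a Q16 fixed-point format, a Horner evaluation built from integer multiply, shift, and add, a multiplier count as a stand-in for the hardware utilization constraint, and a sampled maximum error as the accuracy constraint. The coefficient adjustment is a simple coordinate search with step halving, chosen only for brevity. All names (`FRAC_BITS`, `to_fixed`, `horner_fixed`, `max_error`, `refine`, `generate_kernel`) are hypothetical. Floating point appears only in the offline error measurement against the reference function, never in the evaluated kernel.

```python
import math

FRAC_BITS = 16              # assumed Q-format: 16 fractional bits
ONE = 1 << FRAC_BITS        # fixed-point representation of 1.0


def to_fixed(value):
    """Quantize a real value to the assumed scaled fixed-point format."""
    return int(round(value * ONE))


def horner_fixed(coeffs, x_fx):
    """Evaluate the polynomial using integer multiply, shift, and add only."""
    acc = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        acc = ((acc * x_fx) >> FRAC_BITS) + c
    return acc


def max_error(coeffs, func, grid_fx):
    """Largest sampled deviation from the reference (floats used offline only)."""
    return max(abs(horner_fixed(coeffs, x) / ONE - func(x / ONE)) for x in grid_fx)


def refine(coeffs, func, grid_fx, tol, max_sweeps=8):
    """Adjust coefficients one at a time, halving the step when progress stalls."""
    err = max_error(coeffs, func, grid_fx)
    step = 1 << 10
    while step > 0 and err > tol:
        for _ in range(max_sweeps):
            improved = False
            for i in range(len(coeffs)):
                for delta in (step, -step):
                    trial = list(coeffs)
                    trial[i] += delta
                    trial_err = max_error(trial, func, grid_fx)
                    if trial_err < err:          # keep only adjustments that help
                        coeffs, err, improved = trial, trial_err, True
            if not improved or err <= tol:
                break
        step >>= 1
    return coeffs, err


def generate_kernel(func, start_coeffs, tol, max_multipliers):
    """Return (coeffs, err) meeting both constraints, or None if none is found."""
    grid_fx = [to_fixed(i / 32) for i in range(33)]   # samples on [0, 1]
    # Horner uses one multiply per degree, so the degree models the hardware budget.
    for degree in range(1, max_multipliers + 1):
        coeffs = [to_fixed(c) for c in start_coeffs[:degree + 1]]
        coeffs, err = refine(coeffs, func, grid_fx, tol)
        if err <= tol:
            return coeffs, err
    return None


# Example: approximate exp(x) on [0, 1], starting from Taylor coefficients.
taylor_exp = [1.0 / math.factorial(k) for k in range(7)]
result = generate_kernel(math.exp, taylor_exp, tol=1e-3, max_multipliers=6)
```

When `result` is not `None`, it holds the integer coefficient set and its sampled error; `horner_fixed(result[0], to_fixed(0.5)) / ONE` then gives the fixed-point estimate of exp(0.5). Trying the lowest degree first is a design choice of this sketch, so the first compliant coefficient set it returns also uses the fewest multipliers.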