US 12,235,931 B2
Methods for training and analysing input data using a machine learning model
Adrian Bulat, Staines (GB); and Georgios Tzimiropoulos, Staines (GB)
Assigned to SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Apr. 25, 2022, as Appl. No. 17/728,281.
Application 17/728,281 is a continuation of application No. PCT/KR2022/001223, filed on Jan. 24, 2022.
Claims priority of application No. 20210100140 (GR), filed on Mar. 8, 2021; and application No. 2116856 (GB), filed on Nov. 23, 2021.
Prior Publication US 2022/0284240 A1, Sep. 8, 2022
Int. Cl. G06N 3/04 (2023.01); G06F 18/214 (2023.01); G06N 3/08 (2023.01)

CPC G06F 18/2148 (2023.01) [G06N 3/04 (2013.01); G06N 3/08 (2013.01)]

9 Claims

1. A computer-implemented method for analysing input data on a device using a trained machine learning, ML, model comprising a plurality of neural network layers, the method comprising:

receiving at least one input data item for analysis;

independently selecting a quantisation level for each of the plurality of neural network layers at runtime;

analysing the received input data item using the selected quantisation levels;

storing first configuration data comprising a selection of quantisation levels;

generating a plurality of items of second configuration data from the first configuration data by introducing noise into the first configuration data;

calculating a latency associated with each of the plurality of items of second configuration data; and

selecting an item of second configuration data of the plurality of items of second configuration data having a lowest latency,

wherein the trained ML model comprises a transitional batch-normalisation layer disposed between a first neural network layer and a second neural network layer of the plurality of neural network layers,

wherein the transitional batch-normalisation layer is configured to compensate for a change in feature distribution between a quantisation level of the first neural network layer and a quantisation level of the second neural network layer.
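
For illustration only, the configuration-selection steps recited in claim 1 (storing first configuration data, generating noisy second configuration data, calculating a latency for each item, and selecting the item with the lowest latency) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the set of bit-widths in `LEVELS`, the function names `perturb`, `estimate_latency` and `search`, the per-layer resampling used as "noise", and the linear latency model are all hypothetical and do not come from the claim. On a real device the latency would more plausibly be measured or looked up than computed from a formula.

```python
import random

# Hypothetical quantisation levels (bit-widths); the claim does not list any.
LEVELS = (2, 4, 8)


def perturb(config, flip_prob, rng):
    """Return a noisy copy of `config`.

    Assumed noise model: each layer's level is independently re-drawn
    from LEVELS with probability `flip_prob`, otherwise kept as-is.
    """
    return [
        rng.choice(LEVELS) if rng.random() < flip_prob else bits
        for bits in config
    ]


def estimate_latency(config, layer_costs):
    """Toy latency model (assumption): each layer's cost scales
    linearly with the bit-width selected for that layer."""
    return sum(cost * bits for cost, bits in zip(layer_costs, config))


def search(first_config, layer_costs, n_candidates=16, flip_prob=0.3, seed=0):
    """Generate noisy candidates from `first_config` and pick the fastest."""
    rng = random.Random(seed)
    # Items of "second configuration data": noisy variants of the first.
    candidates = [
        perturb(first_config, flip_prob, rng) for _ in range(n_candidates)
    ]
    # Latency associated with each candidate.
    latencies = [estimate_latency(c, layer_costs) for c in candidates]
    # Index of the candidate with the lowest latency.
    best = min(range(n_candidates), key=latencies.__getitem__)
    return candidates[best], latencies[best], candidates, latencies


# One quantisation level per layer, selected independently.
first_config = [8, 8, 8, 8]
layer_costs = [1.0, 2.5, 2.5, 0.5]  # hypothetical relative layer costs
best_config, best_latency, candidates, latencies = search(first_config, layer_costs)
```

In this sketch a configuration is simply a list holding one quantisation level per neural network layer, which mirrors the independent per-layer selection recited in the claim. The transitional batch-normalisation layer is not modelled here.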