US 11,860,975 B2
Allocating computing resources during continuous retraining
Ganesh Ananthanarayanan, Seattle, WA (US); Yuanchao Shu, Kirkland, WA (US); Tsu-wang Hsieh, Redmond, WA (US); Nikolaos Karianakis, Sammamish, WA (US); Paramvir Bahl, Bellevue, WA (US); and Romil Bhardwaj, Berkeley, CA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Sep. 20, 2022, as Appl. No. 17/948,736.
Application 17/948,736 is a continuation of application No. 17/124,172, filed on Dec. 16, 2020, granted, now 11,461,591.
Prior Publication US 2023/0030499 A1, Feb. 2, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/50 (2006.01); G06N 20/00 (2019.01); G06F 18/214 (2023.01); G06V 20/58 (2022.01)

CPC G06F 18/2148 (2023.01) [G06F 9/5038 (2013.01); G06N 20/00 (2019.01); G06V 20/58 (2022.01)]

20 Claims

17. A computing device configured to be located at a network edge between a local network and a cloud service, the computing device comprising:

a processor; and

a memory storing instructions executable by the processor to perform continuous retraining and operation of a machine learning model configured to analyze one or more video streams, the continuous retraining and operation comprising a plurality of jobs including, for each video stream of the one or more video streams, a retraining job and an inference job, wherein the instructions are executable to:

receive a video stream;

during a retraining window, obtain a labeled retraining data set for a selected portion of the video stream comprising labels for one or more objects identified in the selected portion of the video stream;

select one or more of a configuration for the machine learning model and a computing resource allocation for the plurality of jobs by testing one or more of a plurality of configurations and a plurality of computing resource allocations using an average inference accuracy over the retraining window as a testing metric; and

operate the machine learning model using the one or more of the configuration and the computing resource allocation selected.
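
The selection step recited above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a single video stream, one shared GPU, and a toy cost model. Every name in it is hypothetical (RetrainConfig, average_accuracy, select, the inference_need parameter), as are all the numbers. Each candidate pair of retraining configuration and resource allocation is scored by its time-weighted average inference accuracy over the retraining window, and the highest-scoring pair is kept.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RetrainConfig:
    """Hypothetical retraining configuration with assumed, pre-estimated costs."""
    name: str
    gpu_seconds: float     # retraining work, in GPU-seconds (assumed known)
    accuracy_after: float  # estimated inference accuracy once retraining completes


def average_accuracy(config, retrain_share, window_s, accuracy_before,
                     inference_need=0.5):
    """Time-weighted average inference accuracy over one retraining window.

    Toy assumptions: the inference job needs `inference_need` of the GPU to run
    at full accuracy and degrades proportionally below that; once retraining
    completes, the inference job gets the whole GPU back.
    """
    if retrain_share <= 0:
        return accuracy_before  # no retraining: the old model runs all window
    inference_share = 1.0 - retrain_share
    during = accuracy_before * min(1.0, inference_share / inference_need)
    retrain_time = config.gpu_seconds / retrain_share
    if retrain_time >= window_s:
        return during  # retraining never finishes inside the window
    after_time = window_s - retrain_time
    return (retrain_time * during + after_time * config.accuracy_after) / window_s


def select(configs, retrain_shares, window_s, accuracy_before):
    """Test every (configuration, allocation) pair and keep the best-scoring one."""
    return max(
        product(configs, retrain_shares),
        key=lambda pair: average_accuracy(pair[0], pair[1], window_s, accuracy_before),
    )


# Illustrative values only; none of these numbers come from the patent.
configs = [
    RetrainConfig("cheap", gpu_seconds=20.0, accuracy_after=0.70),
    RetrainConfig("expensive", gpu_seconds=60.0, accuracy_after=0.90),
]
shares = [0.25, 0.5, 0.75]
best_config, best_share = select(configs, shares, window_s=200.0, accuracy_before=0.6)
best_score = average_accuracy(best_config, best_share, 200.0, 0.6)
```

In this toy model, giving retraining too little of the GPU means the improved model arrives late or not at all, while giving it too much starves the inference job during retraining. Averaging accuracy over the whole window, rather than looking only at the post-retraining accuracy, is what exposes that trade-off.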