US 12,380,879 B2
Distilling to a target device based on observed query patterns
Matthew Sharifi, Kilchberg (CH); and Victor Carbune, Zürich (CH)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on May 9, 2024, as Appl. No. 18/659,224.
Application 18/659,224 is a continuation of application No. 17/644,427, filed on Dec. 15, 2021, granted, now 11,990,121, issued on May 21, 2024.
Claims priority of provisional application 63/262,465, filed on Oct. 13, 2021.
Prior Publication US 2024/0290324 A1, Aug. 29, 2024
Int. Cl. G10L 15/06 (2013.01); G10L 15/01 (2013.01); G10L 15/065 (2013.01); G10L 15/18 (2013.01); G10L 15/26 (2006.01); G10L 15/30 (2013.01)
CPC G10L 15/065 (2013.01) [G10L 15/01 (2013.01); G10L 15/063 (2013.01); G10L 15/18 (2013.01); G10L 15/26 (2013.01); G10L 15/30 (2013.01)]
20 Claims
 
11. A system comprising:
data processing hardware; and
memory hardware in communication with the data processing hardware and storing instructions that when executed on the data processing hardware cause the data processing hardware to perform the operations comprising:
receiving a distilled model identified for execution on a target client device, the distilled model related to a corresponding cloud-based model;
receiving at least one of memory constraints or processing constraints of the target client device;
selecting a model configuration for the distilled model based on the at least one of the memory constraints or the processing constraints of the target client device;
processing, using the distilled model having the selected model configuration, an evaluation data set to generate evaluation results indicating an accuracy of the distilled model; and
based on the evaluation results indicating the accuracy of the distilled model, deploying the distilled model having the selected model configuration to the target client device.
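
The operations recited in claim 11 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (ModelConfig, DeviceConstraints, select_config, evaluate, maybe_deploy) is hypothetical. The "largest configuration that fits" selection rule and the min_accuracy threshold are assumptions made for illustration. The claim only requires that selection be based on the constraints and that deployment be based on the evaluation results.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical configuration of a distilled model."""
    name: str
    memory_mb: int   # assumed memory footprint at inference time
    gflops: float    # assumed compute needed per query


@dataclass(frozen=True)
class DeviceConstraints:
    """Hypothetical constraints of the target client device.

    Either field may be None, mirroring "at least one of memory
    constraints or processing constraints" in the claim.
    """
    memory_mb: Optional[int] = None
    gflops: Optional[float] = None


def select_config(
    configs: Sequence[ModelConfig], constraints: DeviceConstraints
) -> Optional[ModelConfig]:
    """Pick the largest configuration that fits the known constraints.

    "Largest that fits" is an assumed policy, not taken from the claim.
    """
    fitting = [
        c for c in configs
        if (constraints.memory_mb is None or c.memory_mb <= constraints.memory_mb)
        and (constraints.gflops is None or c.gflops <= constraints.gflops)
    ]
    return max(fitting, key=lambda c: c.memory_mb, default=None)


def evaluate(
    predict: Callable[[str], str], eval_set: Sequence[Tuple[str, str]]
) -> float:
    """Return the fraction of evaluation examples predicted correctly."""
    if not eval_set:
        return 0.0
    correct = sum(1 for query, expected in eval_set if predict(query) == expected)
    return correct / len(eval_set)


def maybe_deploy(
    configs: Sequence[ModelConfig],
    constraints: DeviceConstraints,
    build_predictor: Callable[[ModelConfig], Callable[[str], str]],
    eval_set: Sequence[Tuple[str, str]],
    min_accuracy: float = 0.9,  # assumed threshold; the claim names none
) -> Optional[ModelConfig]:
    """Select a configuration, evaluate it, and return it if it may be deployed."""
    config = select_config(configs, constraints)
    if config is None:
        return None
    accuracy = evaluate(build_predictor(config), eval_set)
    return config if accuracy >= min_accuracy else None
```

In this sketch, build_predictor stands in for instantiating the distilled model with the selected configuration. A return value of None means that no configuration fit the device or that the measured accuracy fell below the assumed threshold, so nothing is deployed.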