US 11,789,753 B2
Machine-learned models for user interface prediction, generation, and interaction understanding
Srinivas Kumar Sunkara, Mountain View, CA (US); Xiaoxue Zang, Santa Clara, CA (US); Ying Xu, Bellevue, WA (US); Lijuan Liu, San Jose, CA (US); Nevan Holt Wichers, Mountain View, CA (US); Gabriel Overholt Schubiner, New York, NY (US); Jindong Chen, Hillsborough, CA (US); Abhinav Kumar Rastogi, Sunnyvale, CA (US); Blaise Aguera-Arcas, Seattle, WA (US); and Zecheng He, Princeton, NJ (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Jun. 1, 2021, as Appl. No. 17/335,596.
Prior Publication US 2022/0382565 A1, Dec. 1, 2022
Int. Cl. G06F 9/44 (2018.01); G06F 9/451 (2018.01); G06N 20/00 (2019.01); G06F 18/214 (2023.01); G06F 18/2135 (2023.01); G06N 3/045 (2023.01)
CPC G06F 9/451 (2018.02) [G06F 18/214 (2023.01); G06F 18/21355 (2023.01); G06N 3/045 (2023.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A computer-implemented method for training and utilization of machine-learned models for user interface interaction understanding, comprising:
obtaining, by a computing system comprising one or more computing devices, interface data that comprises a sequence of two or more user interfaces obtained through performance of one or more user interactions which result in generation of the sequence of two or more user interfaces, wherein, for each user interface in the sequence of two or more user interfaces, the interface data comprises one or more interface images depicting the user interface;
determining, by the computing system, a plurality of intermediate embeddings based at least in part on the interface data;
processing, by the computing system, the plurality of intermediate embeddings with a machine-learned interface prediction model to obtain one or more user interface embeddings; and
performing, by the computing system, a pre-training task based at least in part on the one or more user interface embeddings to obtain a pre-training output.
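
The four steps recited in claim 1 can be outlined in code. The following is a minimal, illustrative sketch only and is not the claimed or any actual implementation. Every name in it (UserInterface, make_projection, intermediate_embeddings, interface_prediction_model, pretraining_task, EMBED_DIM) is hypothetical. The random linear projections stand in for learned parameters, the mean-mixing layer is a crude substitute for whatever model architecture is actually used, and the pre-training task shown (scoring whether adjacent interfaces are consecutive) is an assumption, since the claim does not name a specific task.

```python
import math
import random
from dataclasses import dataclass
from typing import List

EMBED_DIM = 4  # assumed embedding width, chosen only for illustration


@dataclass
class UserInterface:
    """Hypothetical container for one user interface in the sequence.

    Each interface image is modeled as a flat list of pixel intensities.
    """
    images: List[List[float]]


def make_projection(in_dim: int, out_dim: int, seed: int) -> List[List[float]]:
    """Random weight matrix standing in for learned parameters."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def project(matrix: List[List[float]], vector: List[float]) -> List[float]:
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def intermediate_embeddings(sequence, image_proj):
    """Step 2: one intermediate embedding per interface image."""
    return [project(image_proj, img) for ui in sequence for img in ui.images]


def interface_prediction_model(embeddings, model_proj):
    """Step 3: stand-in for the machine-learned interface prediction model.

    Mixes each embedding with the mean of all embeddings, then applies a
    linear layer followed by tanh.
    """
    n = len(embeddings)
    mean = [sum(e[i] for e in embeddings) / n for i in range(EMBED_DIM)]
    mixed = [[0.5 * (a + m) for a, m in zip(e, mean)] for e in embeddings]
    return [[math.tanh(v) for v in project(model_proj, e)] for e in mixed]


def pretraining_task(ui_embeddings):
    """Step 4 (assumed task): score each adjacent pair as 'consecutive'.

    The score is the sigmoid of the dot product of the two embeddings.
    """
    scores = []
    for a, b in zip(ui_embeddings, ui_embeddings[1:]):
        dot = sum(x * y for x, y in zip(a, b))
        scores.append(1.0 / (1.0 + math.exp(-dot)))
    return scores


# Step 1: interface data, a sequence of two user interfaces, each with one image.
sequence = [
    UserInterface(images=[[0.1, 0.9, 0.3]]),
    UserInterface(images=[[0.8, 0.2, 0.5]]),
]
image_proj = make_projection(in_dim=3, out_dim=EMBED_DIM, seed=0)
model_proj = make_projection(in_dim=EMBED_DIM, out_dim=EMBED_DIM, seed=1)

inter = intermediate_embeddings(sequence, image_proj)
ui_emb = interface_prediction_model(inter, model_proj)
output = pretraining_task(ui_emb)  # the pre-training output
```

In this sketch the pre-training output is a list with one score per adjacent pair of interfaces. In a real training setup such a score would be compared against a label and used to update the model parameters, a step the sketch omits.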