CPC G06F 3/167 (2013.01) [G06F 3/0482 (2013.01); G06N 3/044 (2023.01); G06N 3/08 (2013.01); G06T 11/001 (2013.01); G10L 15/08 (2013.01); G10L 2015/088 (2013.01); G10L 15/16 (2013.01)] | 14 Claims |
1. A method comprising:
displaying, on a display of a device, a first user interface, a second user interface, and a third user interface of a single application that are active on the device, the first user interface comprising a post user interface of an ephemeral message, the second user interface comprising a page user interface of a non-ephemeral message, the third user interface comprising an image capture user interface;
in response to the first user interface, the second user interface, and the third user interface being displayed, storing, in a memory of the device, audio data generated from a transducer of the device;
identifying a first machine learning scheme corresponding to the first user interface, the second user interface, and the third user interface, a second machine learning scheme corresponding to the second user interface and the third user interface, a third machine learning scheme corresponding to the second user interface, the first machine learning scheme comprising a first machine learning model that is trained to detect a first set of keywords in the audio data, the second machine learning scheme comprising a second machine learning model that is trained to detect a second set of keywords in the audio data, the third machine learning scheme comprising a third machine learning model that is trained to detect a third set of keywords in the audio data, wherein the first machine learning scheme comprises a global model, the second machine learning scheme comprises a multi-screen model, and the third machine learning scheme comprises a page model;
detecting, using the first machine learning scheme, the second machine learning scheme, and the third machine learning scheme, a portion of the audio data as one of the first set of keywords, one of the second set of keywords, or one of the third set of keywords; and
in response to detecting the portion of the audio data as one of the first set of keywords, one of the second set of keywords, or one of the third set of keywords, displaying user interface content pre-associated with one of the first set of keywords, one of the second set of keywords, or one of the third set of keywords.
|
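The selection and dispatch logic recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the interface identifiers, keyword lists, and content strings are hypothetical, each "model" is a plain lookup over already-transcribed tokens rather than a trained network operating on audio data, and the rule that a scheme is identified when at least one of its interfaces is displayed is an assumption made for illustration.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the three user interfaces named in the claim.
POST_UI = "post"              # first UI: post UI of an ephemeral message
PAGE_UI = "page"              # second UI: page UI of a non-ephemeral message
CAPTURE_UI = "image_capture"  # third UI: image capture UI


@dataclass
class KeywordScheme:
    """One machine learning scheme: the UIs it corresponds to and its keywords."""
    name: str
    interfaces: frozenset
    keywords: dict  # keyword -> pre-associated user interface content

    def detect(self, tokens):
        # Stand-in for a trained keyword-detection model (assumption):
        # return the first token that is one of this scheme's keywords.
        for token in tokens:
            if token in self.keywords:
                return token
        return None


# Hypothetical keyword sets; the claim does not list concrete keywords.
SCHEMES = [
    KeywordScheme("global", frozenset({POST_UI, PAGE_UI, CAPTURE_UI}),
                  {"home": "home screen"}),
    KeywordScheme("multi-screen", frozenset({PAGE_UI, CAPTURE_UI}),
                  {"filter": "filter carousel"}),
    KeywordScheme("page", frozenset({PAGE_UI}),
                  {"subscribe": "subscribe dialog"}),
]


def identify_schemes(displayed_uis):
    """Return the schemes that correspond to at least one displayed UI."""
    return [s for s in SCHEMES if s.interfaces & set(displayed_uis)]


def handle_audio(displayed_uis, tokens):
    """Run every identified scheme and return the content tied to the first hit."""
    for scheme in identify_schemes(displayed_uis):
        keyword = scheme.detect(tokens)
        if keyword is not None:
            return scheme.keywords[keyword]
    return None


if __name__ == "__main__":
    all_uis = {POST_UI, PAGE_UI, CAPTURE_UI}
    print([s.name for s in identify_schemes(all_uis)])
    print(handle_audio(all_uis, ["please", "subscribe"]))
```

With all three interfaces displayed, every scheme is consulted, which mirrors the claim's detecting step using the global, multi-screen, and page models together. In this sketch, a device showing only the post interface would consult the global scheme alone.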