US 11,853,975 B1
Contextual parsing of meeting information
Jonathan Alan Leblang, Menlo Park, CA (US); Milo Oostergo, Issaquah, WA (US); Kevin Crews, Seattle, WA (US); Collin Charles Davis, Seattle, WA (US); Yu-Hsiang Cheng, Bothell, WA (US); Aakarsh Nair, Redmond, WA (US); and Richard Christopher Green, Bothell, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Dec. 12, 2017, as Appl. No. 15/839,757.
Int. Cl. G06Q 10/00 (2023.01); G06Q 10/1093 (2023.01); H04L 12/18 (2006.01); G10L 13/08 (2013.01); G10L 15/22 (2006.01); G06F 16/93 (2019.01); G06F 16/245 (2019.01); G06F 40/205 (2020.01); G06F 3/16 (2006.01)
CPC G06Q 10/1095 (2013.01) [G06F 16/245 (2019.01); G06F 16/93 (2019.01); G06F 40/205 (2020.01); G10L 13/08 (2013.01); G10L 15/22 (2013.01); H04L 12/1818 (2013.01); G06F 3/16 (2013.01)] 19 Claims
OG exemplary drawing
 
1. A system comprising:
a database that stores information identifying at least a first voice-enabled functionality and a second voice-enabled functionality that are each configured to be invoked during any of a plurality of meetings, wherein the database further stores (a) an identification of a first invocation phrase and a first code module that are both associated with the first voice-enabled functionality and (b) an identification of a second invocation phrase and a second code module that are both associated with the second voice-enabled functionality; and
at least one hardware processor in communication with the database, the at least one hardware processor configured to execute computer-executable instructions to:
identify a meeting, wherein the meeting is identified from a calendar entry or a meeting invitation sent to a meeting participant;
extract text data from the calendar entry or the meeting invitation;
identify two or more machine learning models previously trained to determine, from text data, input parameters for respective voice-enabled functionalities identified in the database, wherein the two or more machine learning models include (a) a first machine learning model trained to determine at least one input parameter for the first voice-enabled functionality identified in the database and (b) a second machine learning model trained to determine at least one input parameter for the second voice-enabled functionality;
prior to the meeting and prior to invocation of the first or second voice-enabled functionalities, determine, using respective ones of the two or more machine learning models as applied to the text data extracted from the calendar entry or meeting invitation, contextual meeting information that each of the first code module associated with the first voice-enabled functionality and the second code module associated with the second voice-enabled functionality is configured to receive as input, wherein the contextual meeting information comprises at least one of a discussion topic, an agenda item, or a meeting goal;
store the contextual meeting information, as determined from the text data, as a plurality of labeled fields or variables;
determine, during the meeting, that a voice-capturing device has captured audio data that includes a service-invoking phrase followed by the first invocation phrase that is associated in the database with the first voice-enabled functionality, wherein the service-invoking phrase comprises one or more words indicating that a portion of the audio data following the service-invoking phrase should be interpreted by the system as a request or instruction to the system;
in response to the first invocation phrase being captured in the audio data during the meeting, invoke the first voice-enabled functionality during the meeting by providing at least a portion of the contextual meeting information as input to the first code module associated with the first voice-enabled functionality, wherein the at least a portion of the contextual meeting information provided as input to the first code module is not identified by the system as having been uttered in the audio data, wherein the at least a portion of the contextual meeting information provided as input to the first code module comprises a first labeled field or variable that is identified from among the plurality of labeled fields or variables of the contextual meeting information by matching the first labeled field or variable to a type of a first input slot or argument of the first voice-enabled functionality; and
generate, based at least in part on execution of the first code module when provided with the at least a portion of the contextual meeting information as input, a response to the first invocation phrase to be audibly presented by the voice-capturing device.
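The database element of claim 1 amounts to a registry that maps each voice-enabled functionality to its invocation phrase, its code module, and the input slots the module accepts. The following is a minimal sketch of such a registry in Python; the names (VoiceFunctionality, REGISTRY, register) are hypothetical illustrations, not the patented implementation.

    # Hypothetical registry of voice-enabled functionalities: each entry
    # pairs an invocation phrase with a code module and its input slots.
    from dataclasses import dataclass
    from typing import Callable, Dict

    @dataclass
    class VoiceFunctionality:
        invocation_phrase: str           # e.g. "take notes"
        code_module: Callable[..., str]  # callable producing a spoken response
        input_slots: Dict[str, str]      # slot name -> expected slot type

    REGISTRY: Dict[str, VoiceFunctionality] = {}

    def register(functionality: VoiceFunctionality) -> None:
        REGISTRY[functionality.invocation_phrase] = functionality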
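The pre-meeting steps of the claim (identifying the meeting, extracting text from the calendar entry or meeting invitation, and applying the previously trained models before any invocation) could be sketched as below. The model interface is an assumption: each model is taken to expose a predict(text) method returning labeled fields such as "discussion_topic" or "agenda_item".

    from typing import Dict, Iterable

    def extract_contextual_info(calendar_text: str, models: Iterable) -> Dict[str, str]:
        """Run each pre-trained model over the text extracted from the
        calendar entry and merge the results into labeled fields,
        e.g. {"discussion_topic": "Q3 roadmap"}."""
        labeled_fields: Dict[str, str] = {}
        for model in models:
            # Assumed interface: model.predict(text) -> dict of label -> value.
            labeled_fields.update(model.predict(calendar_text))
        return labeled_fields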
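At meeting time, the claim requires detecting a service-invoking phrase (a wake word) followed by a stored invocation phrase, then filling the matched functionality's input slots from the stored labeled fields by slot type rather than from anything uttered in the audio. A hedged sketch, continuing the hypothetical registry above with a stand-in wake word:

    WAKE_WORD = "computer"  # stands in for the claim's service-invoking phrase

    def handle_utterance(transcript, labeled_fields):
        """Dispatch a transcribed utterance captured during the meeting."""
        text = transcript.lower().strip()
        if not text.startswith(WAKE_WORD):
            return None  # not addressed to the system
        request = text[len(WAKE_WORD):].strip()
        for phrase, functionality in REGISTRY.items():
            if request.startswith(phrase):
                # Fill input slots from the stored labeled fields by type;
                # these values need not have been uttered in the audio.
                args = {slot: labeled_fields[slot_type]
                        for slot, slot_type in functionality.input_slots.items()
                        if slot_type in labeled_fields}
                return functionality.code_module(**args)
        return None

For example (again hypothetical), a note-taking functionality whose "topic" slot is filled from the parsed calendar entry:

    register(VoiceFunctionality(
        invocation_phrase="take notes",
        code_module=lambda topic="the meeting": f"Starting notes for {topic}.",
        input_slots={"topic": "discussion_topic"},
    ))
    fields = {"discussion_topic": "Q3 roadmap"}
    print(handle_utterance("computer take notes", fields))
    # -> Starting notes for Q3 roadmap.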